godotengine / godot-proposals

Godot Improvement Proposals (GIPs)

Add a `buffered` signal to AudioEffectCapture #2650

Open paulicka opened 3 years ago

paulicka commented 3 years ago

It was suggested on Discord (by @Calinou) that I contact @lyuma.

Describe the project you are working on

Markdown To Story Script Creation with Script Line Recording

Describe the problem or limitation you are having in your project

Not a "problem" per se, but an inefficiency. Why do I have to poll the AudioEffectCapture? If there are multiple consumers, should they each be polling? The documentation is a bit unclear on ring buffer management: am I consuming frames when I call `get_buffer()`?
Do I share the buffer with others, or hoard it all to myself? Why do I have to do all this?!

Describe the feature / enhancement and how it helps to overcome the problem or limitation

Add a signal `buffered(buffer)`, where `buffer` is a `PoolVector2Array`.

Then everyone who wishes to consume the next buffer will be signaled, whether for recording to a wav file or broadcasting over the network or transcoding into an image or...whatever.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

func _ready():
  var capture = get_your_audio_effect_capture_however_you_do_now()
  capture.connect("buffered", self, "_on_Capture_buffered")

# buffer can be shared amongst all listeners...much joy heard throughout the land!
func _on_Capture_buffered(buffer : PoolVector2Array) -> void:
  use_the_buffer_however_you_do_now(buffer)

as compared to

var _capture: AudioEffectCapture

func _ready():
  _capture = get_your_capture_however_you_do_now()

# Polling... *sigh*
func _process(_delta):
  var frame_count = _capture.get_frames_available()
  if frame_count > 0:
    # Voice of Gollum: my buffer... all mines!
    use_the_buffer_however_you_do_now(_capture.get_buffer(frame_count))
    # ...buffer now consumed and gone for anyone else...

If this enhancement will not be used often, can it be worked around with a few lines of script?

Sure. Everyone can poll all the time... whee! And you can create multiple AudioEffectCapture effects if you want multiple users/consumers...

Is there a reason why this should be core and not an add-on in the asset library?

Convenience, efficiency, and removal of redundancy. But sure, I could make an add-on with a singleton that does this... totally doable.

lyuma commented 3 years ago

Hi! As the creator of the AudioEffectCapture implementation, and as someone with some experience making production apps with audio, I'd like to offer my two cents. I apologize for the wordy nature of my replies; it's hard for me to write concisely.

(Regarding your comment about multiple consumers: note that it is not recommended to have multiple consumers of a single AudioEffectCapture. It is not designed for this, so it could lead to reading corrupt data. RingBuffer implements a single-producer, single-consumer lock-free ring buffer. This is a widely used standard data structure, but it comes with this limitation.)
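One way to live within the single-consumer limit today is to have exactly one script drain the capture and fan the frames out through its own signal, so other scripts never touch the ring buffer directly. A minimal Godot 3 sketch; the "Record" bus name and the effect index 0 are assumptions about your project setup:

```gdscript
# The one and only consumer of the capture; everyone else connects to the signal.
extends Node

signal frames_captured(buffer)  # buffer: PoolVector2Array

var _capture: AudioEffectCapture

func _ready():
    var bus_idx = AudioServer.get_bus_index("Record")  # assumes a "Record" bus exists
    _capture = AudioServer.get_bus_effect(bus_idx, 0)  # assumes the capture is effect 0

func _process(_delta):
    var frames = _capture.get_frames_available()
    if frames > 0:
        # Single point of consumption; listeners each receive the same buffer.
        emit_signal("frames_captured", _capture.get_buffer(frames))
```

This keeps the ring buffer single-consumer while still letting any number of scripts react to new audio data.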

Regarding your questions about sharing buffers: no, they are not shared. Buffers can be assumed to be allocated on each use and deallocated when you return from the function or drop your reference to them. Internally, buffers are pooled and reused as needed, so this is quite efficient and not something you should be concerning yourself with or trying to optimize. Let's set aside your buffer concerns for the rest of this response, as they are not relevant.

As for the premise of your question: I think you have identified a problem with the way AudioEffectCapture is often used (polling from the main thread), and you are hinting at a solution which mimics modern audio APIs, insofar as a callback occurs at the instant data is generated or requested. So at first glance, it seems good.

There are three main issues here:

  1. First, regarding your premise: you should be doing audio operations from a separate thread, not from the main thread or _process. You do not want your audio processing tied to your frame rate, and you do not want your audio to become choppy when your main thread has to do work, for example while loading levels.

  2. Signals can only be sent to the current thread (default) or to the main thread (CONNECT_DEFERRED). The current thread is not allowed because we do not want user-written code running on the AudioServer (audio effect bus). The main thread is not good because of point 1. So we cannot use signals. But we can use a Semaphore, so let's say we use a Semaphore instead!

  3. You must not ever lock the audio server. There is a golden rule in audio processing about avoiding syscalls and locking primitives. This would also rule out signals, as emitting a signal can require allocation and locking for the shared queue. What about a Semaphore?

Does posting a semaphore count as a system call and/or locking? I don't actually know! Is it operating-system specific? Some OS scheduler algorithms may halt execution of the current thread and transfer it to the thread that was awoken by the system call (e.g. if you post() a semaphore, the CPU may context switch to the user thread and wake it up at the wait() call, leaving the audio server asleep until the CPU has available time). This would be a problem.

I think it might be safe to do the above semaphore operation once the audio server has finished its work for a given tick and is about to go to sleep.

So here's my counter-proposal (whether this is possible depends on the locking mechanics of semaphores):

  1. Users create a semaphore and register it with the AudioEffectCapture (e.g. set_semaphore() )
  2. Users should be using a thread for their audio processing. The thread waits on the semaphore, and each time the semaphore wakes up, the thread attempts to read the correct amount of data from the ring buffer, and add it to a queue or whatever.
  3. If a semaphore is set, each tick, the AudioEffectCapture adds its semaphore to a list of semaphores to post.
  4. At the end of an AudioServer tick, it loops through all semaphores in the list and calls post() on each one. We must ensure that this can be done without acquiring a lock, but a syscall may be acceptable.
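From the user's side, the steps above might look roughly like this. Note that `set_semaphore()` is the hypothetical API from step 1, not an existing method, and `get_your_capture_however_you_do_now()` / `use_the_buffer_however_you_do_now()` are the same placeholders used earlier in this thread:

```gdscript
var _capture: AudioEffectCapture
var _semaphore := Semaphore.new()
var _thread := Thread.new()
var _running := true

func _ready():
    _capture = get_your_capture_however_you_do_now()
    _capture.set_semaphore(_semaphore)  # hypothetical API: register the semaphore (step 1)
    _thread.start(self, "_audio_loop")  # user-owned audio thread (step 2)

# Runs on its own thread, decoupled from the frame rate.
func _audio_loop(_userdata):
    while _running:
        _semaphore.wait()  # woken once per AudioServer tick (steps 3-4)
        var frames = _capture.get_frames_available()
        if frames > 0:
            use_the_buffer_however_you_do_now(_capture.get_buffer(frames))

func _exit_tree():
    _running = false
    _semaphore.post()  # unblock the thread so it can exit
    _thread.wait_to_finish()
```

The consumer never blocks on anything the audio server holds; it only sleeps on its own semaphore until data is ready.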

We should also implement the same API for use in AudioStreamGenerator.

paulicka commented 3 years ago

Wow. What a wonderful counter-suggestion, and a really great read all in all. Thanks for the quick feedback! I didn't know signals used a shared queue. And as to processing in a separate thread, excellent point. (My use case is within an EditorPlugin, so it's less of a performance concern, but I really appreciate revisiting the multithreaded performance mindset.)

The Semaphore idea sounds like a good hybrid solution. Anything I can do to help?

lyuma commented 3 years ago

I probably won't get to working on this for a long time - if you're interested in real-time audio and want to prototype something with threading and a Semaphore, that would help to validate this approach.
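A pure-GDScript prototype could validate the consumer side today by posting the semaphore from _process, as a stand-in for the future AudioServer-side post described above. A sketch, not the final design (the placeholder function names are the same ones used earlier in the thread):

```gdscript
var _capture: AudioEffectCapture
var _semaphore := Semaphore.new()
var _thread := Thread.new()
var _running := true

func _ready():
    _capture = get_your_capture_however_you_do_now()
    _thread.start(self, "_consumer_loop")

# Stand-in for the engine-side post(): wake the consumer once per rendered frame.
# The real proposal would post once per AudioServer tick instead.
func _process(_delta):
    _semaphore.post()

func _consumer_loop(_userdata):
    while _running:
        _semaphore.wait()
        var frames = _capture.get_frames_available()
        if frames > 0:
            use_the_buffer_however_you_do_now(_capture.get_buffer(frames))

func _exit_tree():
    _running = false
    _semaphore.post()  # unblock the thread so it can exit
    _thread.wait_to_finish()
```

This exercises the threading and wake-up pattern end to end; only the source of the post() would change once the engine-side hook exists.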

paulicka commented 3 years ago

Would it work if I did the prototype in GDScript? I am not familiar with C++ in Godot (though I want to get there eventually).

lyuma commented 3 years ago

My personal opinion is it's good to prototype in GDScript and use C++ only for the critical optimizations.

However... there is one small C++ change that would be needed: part of the proposal involves modifying the AudioServer to post a semaphore. That small part would need to be done in pure C++, because GDScript cannot run on the audio bus thread (especially given the requirement of not locking).

GeorgeS2019 commented 2 years ago

`get_your_capture_however_you_do_now()` / `get_your_audio_effect_capture_however_you_do_now()`

For users here who are curious and interested to know how that is done => read this.