Open dechamps opened 1 year ago
Actually, closer investigation of the recordings reveals that there is only a single glitch in each stream, happening exactly 20 ms after the start of the test signal. After that single glitch, a 10-second stream appears to be glitch-free.
This likely explains why no-one is complaining about this - a single glitch at the beginning of the stream is arguably benign, although this is still a bug that should be fixed (if only so that we can get proper results from paloopback).
I was wondering if this could be related to the use of DirectSoundFullDuplexCreate8, but this doesn't seem to be the case: the behaviour is exactly the same if I undef PAWIN_USE_DIRECTSOUNDFULLDUPLEXCREATE.
If I record the audio going through the loopback device using Audacity at the same time as the paloopback test is running, the glitch appears in the separately recorded audio. This indicates that the glitch is produced in the PortAudio → device (playback) direction.
Hi Etienne, please update your issue report to specify Windows version and hardware setup.
Can you consistently reproduce this on multiple systems with differing audio hardware?
I have never observed such behavior myself, nor do I recall anyone reporting it over the long history of PortAudio (DirectSound was one of the first host API implementations). This makes me think that it is possibly a regression in recent Windows versions, or with specific drivers.
> please update your issue report to specify Windows version and hardware setup.
I already did, see bottom of my first post.
> Can you consistently reproduce this on multiple systems with differing audio hardware?
I only tried Virtual Audio Cable so far. I guess I could try other VAC KS modes or some real hardware. The fact that both MME and DS half duplex are working fine would seem to point away from the hardware, though.
> I have never observed such behavior myself, nor do I recall anyone report in over the long history of PortAudio
It's possible it was always there but that no-one noticed it - a glitch happening in the first 20 ms could be quite hard to notice in real usage, especially given that the first 20 ms of streamed audio would probably be silence in many scenarios. Also, a half-duplex application would not trigger it. I only noticed this problem when I saw paloopback fail.
I'm not saying it's not a bug, just that from my point of view PortAudio is not the most likely culprit.
Please at least test with real hardware.
I'm still investigating, but so far I am getting very suspicious that there is a problem in the interaction between the DS host API code and the AdaptingProcess() buffer processor code.
The return value of PaUtil_EndBufferProcessing() is somewhat vaguely documented as the "number of frames processed". The DS host code implicitly assumes that this is the number of frames that were read from the host input buffer, and also the number of frames that were written to the host output buffer. The DS host code increments the input and output buffer offsets accordingly, so that on the next TimeSlice() call we will process the next portions of both buffers.
The buffer processor business logic that is used here, specifically AdaptingProcess(), ultimately returns as the "number of frames processed" the number of input frames that were consumed. With regard to the DS caller code, this works as long as the number of output frames that were written is always exactly equal to the number of input frames read.
The problem is, I believe I have found a scenario in which that isn't the case. Here's how things actually unfold, assuming a framesPerBuffer of 1024 and a 48 kHz sample rate:
- (step 1) framesInTempOutputBuffer is initialized to 1024: https://github.com/PortAudio/portaudio/blob/d07842c1b021a9a05d50da88272e3f0930ecfbe3/src/common/pa_process.c#L182-L186
- (step 5) In CopyTempOutputBuffersToHostOutputBuffers(), the buffer processor transfers 960 frames of silence to the host output buffer. framesInTempOutputBuffer is now 1024-960=64.
- AdaptingProcess() does not read any input frames and does not call the stream callback because there are not enough frames available (960 < 1024).
- AdaptingProcess() returns 0 "frames processed", since it did not process any input frames. (It did write some silence into the output host buffer without telling the host code, but that's benign - it just overwrote silence with silence at this point.)
- Because PaUtil_EndBufferProcessing() returned 0, the DS code does not move the buffer offsets and everything stays as it is.
- (step 11) In CopyTempOutputBuffersToHostOutputBuffers(), the buffer processor transfers the remaining 64 frames of silence to the beginning of the host output buffer (remember the DS buffer offsets haven't moved), overwriting some of the silence it already wrote in step 5. (This is starting to feel wrong, and things only go downhill from there.) framesInTempOutputBuffer is now 0, and hostOutputChannels[i].data pointers are incremented by 64 frames.
- AdaptingProcess() sees that more than 1024 input frames are available. It calls the stream callback on the first 1024 input frames. framesInTempOutputBuffer is now 1024.
- AdaptingProcess() calls CopyTempOutputBuffersToHostOutputBuffers() again. CopyTempOutputBuffersToHostOutputBuffers() copies the 1024 frames to the host output buffer after the 64 frames of silence it wrote in step 11 (this is because in step 11 the internal buffer processor hostOutputChannels[i].data pointers were incremented).
- AdaptingProcess() returns the number of input frames processed, which is 1024.

I believe this explains the symptoms. The reason why this only happens at the end of the first user buffer likely has to do with the fact that the initial priming phase, where framesInTempOutputBuffer is initialized to 1024 in step 1, is (presumably) the only scenario where output frames are written without any input frames being read. Because this has to do with initialization/priming, this would explain why there is only a single glitch and the rest of the stream is fine.
Next step is to come up with a fix. For that I need to understand how that code was originally intended to work. This buffer adaptation stuff is proving to be quite headache-inducing!
The DirectSound Host API is glitchy when used in full duplex mode (i.e. when both input and output devices are used in a single PortAudio stream) when using reasonable parameters (default frames per buffer, default high suggested latency).
Half duplex is fine.
Example output from paloopback -r48000 -s1024 --inputLatency 240 --outputLatency 240 -w:
A quick look at the recorded test signal confirms the issue:
MME passes the same test with flying colors.