quiniouben / vban

VBAN protocol open-source implementation
GNU General Public License v3.0
450 stars 63 forks

jack backend: 16 bit reception works, 24 bit does not #18

Closed ochilan closed 6 years ago

ochilan commented 6 years ago

Hello,

first off, thank you very much for this tool, it works quite nicely. However, I'm having one issue with sending audio from Windows 10 using Voicemeeter Banana 2.0.4.0 to an Arch Linux machine in 24 bit mode. In the VBAN settings of Banana, I have the option to send audio in 16 bit or 24 bit. When using 16 bit everything works well and I am able to record audio on my Linux box using the Jack backend with REAPER as the DAW. However, when I switch to 24 bit in Banana, I get a very loud signal in REAPER with the waveform strictly above the center line of the audio track. It seems like the audio data is misinterpreted in some way. Let me know if I can help you with reproducing this issue or if you have any idea what I could try.

Best regards Ochi

quiniouben commented 6 years ago

Hi, thank you for reporting. To be honest, the jack backend is not the most thoroughly tested part of this project, and having quickly read the code, I suspect that it only works correctly in 16 bit int or 32 bit float format. If you are able to, you could try applying the following patch to the code and then recompiling:


diff --git a/src/common/backend/jack_backend.c b/src/common/backend/jack_backend.c
index 0531ca9..12affae 100644
--- a/src/common/backend/jack_backend.c
+++ b/src/common/backend/jack_backend.c
@@ -323,17 +323,17 @@ inline jack_default_audio_sample_t jack_convert_sample(char const* ptr, enum VBa
     switch (bit_fmt)
     {
         case VBAN_BITFMT_8_INT:
-            return (float)*((int8_t const*)ptr);
+            return (float)*((int8_t const*)ptr) / (2^7);

         case VBAN_BITFMT_16_INT:
-            return (float)(*((int16_t const*)ptr)) / 32768;
+            return (float)(*((int16_t const*)ptr)) / (2^15);

         case VBAN_BITFMT_24_INT:
             memcpy(&value, ptr, 3);
-            return (float)value;
+            return (float)value / (2^23);

         case VBAN_BITFMT_32_INT:
-            return (float)*((int32_t const*)ptr);
+            return (float)*((int32_t const*)ptr) / (2^31);

         case VBAN_BITFMT_32_FLOAT:
             return *(float const*)ptr;

Let me know whether this helps. On my side, I will try to test this asap, but it might take a few days.

ochilan commented 6 years ago

Thank you very much for the quick response. :) I will try the change when I'm back home, possibly tomorrow.

ochilan commented 6 years ago

Hello,

the changes didn't do the trick for me. The "x^y" syntax is definitely wrong, that's a xor in C :) But even when writing out the expressions, the 24 bit case doesn't work. I tried writing the bytes in "ptr" into the "value" integer in other orders, but to no avail. It's not clear to me how the 24 bit PCM is encoded. Considering [1], it could be more complicated than simply interpreting the bytes as the bytes of an integer (but I don't know whether that is the case here). Even if it were that simple, wouldn't memcpying the bytes into the integer fill it incorrectly? I guess "value" is a little-endian integer on Intel, and the memcpy would fill the leftmost three bytes, which would not result in the correct numbers.

So yeah, it would be interesting to know how the 24-bit PCM encoding works in this case.

[1] https://wiki.multimedia.cx/index.php/PCM#24-Bit_PCM
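To make the operator mix-up concrete: in C, `^` is bitwise XOR, so `2^15` evaluates to 13 rather than 32768. The sketch below builds the intended power-of-two divisor with a shift instead. The helper name `full_scale` and the enum constants are hypothetical and do not come from the vban source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: full-scale magnitude of a signed integer sample
 * that is `bits` wide, i.e. two to the power (bits - 1). */
static inline float full_scale(unsigned bits)
{
    assert(bits >= 2 && bits <= 32);
    /* Shift a 64-bit one so that bits == 32 cannot overflow a 32-bit int. */
    return (float)(INT64_C(1) << (bits - 1));
}

/* What the expressions in the patch actually evaluate to (XOR, not power). */
enum {
    XOR_2_7  = 2 ^ 7,   /* 5  */
    XOR_2_15 = 2 ^ 15,  /* 13 */
    XOR_2_23 = 2 ^ 23,  /* 21 */
    XOR_2_31 = 2 ^ 31   /* 29 */
};
```

With the patch as posted, a 16 bit sample would therefore be divided by 13 instead of 32768, which on its own would make the signal far too loud.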

quiniouben commented 6 years ago

> The "x^y" syntax is definitely wrong, that's a xor in C :)

Oh god, I should never do such things on the weekend... of course it doesn't work. Hmm, the 24 bit description in the link you sent applies only to DVD; 24 bit PCM is otherwise like the other formats. Anyway, there is something wrong in my implementation; I guess the sign is lost or something like that.
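A minimal sketch of one way to decode a packed 24 bit sample while keeping the sign, assuming the three bytes are little-endian signed PCM (the thread suggests this but does not state it outright). The function name `pcm24le_to_float` is illustrative, and this is not necessarily the fix that went into the repository. Copying three bytes into a zeroed 32 bit integer on a little-endian machine fills only the three low-order bytes, so bit 23 is never propagated into the top byte and negative samples come out as large positive values; that would be consistent with a waveform sitting entirely above the center line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one packed 24 bit little-endian signed PCM sample to a float
 * in the range [-1.0, 1.0). Illustrative helper, not the vban code. */
static float pcm24le_to_float(const unsigned char *ptr)
{
    assert(ptr != NULL);

    /* Assemble the value byte by byte, independent of host endianness. */
    uint32_t u = (uint32_t)ptr[0]
               | ((uint32_t)ptr[1] << 8)
               | ((uint32_t)ptr[2] << 16);

    /* Sign-extend: if bit 23 is set, the sample is negative, so subtract
     * two to the power 24 to map it into the negative range. */
    int32_t v = (u & 0x800000u) ? (int32_t)u - 0x1000000 : (int32_t)u;

    /* Normalize by two to the power 23 (written out, not 2^23). */
    return (float)v / 8388608.0f;
}
```

For example, the bytes `00 00 80` decode to -1.0 and `00 00 40` decode to 0.5.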

quiniouben commented 6 years ago

Ok, this should work with version 2.0.4.

ochilan commented 6 years ago

The 24 bit resolution seems to work fine now, thank you.

However, when analyzing the results closer, I noticed something else. I played back a 2000 Hz sine on the sending machine and recorded it on the receiving machine. When recording directly using a Focusrite audio interface, the sine looks okay. When recording using VBAN, there are hiccups in a certain interval. Please see the attached screenshots. The upper one is recorded directly, the lower one is recorded via VBAN.

I don't think that the change has anything to do with this. This also happens using 16 bit resolution. And of course there are many other aspects that could go wrong here, including Voicemeeter Banana itself or some settings involved somewhere in my setup. It doesn't have to be a problem with the receptor.

The point is that I don't have the time to debug this further now, but it also prevents me from using VBAN for my use case. I just wanted to let you know about the problem I'm observing; maybe you want to try this experiment yourself at some point.

Thank you for the quick response regarding the original problem anyway.

[Screenshot: overview]

[Screenshot: closeup]

quiniouben commented 6 years ago

VBAN is a synchronization-less UDP protocol. This has 2 major implications: 1- you might lose packets in the network between emitter and receptor; 2- your emitter and receptor probably run with slightly different clocks, and the protocol cannot do anything to help compensate for this difference, resulting in possible buffer overflows / underflows on the receptor side from time to time.

From a high-level point of view, I would say that if you need sample-accurate transmission between two partners and low latency is not the main criterion, VBAN may not be the best choice. From an implementation point of view, increasing the buffer size (through the --quality parameter) may improve the behavior, even though I know the jack backend is not as good as the others in this respect. I may spend time improving the jack backend anyway. Sorry for the possibly disappointing answer.
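As a rough worked example of the second point, the sketch below estimates how long a buffer lasts before it overflows or underflows when the two clocks differ by a fixed ratio. The function name and the figures in the note after the code (48 kHz, 50 ppm, 1024 frames of headroom) are illustrative assumptions, not measurements from this setup.

```c
#include <assert.h>

/* Seconds until `headroom_frames` of buffer slack is used up when the
 * emitter and receptor clocks differ by `drift_ppm` parts per million.
 * Hypothetical helper for illustration only. */
static double seconds_until_xrun(double sample_rate_hz,
                                 double drift_ppm,
                                 double headroom_frames)
{
    assert(sample_rate_hz > 0.0 && drift_ppm > 0.0 && headroom_frames >= 0.0);

    /* Frames gained or lost per second because of the clock mismatch. */
    double drift_frames_per_second = sample_rate_hz * drift_ppm * 1e-6;

    return headroom_frames / drift_frames_per_second;
}
```

With those assumed figures the drift is about 2.4 frames per second, so 1024 frames of headroom would last roughly seven minutes before a glitch. A larger buffer delays the glitch but cannot prevent it.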

quiniouben commented 6 years ago

Hi, I just fixed the jack backend so that its latency is more reliable: 1- the vban_receptor internal buffer is set to 2 times the buffer size computed from the --quality factor (or 2 times the jack buffer size if that happens to be bigger); 2- the playback position is set to 1 times the buffer size.

The implications are: 1- latency is more reliable; 2- you can really experiment with the --quality parameter and see whether it improves your problems.
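The sizing rule described in this comment can be sketched as follows. This illustrates the rule as stated, not the actual vban code: the struct and function names are hypothetical, sizes are counted in frames, and "playback position" is interpreted here as the distance by which the reader initially trails the writer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of the receptor's ring buffer layout. */
struct ring_layout {
    size_t capacity;   /* total ring buffer size, in frames        */
    size_t read_pos;   /* where playback (jack) starts reading     */
    size_t write_pos;  /* where incoming network data gets written */
};

static struct ring_layout ring_layout_init(size_t quality_frames,
                                           size_t jack_frames)
{
    assert(quality_frames > 0 && jack_frames > 0);

    /* Use whichever is bigger: the --quality buffer or the jack buffer. */
    size_t base = quality_frames > jack_frames ? quality_frames : jack_frames;

    struct ring_layout r;
    r.capacity  = 2 * base;  /* rule 1: two times the buffer size        */
    r.read_pos  = 0;
    r.write_pos = base;      /* rule 2: playback trails by one buffer    */
    return r;
}
```

Starting the reader one buffer behind the writer, in the middle of a ring twice that size, leaves equal slack in both directions, so the stream can run somewhat fast or slow before an overflow or underflow occurs.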

Given the title of this issue, I will consider closing it soon, as the original problem is fixed.