dagargo / overwitch

JACK client for Overbridge devices
GNU General Public License v3.0
127 stars 15 forks source link

Patchset to optimize audio latency #9

Closed lentferj closed 2 years ago

lentferj commented 2 years ago

With this patchset I am now able to run without any audio crackling with 32 or even 24 blocks per transfer. I identified the main reason for the problems I had before to be reading of incoming MIDI, so cf3a4310424 probably hast most effect. I broke down the changes in little commits as far as possible so you can review and decide for your self what is good and effective. I am not sure about either ;)

dagargo commented 2 years ago

That's a lot to unpack.

First of all a couple of questions.

Here is my top output with shift+H. There are a lot RT threads and all of the audio applications are running with negative priorities (RT). ArdourGUI does not need RT priorities. Today, Overwitch has been running for several hours without problems. At some point got broken but haven't found the problem yet. Sometimes gets broken much sooner.

Threads: 1334 total,   2 running, 1332 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1,8 us,  0,7 sy,  0,0 ni, 97,4 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
MiB Mem :  16003,6 total,   2286,3 free,   3770,7 used,   9946,5 buff/cache
MiB Swap:  16319,0 total,  16317,8 free,      1,2 used.  11722,8 avail Mem 

    PID USER     PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                           
   1492 user     20   0 3245408  96992  60884 R   8,9   0,6 173:02.50 Xwayland                                                          
 238516 user     -6   0  336684 109940 108860 S   4,0   0,7  11:31.33 overwitch                                                         
    304 root    -88   0       0      0      0 S   3,0   0,0  64:16.95 irq/56-xhci_hcd                                                   
 215112 user     20   0 3410976 547100 193152 S   3,0   3,3  14:11.82 ArdourGUI                                                         
 215330 user     -6   0 3410976 547100 193152 S   3,0   3,3  12:47.25 audioengine                                                       
 215359 user     -6   0 3410976 547100 193152 S   2,3   3,3   9:25.70 RT-1-(nil)                                                        
 215360 user     -6   0 3410976 547100 193152 S   2,3   3,3   9:25.46 RT-2-(nil)                                                        
 215361 user     -6   0 3410976 547100 193152 S   2,3   3,3   9:24.57 RT-3-(nil)                                                        
 215363 user     -6   0 3410976 547100 193152 S   2,3   3,3   9:24.46 RT-5-(nil)                                                        
 215364 user     -6   0 3410976 547100 193152 S   2,3   3,3   9:24.13 RT-6-(nil)                                                        
 215358 user     -6   0 3410976 547100 193152 S   2,0   3,3   9:24.75 RT-main-(nil)                                                     
 215362 user     -6   0 3410976 547100 193152 S   2,0   3,3   9:23.26 RT-4-(nil)                                                        
 238515 user     -6   0  336684 109940 108860 S   2,0   0,7   4:47.11 overwitch                                                         
   1782 user     20   0 3245408  96992  60884 S   1,7   0,6  33:57.07 Xwayland:rcs0                                                     
   1353 user     20   0 6007236 291540 117276 S   1,0   1,8  53:43.73 gnome-shell                                                       
 215099 user    -11   0  289972 133704 113388 S   1,0   0,8   5:57.70 jackd                                                             
 215107 user     -6   0  274572 111896 110764 S   0,7   0,7   2:06.00 a2jmidid                                                          
 238514 user     -6   0  336684 109940 108860 S   0,7   0,7   1:09.60 overwitch                                                         
 274260 user     20   0   16420   5856   3740 R   0,7   0,0   0:00.08 top                                                               
     22 root     20   0       0      0      0 S   0,3   0,0   1:46.88 ksoftirqd/1                                                       
     34 root     20   0       0      0      0 S   0,3   0,0   1:49.03 ksoftirqd/3                                                       
     51 root     -2   0       0      0      0 S   0,3   0,0   0:52.77 rcuc/6                                                            
   1431 user     20   0 6007236 291540 117276 S   0,3   1,8   1:32.77 gnome-she:rcs0                                                    
   2395 user     20   0 4275620 581496 196464 S   0,3   3,5  35:29.42 firefox-esr                                                       
   2533 user     20   0 2977116 277632 153440 S   0,3   1,7   6:03.84 Web Content                                                       
   3399 user     20   0  453944  91048  44624 S   0,3   0,6  14:04.07 gnome-terminal-                                                   
  79918 user     20   0 4821532 253516 109876 S   0,3   1,5   8:09.24 atom                                                              
 271131 user     20   0 2545328 176492 125620 S   0,3   1,1   0:03.45 Web Content                           
 [...]

I'm on Debian stable with the included RT kernel. Ryzen 1700, 16 GiB.

$ uname -v
#1 SMP PREEMPT_RT Debian 5.10.46-5 (2021-09-23)
lentferj commented 2 years ago

I run a 12-core XEON box (a bit older) with HT disabled via Bios.

processor   : 11
vendor_id   : GenuineIntel
cpu family  : 6
model       : 44
model name  : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
stepping    : 2
microcode   : 0x1f
cpu MHz     : 2927.000
cache size  : 12288 KB
physical id : 1
siblings    : 6
core id     : 10
cpu cores   : 6
apicid      : 52
initial apicid  : 52
fpu     : yes
fpu_exception   : yes
cpuid level : 11
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida arat flush_l1d
vmx flags   : vnmi preemption_timer invvpid ept_x_only ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips    : 5866.48
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

I am on Debian Buster.

$ dpkg -l | grep libusb
ii  libgusb2:amd64                                              0.3.0-1                                     amd64        GLib wrapper around libusb1
ii  libusb-0.1-4:amd64                                          2:0.1.12-32                                 amd64        userspace USB programming library
ii  libusb-1.0-0:amd64                                          2:1.0.22-2                                  amd64        userspace USB programming library
ii  libusb-1.0-0:i386                                           2:1.0.22-2                                  i386         userspace USB programming library
ii  libusb-1.0-0-dev:amd64                                      2:1.0.22-2                                  amd64        userspace USB programming library development files
ii  libusb-1.0-doc                                              2:1.0.22-2                                  all          documentation for userspace USB programming
ii  libusbmuxd4:amd64                                           1.1.0~git20181007.07a493a-1                 amd64        USB multiplexor daemon for iPhone and iPod Touch devices - library
ii  libusbredirhost1:amd64                                      0.8.0-1                                     amd64        Implementing the usb-host (*) side of a usbredir connection (runtime)
ii  libusbredirparser1:amd64                                    0.8.0-1                                     amd64        Parser for the usbredir protocol (runtime)

I am currently running 5.14.0-15.1-liquorix-amd64 #1 ZEN SMP PREEMPT liquorix 5.14-17.1~buster (2021-10-27) x86_64 GNU/Linux I can try to go back to a standard RT Kernel. But 5.14 supposedly has better USB audio performance.

I can run Bitwig with a couple of heavy u-he synth with no glitches or other problems.

I figured that is it is MIDI IN part when I commented just this line https://github.com/dagargo/overwitch/blob/master/src/overbridge.c#L707

dagargo commented 2 years ago

BTW, the architecture of the project is as follows.

There are 2 audio threads, one thread to process the OB side, which writes to the o2j ring buffer and reads from the j2o rind buffer; and other that process the JACK side, which reads from the o2j ring buffer and writes to the j2o ring buffer.

The JACK thread is created when the client starts while the other one is created manually. Both reading ends start to read before the other ends start to write to their respective queues. Then, the writers start to write and only when the readers have enough frames they start copying data to the JACK buffer or USB OB blocks.

The reason there is a single thread for both MIDI and OB incoming packets is that having another thread will force us to use some synchronization mechanism and also having two threads polling on the same libusb_handle_events_timeout file descriptors makes this even more complicated. See this for more information on multithreading with libusb. But we certainly need a thread for sending MIDI data to the device and then we need another couple of ring buffers only for MIDI o2j and j2o.

As a side note, ss libusb transfer callbacks are processed in the same thread only the DLL related code and the status need synchronized access.

lentferj commented 2 years ago

so I switched to 5.10.0-0.bpo.9-rt-amd64 #1 SMP PREEMPT_RT Debian 5.10.70-1~bpo10+1 now. But I do get

ERROR:jclient.c:232:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data...
ERROR:jclient.c:232:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data...

almost immediately.. and quite a few of glitches with current master.

dagargo commented 2 years ago

I figured that is it is MIDI IN part when I commented just this line https://github.com/dagargo/overwitch/blob/master/src/overbridge.c#L707

This is really interesting. But this does not happen to me.

Perhaps we could have a "no midi" option just in case someone wants to not use this but that's another story.

Could you try in another computer or with a different kernel or something?

lentferj commented 2 years ago

.. while my own branch's playback is fine. Even start/stop from my MPC One works....

dagargo commented 2 years ago

Are you running this on a Digitakt, right?

Could you try to increase the ring buffer lengths with something like this?

It's a lot but...

+#define BUFFER_LEN_MULTIPLIER 16
+
 static const struct overbridge_device_desc DIGITAKT_DESC = {
   .pid = DTAKT_PID,
   .name = "Digitakt",
@@ -726,9 +728,9 @@ overbridge_run (struct overbridge *ob, jack_client_t * client, int priority)
   max_bufsize =
     ob->frames_per_transfer >
     ob->jbufsize ? ob->frames_per_transfer : ob->jbufsize;
-  ob->j2o_rb = jack_ringbuffer_create (max_bufsize * ob->j2o_frame_bytes * 8);
+  ob->j2o_rb = jack_ringbuffer_create (max_bufsize * ob->j2o_frame_bytes * BUFFER_LEN_MULTIPLIER);
   jack_ringbuffer_mlock (ob->j2o_rb);
-  ob->o2j_rb = jack_ringbuffer_create (max_bufsize * ob->o2j_frame_bytes * 8);
+  ob->o2j_rb = jack_ringbuffer_create (max_bufsize * ob->o2j_frame_bytes * BUFFER_LEN_MULTIPLIER);
   jack_ringbuffer_mlock (ob->o2j_rb);

   ob->jclient = client;
@@ -826,8 +828,8 @@ overbridge_init (struct overbridge *ob, char *device_name,
       ob->o2j_midi_data = malloc (USB_BULK_MIDI_SIZE);
       memset (ob->j2o_midi_data, 0, USB_BULK_MIDI_SIZE);
       memset (ob->o2j_midi_data, 0, USB_BULK_MIDI_SIZE);
-      ob->j2o_rb_midi = jack_ringbuffer_create (MIDI_BUF_SIZE * 8);
-      ob->o2j_rb_midi = jack_ringbuffer_create (MIDI_BUF_SIZE * 8);
+      ob->j2o_rb_midi = jack_ringbuffer_create (MIDI_BUF_SIZE * BUFFER_LEN_MULTIPLIER);
+      ob->o2j_rb_midi = jack_ringbuffer_create (MIDI_BUF_SIZE * BUFFER_LEN_MULTIPLIER);
       jack_ringbuffer_mlock (ob->j2o_rb_midi);
       jack_ringbuffer_mlock (ob->o2j_rb_midi);
     }
lentferj commented 2 years ago

I do get

ERROR:overbridge.c:374:(cb_xfr_out): j2o: Error on USB audio transfer: Other error

after some time though. Which doesn't seem to have an effect immediately. After some time my MPC One (Master) and the Digitakt are drifiting apart, though.

lentferj commented 2 years ago

no problem, done. It is close to what I had done at

https://github.com/dagargo/overwitch/pull/9/commits/86c0e7bdbde9e84ab1cd18193062606aa9f026da

I still get the glitches and the ERROR:jclient.c:232:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data... console output.

dagargo commented 2 years ago

Firmware 1.30? You already tried with a different USB port I think.

I don't see why it doesn't work for you. :cry:

You've made a lot of your changes inside locks. Why? Perhaps I'm missing something. I've checked most of your commits and you're synchronizing the access.

dagargo commented 2 years ago

Regarding 7bf1e76, 640 is really a lot. It's equivalent to 4480 samples or about 0.1 s. Nor sure if we need this much.

9c0aacf it's a very good idea but the problem is that jack_ringbuffer_reset is not thread safe and if we make this call thread safe we'll have to do the same for every other call to the same ring buffer, which would defeat the purpose of using the ring buffers, which are a thread safe structure as long as both read and write operations are always performed in different threads. More info here. However we shouldn't need to reset anything as the JACK thread doesn't queue anything during that cycle.

In cf3a431, what you are doing is to make the whole USB backend to wait. AFAIK, callbacks are running in the same thread. What is really interesting is that you say that this has the most effect. Probably there's something that I don't get here. Perhaps we should sleep for a bit between different transfer submissions.

595fc1c is great. But we do not need to synchronize the access there because the only used variables are the amount of inputs and outputs and those are constants.

With regards to 578c605, it's the same as in cf3a431. We don't need to synchronize so optimization is not needed.

4ec7bc9, same thing regarding synchronization but I really like this. I want to try this commit only. You're using the bulk endpoints instead of the interrupt ones. :+1:

lentferj commented 2 years ago

Oh, forget about the locking. I might have a lucky punch here and there, but this was more like a gut feeling that certain loops, etc should run uninterrupted. But it's not based on understanding or evidence ;). I do think, though, that https://github.com/dagargo/overwitch/pull/9/commits/595fc1c58dc486a97893739e06c5e54b40a9680b does have quite an impact (maybe the locking is unnecessary?). As well maybe as https://github.com/dagargo/overwitch/pull/9/commits/578c605fd8f452ae7032d507e4dcf631140a54b7 (again, the locking maybe unnecessary).

lentferj commented 2 years ago

Thanks for looking at the patches and the detailed explanation. I have done a few things coding here and there (e.g. the Ardour/Mixbus integration for Novation Launchcontrol XL in Ardour / Mixbus (https://github.com/Ardour/ardour/commit/05d35395916f6d00f230ddd64ee0abbea3b5a09c#diff-e80da297bc91625bb397c1dc909b773b17d2f9a73d482de3880ceeaa4b877bbd), but Jack and USB is totally new for me...

lentferj commented 2 years ago

Regarding https://github.com/dagargo/overwitch/commit/cf3a43104242042d842c466660d7b818ca941bde as mentioned, commenting the midi in part out had the immediate effect, that everything was just fine. So I looked at https://github.com/dagargo/overwitch/blob/master/src/overbridge.c#L676-L686 and thought, what is right there, might be right for the incoming part, too. My patch ofc is just a dirty hack in comparison.

dagargo commented 2 years ago

I do think, though, that 595fc1c does have quite an impact (maybe the locking is unnecessary?). As well maybe as 578c605 (again, the locking maybe unnecessary).

Remove those locks and try again to see if performance improves.

lentferj commented 2 years ago

Audio quality is good without those locks (with -b32). But all other patches still applied, though.

lentferj commented 2 years ago

oh, and yes, I got FW 1.30.

dagargo commented 2 years ago

Would it be possible for you to comment line #673 out and check if the master branch is working for you? Please, use 8 blocks only which is enough in my system and keeps latency low.

https://github.com/dagargo/overwitch/blob/09e6609438ccaf5cbbc9b08546f269b4d91acd11/src/overbridge.c#L669-L674

I fear that all our problems are a synchronization issue. Perhaps it's in the way I'm submitting the MIDI OB USB transfers.

dagargo commented 2 years ago

Also, you are not running anything with chrt any more, right?

lentferj commented 2 years ago

No, direct from src/overwitch -b32 -d Digitakt

lentferj commented 2 years ago

Commenting line 673 didn't do much of a difference for me, especially not at -b8 (digital distortion) and still had the buffer overflow messages coming rapidly. Also with -b32 glitches where still there. I combined it with the BUFFER_LEN_MULTIPLIER patch and set it to 80. It kinda works then with -b32, but still regular glitches.

This is what I am currently at

diff --git a/src/overbridge.c b/src/overbridge.c
index 4cfc967..9ebe4c5 100644
--- a/src/overbridge.c
+++ b/src/overbridge.c
@@ -59,6 +59,8 @@

 #define SAMPLE_TIME_NS (1000000000 / ((int)OB_SAMPLE_RATE))

+# define BUFFER_LEN_MULTIPLIER 80
+
 static const struct overbridge_device_desc DIGITAKT_DESC = {
   .pid = DTAKT_PID,
   .name = "Digitakt",
@@ -670,7 +672,7 @@ run_j2o_midi (void *data)
        {
          debug_print (2, "Event frames: %u; diff: %lu\n", event.frames,
                       diff);
-         prepare_cycle_out_midi (ob);
+         // prepare_cycle_out_midi (ob);
        }

       if (diff)
@@ -726,9 +728,9 @@ overbridge_run (struct overbridge *ob, jack_client_t * client, int priority)
   max_bufsize =
     ob->frames_per_transfer >
     ob->jbufsize ? ob->frames_per_transfer : ob->jbufsize;
-  ob->j2o_rb = jack_ringbuffer_create (max_bufsize * ob->j2o_frame_bytes * 8);
+  ob->j2o_rb = jack_ringbuffer_create (max_bufsize * ob->j2o_frame_bytes * 8 * BUFFER_LEN_MULTIPLIER);
   jack_ringbuffer_mlock (ob->j2o_rb);
-  ob->o2j_rb = jack_ringbuffer_create (max_bufsize * ob->o2j_frame_bytes * 8);
+  ob->o2j_rb = jack_ringbuffer_create (max_bufsize * ob->o2j_frame_bytes * 8 * BUFFER_LEN_MULTIPLIER);
   jack_ringbuffer_mlock (ob->o2j_rb);

   ob->jclient = client;
@@ -826,8 +828,8 @@ overbridge_init (struct overbridge *ob, char *device_name,
       ob->o2j_midi_data = malloc (USB_BULK_MIDI_SIZE);
       memset (ob->j2o_midi_data, 0, USB_BULK_MIDI_SIZE);
       memset (ob->o2j_midi_data, 0, USB_BULK_MIDI_SIZE);
-      ob->j2o_rb_midi = jack_ringbuffer_create (MIDI_BUF_SIZE * 8);
-      ob->o2j_rb_midi = jack_ringbuffer_create (MIDI_BUF_SIZE * 8);
+      ob->j2o_rb_midi = jack_ringbuffer_create (MIDI_BUF_SIZE * 8 * BUFFER_LEN_MULTIPLIER);
+      ob->o2j_rb_midi = jack_ringbuffer_create (MIDI_BUF_SIZE * 8 * BUFFER_LEN_MULTIPLIER);
       jack_ringbuffer_mlock (ob->j2o_rb_midi);
       jack_ringbuffer_mlock (ob->o2j_rb_midi);
     }
lentferj commented 2 years ago

When I add this, I can do -b16 without any glitches. Artifacts then start to happen at -b12.

@@ -704,7 +706,7 @@ run_audio_and_o2j_midi (void *data)

   prepare_cycle_in (ob);
   prepare_cycle_out (ob);
-  prepare_cycle_in_midi (ob);
+  // prepare_cycle_in_midi (ob);
dagargo commented 2 years ago

I've just pushed a few commits.

997db96d81378b3d3550f06994c78ae97ed110a6 prevents overwriting the MIDI data in case it has not been sent down downstream at the start of the next cycle.

I don't think this is related to your issue but it would be nice if you could test it.

Quite unrelated, the fixes I applied to not make Overwitch resilient to xruns kind of work. :tada:

Max. o2j latency: 4.0 ms, max. j2o latency: 3.5 ms; o2j ratio avg.: 0.999880; curr. o2j ratio: 0.999882
Max. o2j latency: 4.0 ms, max. j2o latency: 3.8 ms; o2j ratio avg.: 0.999879; curr. o2j ratio: 0.999889
Max. o2j latency: 4.2 ms, max. j2o latency: 4.0 ms; o2j ratio avg.: 0.999886; curr. o2j ratio: 0.999893
ERROR:jclient.c:70:(jclient_thread_xrun_cb): JACK xrun
Max. o2j latency: 4.2 ms, max. j2o latency: 4.0 ms; o2j ratio avg.: 0.999907; curr. o2j ratio: 0.999911
Max. o2j latency: 4.2 ms, max. j2o latency: 4.0 ms; o2j ratio avg.: 0.999912; curr. o2j ratio: 0.999898
Max. o2j latency: 4.2 ms, max. j2o latency: 4.0 ms; o2j ratio avg.: 0.999897; curr. o2j ratio: 0.999898
dagargo commented 2 years ago

Just some ideas.

298539 user     -6   0  336680 109940 108856 S   4,3   0,7   0:11.16 overwitch                                                         
298538 user     -6   0  336680 109940 108856 S   2,0   0,7   0:04.10 overwitch                                                         
298537 user     -6   0  336680 109940 108856 S   0,3   0,7   0:00.90 overwitch                                                         
298532 user     20   0  336680 109940 108856 S   0,0   0,7   0:00.01 overwitch                                                         
298533 user     20   0  336680 109940 108856 S   0,0   0,7   0:00.00 libusb_event                                                      
298535 user     20   0  336680 109940 108856 S   0,0   0,7   0:00.00 overwitch                                                         
298536 user     20   0  336680 109940 108856 S   0,0   0,7   0:00.00 overwitch                                                         

I'd dare to say that first thread is the JACK end, which resamples the audio; the second is the OB end, which performs all the USB stuff but the j2o MIDI; and the third, which does exactly that. The third one is what is causing trouble in your case.

However, as we are mostly sure that the problem is in the MIDI side, could you try to increase the sleep time here? What happens if you multiply this by 10 or 100?

https://github.com/dagargo/overwitch/blob/997db96d81378b3d3550f06994c78ae97ed110a6/src/overbridge.c#L637

Let's not care about the MIDI accuracy yet.

lentferj commented 2 years ago

so, with these patches applied to 09e6609438ccaf5cbb and all other equipment turned off (except my Scarlett 18i8) I still get the glitches. I will try latest master and the proposed additional patch now.

Commenting line 673 didn't do much of a difference for me, especially not at -b8 (digital distortion) and still had the buffer overflow messages coming rapidly. Also with -b32 glitches where still there. I combined it with the BUFFER_LEN_MULTIPLIER patch and set it to 80. It kinda works then with -b32, but still regular glitches.

This is what I am currently at

diff --git a/src/overbridge.c b/src/overbridge.c
index 4cfc967..9ebe4c5 100644
--- a/src/overbridge.c
+++ b/src/overbridge.c
@@ -59,6 +59,8 @@

 #define SAMPLE_TIME_NS (1000000000 / ((int)OB_SAMPLE_RATE))

+# define BUFFER_LEN_MULTIPLIER 80
+
 static const struct overbridge_device_desc DIGITAKT_DESC = {
   .pid = DTAKT_PID,
   .name = "Digitakt",
@@ -670,7 +672,7 @@ run_j2o_midi (void *data)
        {
          debug_print (2, "Event frames: %u; diff: %lu\n", event.frames,
                       diff);
-         prepare_cycle_out_midi (ob);
+         // prepare_cycle_out_midi (ob);
        }

       if (diff)
@@ -726,9 +728,9 @@ overbridge_run (struct overbridge *ob, jack_client_t * client, int priority)
   max_bufsize =
     ob->frames_per_transfer >
     ob->jbufsize ? ob->frames_per_transfer : ob->jbufsize;
-  ob->j2o_rb = jack_ringbuffer_create (max_bufsize * ob->j2o_frame_bytes * 8);
+  ob->j2o_rb = jack_ringbuffer_create (max_bufsize * ob->j2o_frame_bytes * 8 * BUFFER_LEN_MULTIPLIER);
   jack_ringbuffer_mlock (ob->j2o_rb);
-  ob->o2j_rb = jack_ringbuffer_create (max_bufsize * ob->o2j_frame_bytes * 8);
+  ob->o2j_rb = jack_ringbuffer_create (max_bufsize * ob->o2j_frame_bytes * 8 * BUFFER_LEN_MULTIPLIER);
   jack_ringbuffer_mlock (ob->o2j_rb);

   ob->jclient = client;
@@ -826,8 +828,8 @@ overbridge_init (struct overbridge *ob, char *device_name,
       ob->o2j_midi_data = malloc (USB_BULK_MIDI_SIZE);
       memset (ob->j2o_midi_data, 0, USB_BULK_MIDI_SIZE);
       memset (ob->o2j_midi_data, 0, USB_BULK_MIDI_SIZE);
-      ob->j2o_rb_midi = jack_ringbuffer_create (MIDI_BUF_SIZE * 8);
-      ob->o2j_rb_midi = jack_ringbuffer_create (MIDI_BUF_SIZE * 8);
+      ob->j2o_rb_midi = jack_ringbuffer_create (MIDI_BUF_SIZE * 8 * BUFFER_LEN_MULTIPLIER);
+      ob->o2j_rb_midi = jack_ringbuffer_create (MIDI_BUF_SIZE * 8 * BUFFER_LEN_MULTIPLIER);
       jack_ringbuffer_mlock (ob->j2o_rb_midi);
       jack_ringbuffer_mlock (ob->o2j_rb_midi);
     }
lentferj commented 2 years ago

With current master - no improvement and seeing ERROR:jclient.c:231:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data...

with this added

@@ -634,7 +634,7 @@ run_j2o_midi (void *data)
   struct timespec req;
   struct ob_midi_event event;
   struct overbridge *ob = data;
-  int sleep_time_ns = SAMPLE_TIME_NS * jack_get_buffer_size (ob->jclient) / 2; //Average wait time
+  int sleep_time_ns = SAMPLE_TIME_NS * 100 * jack_get_buffer_size (ob->jclient) / 2;   //Average wait time

still glitches and still overflows

with this added

@@ -732,9 +732,9 @@ overbridge_run (struct overbridge *ob, jack_client_t * client, int priority)
   max_bufsize =
     ob->frames_per_transfer >
     ob->jbufsize ? ob->frames_per_transfer : ob->jbufsize;
-  ob->j2o_rb = jack_ringbuffer_create (max_bufsize * ob->j2o_frame_bytes * 8);
+  ob->j2o_rb = jack_ringbuffer_create (max_bufsize * ob->j2o_frame_bytes * 8 * 80);
   jack_ringbuffer_mlock (ob->j2o_rb);
-  ob->o2j_rb = jack_ringbuffer_create (max_bufsize * ob->o2j_frame_bytes * 8);
+  ob->o2j_rb = jack_ringbuffer_create (max_bufsize * ob->o2j_frame_bytes * 8 * 80);
   jack_ringbuffer_mlock (ob->o2j_rb);

   ob->jclient = client;

still glitches, but no more buffer overflows.

even if I multiple the sleep time with 10.000 - no difference.

But really the problem seems to be o2j_midi (prepare_cycle_in_midi), not j2o_midi.

lentferj commented 2 years ago

Interestingly, this patch against current master gives me almost glitch free audio at -b32

@@ -477,7 +482,7 @@ prepare_cycle_in_midi (struct overbridge *ob)
 {
   libusb_fill_bulk_transfer (ob->xfr_in_midi, ob->device,
                             MIDI_IN_EP, (void *) ob->o2j_midi_data,
-                            USB_BULK_MIDI_SIZE, cb_xfr_in_midi, ob, 100);
+                            USB_BULK_MIDI_SIZE, cb_xfr_in_midi, ob, 500);

   int err = libusb_submit_transfer (ob->xfr_in_midi);
   if (err)
@@ -492,7 +497,7 @@ prepare_cycle_out_midi (struct overbridge *ob)
 {
   libusb_fill_bulk_transfer (ob->xfr_out_midi, ob->device, MIDI_OUT_EP,
                             (void *) ob->j2o_midi_data,
-                            USB_BULK_MIDI_SIZE, cb_xfr_out_midi, ob, 100);
+                            USB_BULK_MIDI_SIZE, cb_xfr_out_midi, ob, 500);

   int err = libusb_submit_transfer (ob->xfr_out_midi);
   if (err)
dagargo commented 2 years ago

New commit but just to print an error.

But really the problem seems to be o2j_midi (prepare_cycle_in_midi), not j2o_midi.

Let's assume that but I'm totally lost. So I have few more questions.

https://github.com/dagargo/overwitch/blob/1c409d1ab1b93115f81c808270c240d8d0897d9d/src/overbridge.c#L382

lentferj commented 2 years ago

Just did some more experimenting. If I set the timeouts for the bulk transfers to something at 1000 or higher I getty pretty glitch free audio even at -b 16.

Regarding your questions:

Jack: JackClient::ClientNotify ref = 3 name = Digitakt notify = 18
Jack: JackClient::ClientNotify ref = 3 name = Digitakt notify = 18
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:jclient.c:231:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data...
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:jclient.c:231:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data...
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:jclient.c:231:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data...
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:jclient.c:231:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data...
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:jclient.c:231:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data...
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:jclient.c:231:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data...
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
ERROR:jclient.c:231:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data...
lentferj commented 2 years ago

with just this applied, I still get glitches and overflows

@ -379,7 +379,7 @@ cb_xfr_in_midi (struct libusb_transfer *xfr)
   struct overbridge *ob = xfr->user_data;

   status = overbridge_get_status (ob);
-  if (status < OB_STATUS_RUN)
+  if (1)
dagargo commented 2 years ago

So, just to recap, your setup without Overwitch works well and to use it the only thing that makes it work well is commenting out this line.

https://github.com/dagargo/overwitch/blob/1c409d1ab1b93115f81c808270c240d8d0897d9d/src/overbridge.c#L710

If you comment that and do not send MIDI data to the Digitakt, do you still have glitches? Is audio only working?

One last thing for today, all your problems are USB related. Could it be possible that other USB devices are causing the problems? Check this Ardour problem with WiFi topic.

Could you try connecting the Digitakt to another USB hub? BTW, both my interface and the Digitakt are connected to the same port. But, perhaps there is enough bandwidth in my case. The Hartman device is a Soundcraft signature 12 MTK that acts as a 12 inputs and 14 outputs.

$ lsusb -t -v
[...]
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
    |__ Port 1: Dev 24, If 4, Class=Vendor Specific Class, Driver=, 480M
        ID 1935:000c Elektron Music Machines Digitakt Overbridge
    |__ Port 1: Dev 24, If 2, Class=Vendor Specific Class, Driver=, 480M
        ID 1935:000c Elektron Music Machines Digitakt Overbridge
    |__ Port 1: Dev 24, If 0, Class=Vendor Specific Class, Driver=, 480M
        ID 1935:000c Elektron Music Machines Digitakt Overbridge
    |__ Port 1: Dev 24, If 3, Class=Vendor Specific Class, Driver=, 480M
        ID 1935:000c Elektron Music Machines Digitakt Overbridge
    |__ Port 1: Dev 24, If 1, Class=Vendor Specific Class, Driver=, 480M
        ID 1935:000c Elektron Music Machines Digitakt Overbridge
    |__ Port 2: Dev 23, If 3, Class=Application Specific Interface, Driver=, 480M
        ID 05fc:0021 Harman 
    |__ Port 2: Dev 23, If 1, Class=Audio, Driver=snd-usb-audio, 480M
        ID 05fc:0021 Harman 
    |__ Port 2: Dev 23, If 2, Class=Audio, Driver=snd-usb-audio, 480M
        ID 05fc:0021 Harman 
    |__ Port 2: Dev 23, If 0, Class=Audio, Driver=snd-usb-audio, 480M
        ID 05fc:0021 Harman 
    |__ Port 4: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
        ID 05e3:0608 Genesys Logic, Inc. Hub
[...]

And thank you for performing all the tests. :+1:

dagargo commented 2 years ago

I've just reverted the last commit because it was very wrong.

These errors are just timeouts that make sense: you wait for USB input and there is none so timeout!

ERROR:overbridge.c:421:(cb_xfr_in_midi): Error on USB MIDI in transfer: Other error
lentferj commented 2 years ago

I even disconnected my USB hub which I use to connect only mouse and keyboard. The only devices I use are my Scarlett 18i8 and the Digitakt and both are connected directly to the (desktop) PC. I also tried different cables and different USB ports. Still I get the issues :/

/:  Bus 08.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 07.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 06.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 05.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/8p, 480M
    |__ Port 4: Dev 16, If 0, Class=Vendor Specific Class, Driver=, 480M
    |__ Port 4: Dev 16, If 1, Class=Vendor Specific Class, Driver=usbfs, 480M
    |__ Port 4: Dev 16, If 2, Class=Vendor Specific Class, Driver=usbfs, 480M
    |__ Port 4: Dev 16, If 3, Class=Vendor Specific Class, Driver=usbfs, 480M
    |__ Port 4: Dev 16, If 4, Class=Vendor Specific Class, Driver=, 480M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/4p, 480M
    |__ Port 1: Dev 2, If 0, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 1: Dev 2, If 1, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 1: Dev 2, If 2, Class=Video, Driver=uvcvideo, 480M
    |__ Port 1: Dev 2, If 3, Class=Video, Driver=uvcvideo, 480M
    |__ Port 2: Dev 3, If 4, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 2: Dev 3, If 2, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 2: Dev 3, If 0, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 2: Dev 3, If 5, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 2: Dev 3, If 3, Class=Vendor Specific Class, Driver=, 480M
    |__ Port 2: Dev 3, If 1, Class=Audio, Driver=snd-usb-audio, 480M

I do have other MIDI devices (several hardware synth) and I have no problem using those with Bitwig or via my MPC One (where the PC only acts as a MIDI hub, kind of).

lentferj commented 2 years ago

... and yes, if I comment that line, everything is just fine (except not having MIDI :) ).

dagargo commented 2 years ago

New idea.

As the MIDI out from the Digitakt is the problem, what if we reduce the timeout to wait for messages? After all, the host is polling the device for data for up to 100 ms.

https://github.com/dagargo/overwitch/blob/6c898de289ba197997d1914e1f6876967bd3e0cf/src/overbridge.c#L478-L481

Honestly, I was using 100 ms, which is really too much, because it was the value used in some examples. Let's just use 5 ms. The reasons it works for me and worked better for you with a higher timeout are not clear.

For me it makes no difference. Any difference on your end?

lentferj commented 2 years ago

I was playing around with this a little and actually higher values (e.g. 1000) give me better results than lower ones. Values of 5 and 10 ms are actually worse. Except value 0, which is also really good, but I assume that disables timeout altogether (couldn't find it in the documentation). But even with increased audio buffers (*32) and timeout set to 1000 ms I still get ERROR:jclient.c:231:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data... after some time.

With only this patch I get stable audio down to -b16 without overflows

diff --git a/src/overbridge.c b/src/overbridge.c
index f5c9090..84268b6 100644
--- a/src/overbridge.c
+++ b/src/overbridge.c
@@ -477,7 +477,7 @@ prepare_cycle_in_midi (struct overbridge *ob)
 {
   libusb_fill_bulk_transfer (ob->xfr_in_midi, ob->device,
                             MIDI_IN_EP, (void *) ob->o2j_midi_data,
-                            USB_BULK_MIDI_SIZE, cb_xfr_in_midi, ob, 100);
+                            USB_BULK_MIDI_SIZE, cb_xfr_in_midi, ob, 0);

   int err = libusb_submit_transfer (ob->xfr_in_midi);
   if (err)

I haven't checked the effect on midi timing, yet, but I assume it will be disastrous :)

dagargo commented 2 years ago

I don't know what is going on.

The documentation says that a timeout of 0 means it's infinite.

Is MIDI working?

lentferj commented 2 years ago

Yes, actually with this patchset I am currently running quite stable (and in sync with my MPC One metronome)


diff --git a/src/overbridge.c b/src/overbridge.c
index f5c9090..a206188 100644
--- a/src/overbridge.c
+++ b/src/overbridge.c
@@ -445,7 +445,7 @@ prepare_cycle_out (struct overbridge *ob)
                                  AUDIO_OUT_EP, (void *) ob->usb_data_out,
                                  ob->usb_data_out_blk_len *
                                  ob->blocks_per_transfer, cb_xfr_out, ob,
-                                 100);
+                                 10000);

   int err = libusb_submit_transfer (ob->xfr_out);
   if (err)
@@ -462,7 +462,7 @@ prepare_cycle_in (struct overbridge *ob)
                                  AUDIO_IN_EP, (void *) ob->usb_data_in,
                                  ob->usb_data_in_blk_len *
                                  ob->blocks_per_transfer, cb_xfr_in, ob,
-                                 100);
+                                 0);

   int err = libusb_submit_transfer (ob->xfr_in);
   if (err)
@@ -477,7 +477,7 @@ prepare_cycle_in_midi (struct overbridge *ob)
 {
   libusb_fill_bulk_transfer (ob->xfr_in_midi, ob->device,
                             MIDI_IN_EP, (void *) ob->o2j_midi_data,
-                            USB_BULK_MIDI_SIZE, cb_xfr_in_midi, ob, 100);
+                            USB_BULK_MIDI_SIZE, cb_xfr_in_midi, ob, 0);

   int err = libusb_submit_transfer (ob->xfr_in_midi);
   if (err)
@@ -492,7 +492,7 @@ prepare_cycle_out_midi (struct overbridge *ob)
 {
   libusb_fill_bulk_transfer (ob->xfr_out_midi, ob->device, MIDI_OUT_EP,
                             (void *) ob->j2o_midi_data,
-                            USB_BULK_MIDI_SIZE, cb_xfr_out_midi, ob, 100);
+                            USB_BULK_MIDI_SIZE, cb_xfr_out_midi, ob, 0);

   int err = libusb_submit_transfer (ob->xfr_out_midi);
   if (err)
@@ -732,7 +732,7 @@ overbridge_run (struct overbridge *ob, jack_client_t * client, int priority)
   max_bufsize =
     ob->frames_per_transfer >
     ob->jbufsize ? ob->frames_per_transfer : ob->jbufsize;
-  ob->j2o_rb = jack_ringbuffer_create (max_bufsize * ob->j2o_frame_bytes * 8);
+  ob->j2o_rb = jack_ringbuffer_create (max_bufsize * ob->j2o_frame_bytes * 8 * 128);
   jack_ringbuffer_mlock (ob->j2o_rb);
   ob->o2j_rb = jack_ringbuffer_create (max_bufsize * ob->o2j_frame_bytes * 8);
   jack_ringbuffer_mlock (ob->o2j_rb);

But at some point I will always receive ERROR:jclient.c:231:(jclient_j2o): j2o: Audio ring buffer overflow. Discarding data...

I am currently trying if I can see anything if I run with LIBUSB_DEBUG=4, but well... doesn't tell me much :)

lentferj commented 2 years ago

With the timeout in the midi bulk transfers set to 100 ms I do see quite a lot of these "status=-2". Compare with the lower one, which is successful.

[ 0.170539] [00005e44] libusb: debug [libusb_submit_transfer] transfer 0x55b9d517efb0
[ 0.170544] [00005e44] libusb: debug [submit_bulk_transfer] need 1 urbs for new transfer with length 2112
[ 0.170558] [00005e44] libusb: debug [reap_for_handle] urb type=3 status=-2 transferred=0
[ 0.170565] [00005e44] libusb: debug [handle_bulk_completion] handling completion status -2 of bulk urb 1/1
[ 0.170573] [00005e44] libusb: debug [handle_bulk_completion] abnormal reap: urb status -2
[ 0.170591] [00005e44] libusb: debug [handle_bulk_completion] abnormal reap: last URB handled, reporting
[ 0.170601] [00005e44] libusb: debug [usbi_handle_transfer_cancellation] detected timeout cancellation
[ 0.170607] [00005e44] libusb: debug [arm_timerfd_for_next_timeout] next timeout originally 100ms
[ 0.170616] [00005e44] libusb: debug [usbi_handle_transfer_completion] transfer 0x55b9d5175090 has callback 0x55b9d45e5b30
[ 0.170620] [00005e44] libusb: debug [libusb_submit_transfer] transfer 0x55b9d5175090
[ 0.170625] [00005e44] libusb: debug [submit_bulk_transfer] need 1 urbs for new transfer with length 512
[ 0.170640] [00005e44] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[ 0.170645] [00005e44] libusb: debug [handle_events] poll() 3 fds with timeout in 60000ms
[ 0.173709] [00005e44] libusb: debug [handle_events] poll() returned 1
[ 0.173736] [00005e44] libusb: debug [reap_for_handle] urb type=1 status=0 transferred=2112
[ 0.173746] [00005e44] libusb: debug [handle_bulk_completion] handling completion status 0 of bulk urb 1/1
[ 0.173756] [00005e44] libusb: debug [handle_bulk_completion] last URB in transfer --> complete!
[ 0.173764] [00005e44] libusb: debug [arm_timerfd_for_next_timeout] next timeout originally 100ms
[ 0.173775] [00005e44] libusb: debug [usbi_handle_transfer_completion] transfer 0x55b9d517efb0 has callback 0x55b9d45e5850
lentferj commented 2 years ago

if you run overwitch like that, you can observe it (on my system) LIBUSB_DEBUG=5 src/overwitch -d Digitakt 2>&1 | grep -B2 -A5 status=-2

what I can see is, it's alternating between two distinct bulk transfer ids, which means that it happens on both, incoming and outgoing MIDI.

lentferj commented 2 years ago

Ok, wasn't as obvious as I thought :) I used a little trick to identify the transfer running into timeouts by setting an individual timeout for each transfer (101, 102,...).

[ 7.317043] [000063ed] libusb: debug [usbi_handle_transfer_completion] transfer 0x556b231d72f0 has callback 0x556b21dd06b0
[ 7.317048] [000063ed] libusb: debug [libusb_submit_transfer] transfer 0x556b231d72f0
[ 7.317052] [000063ed] libusb: debug [submit_bulk_transfer] need 1 urbs for new transfer with length 8832
[ 7.317061] [000063ed] libusb: debug [reap_for_handle] urb type=3 status=-2 transferred=0
[ 7.317065] [000063ed] libusb: debug [handle_bulk_completion] handling completion status -2 of bulk urb 1/1
[ 7.317067] [000063ed] libusb: debug [handle_bulk_completion] abnormal reap: urb status -2
[ 7.317068] [000063ed] libusb: debug [handle_bulk_completion] abnormal reap: last URB handled, reporting
[ 7.317071] [000063ed] libusb: debug [usbi_handle_transfer_cancellation] detected timeout cancellation
[ 7.317072] [000063ed] libusb: debug [arm_timerfd_for_next_timeout] next timeout originally 101ms

The all mention the 101 timeout which is "prepare_cycle_out". That would mean for some reason getting USB Audio INTO the DT is problematic?

lentferj commented 2 years ago

when I increase the timeout for prepare_cycle_out to 1001, I suddenly see 102ms in the logs, which is prepare_cycle_in. If I increase that to 1002 I again see 1001ms (prepare_cycle_out) in the logs. Looks like they are competing for resources?

lentferj commented 2 years ago

And if I set them like this (prepare_cycle_in_midi to infinite), the rest to only 10 ms, I do not get ANY status=-2 messages at all. So prepare_cycle_in_midi somehow seems to mess up the whole midi transfer...

After some time prepare_cycle_in popped in ....

diff --git a/src/overbridge.c b/src/overbridge.c
index f5c9090..c3b6a61 100644
--- a/src/overbridge.c
+++ b/src/overbridge.c
@@ -441,11 +441,11 @@ cb_xfr_out_midi (struct libusb_transfer *xfr)
 static void
 prepare_cycle_out (struct overbridge *ob)
 {
-  libusb_fill_interrupt_transfer (ob->xfr_out, ob->device,
-                                 AUDIO_OUT_EP, (void *) ob->usb_data_out,
+  libusb_fill_bulk_stream_transfer (ob->xfr_out, ob->device,
+                                 AUDIO_OUT_EP, 1,(void *) ob->usb_data_out,
                                  ob->usb_data_out_blk_len *
                                  ob->blocks_per_transfer, cb_xfr_out, ob,
-                                 100);
+                                 10);

   int err = libusb_submit_transfer (ob->xfr_out);
   if (err)
@@ -458,11 +458,11 @@ prepare_cycle_out (struct overbridge *ob)
 static void
 prepare_cycle_in (struct overbridge *ob)
 {
-  libusb_fill_interrupt_transfer (ob->xfr_in, ob->device,
-                                 AUDIO_IN_EP, (void *) ob->usb_data_in,
+  libusb_fill_bulk_stream_transfer (ob->xfr_in, ob->device,
+                                 AUDIO_IN_EP, 2, (void *) ob->usb_data_in,
                                  ob->usb_data_in_blk_len *
                                  ob->blocks_per_transfer, cb_xfr_in, ob,
-                                 100);
+                                 10);

   int err = libusb_submit_transfer (ob->xfr_in);
   if (err)
@@ -477,7 +477,7 @@ prepare_cycle_in_midi (struct overbridge *ob)
 {
   libusb_fill_bulk_transfer (ob->xfr_in_midi, ob->device,
                             MIDI_IN_EP, (void *) ob->o2j_midi_data,
-                            USB_BULK_MIDI_SIZE, cb_xfr_in_midi, ob, 100);
+                            USB_BULK_MIDI_SIZE, cb_xfr_in_midi, ob, 0);

   int err = libusb_submit_transfer (ob->xfr_in_midi);
   if (err)
@@ -492,7 +492,7 @@ prepare_cycle_out_midi (struct overbridge *ob)
 {
   libusb_fill_bulk_transfer (ob->xfr_out_midi, ob->device, MIDI_OUT_EP,
                             (void *) ob->j2o_midi_data,
-                            USB_BULK_MIDI_SIZE, cb_xfr_out_midi, ob, 100);
+                            USB_BULK_MIDI_SIZE, cb_xfr_out_midi, ob, 10);

   int err = libusb_submit_transfer (ob->xfr_out_midi);
   if (err)
lentferj commented 2 years ago

I added this dirty little hack..

@@ -477,7 +490,7 @@ prepare_cycle_in_midi (struct overbridge *ob)
 {
   libusb_fill_bulk_transfer (ob->xfr_in_midi, ob->device,
                             MIDI_IN_EP, (void *) ob->o2j_midi_data,
-                            USB_BULK_MIDI_SIZE, cb_xfr_in_midi, ob, 100);
+                            USB_BULK_MIDI_SIZE, cb_xfr_in_midi, ob, cim_to);

   int err = libusb_submit_transfer (ob->xfr_in_midi);
   if (err)
@@ -485,6 +498,14 @@ prepare_cycle_in_midi (struct overbridge *ob)
       error_print ("o2j: Error when submitting USB MIDI transfer: %s\n",
                   libusb_strerror (err));
     }
+  if (ob->xfr_in_midi->status > 0) {
+    printf("cycle_in_midi status:%d, length:%d\n", ob->xfr_in_midi->status, ob->xfr_in_midi->actual_length);
+    if (ob->xfr_in_midi->status == 2) {
+       cim_to += 10;
+       printf("increasing xfr_in_midi time out by 10, now at %d\n", cim_to);
+    }
+  }

and guess what, length stays at 0 all the time, and cim_to keeps increasing and increasing..

lentferj commented 2 years ago

a very dirty hack for self increasing usb timeouts... more meant as a demonstration...

diff --git a/src/overbridge.c b/src/overbridge.c
index f5c9090..3fee7a1 100644
--- a/src/overbridge.c
+++ b/src/overbridge.c
@@ -59,6 +59,10 @@

 #define SAMPLE_TIME_NS (1000000000 / ((int)OB_SAMPLE_RATE))

+int coa_to, cia_to = 10; // cycle_in/out audio timeouts
+int com_to = 50; // cycle_in midi timeouts
+int cim_to = 10;  // cycle_in midi timeouts
+
 static const struct overbridge_device_desc DIGITAKT_DESC = {
   .pid = DTAKT_PID,
   .name = "Digitakt",
@@ -413,6 +417,10 @@ cb_xfr_in_midi (struct libusb_transfer *xfr)
                }
            }
          length += OB_MIDI_EVENT_SIZE;
+
+         //struct timespec st;
+         //st.tv_nsec=100000000;
+         //nanosleep(&st, NULL);
        }
     }
   else
@@ -445,7 +453,7 @@ prepare_cycle_out (struct overbridge *ob)
                                  AUDIO_OUT_EP, (void *) ob->usb_data_out,
                                  ob->usb_data_out_blk_len *
                                  ob->blocks_per_transfer, cb_xfr_out, ob,
-                                 100);
+                                 coa_to);

   int err = libusb_submit_transfer (ob->xfr_out);
   if (err)
@@ -453,6 +461,13 @@ prepare_cycle_out (struct overbridge *ob)
       error_print ("j2o: Error when submitting USB audio transfer: %s\n",
                   libusb_strerror (err));
     }
+  if (ob->xfr_out->status > 0) {
+    printf("cycle_in_midi status:%d, length:%d\n", ob->xfr_out->status, ob->xfr_out->actual_length);
+    if (ob->xfr_out->status == 2) {
+       coa_to += 10;
+       printf("increasing xfr_out timeout by 10, now at %d\n", coa_to);
+    }
+  }
 }

 static void
@@ -462,7 +477,7 @@ prepare_cycle_in (struct overbridge *ob)
                                  AUDIO_IN_EP, (void *) ob->usb_data_in,
                                  ob->usb_data_in_blk_len *
                                  ob->blocks_per_transfer, cb_xfr_in, ob,
-                                 100);
+                                 cia_to);

   int err = libusb_submit_transfer (ob->xfr_in);
   if (err)
@@ -470,6 +485,13 @@ prepare_cycle_in (struct overbridge *ob)
       error_print ("o2j: Error when submitting USB audio in transfer: %s\n",
                   libusb_strerror (err));
     }
+  if (ob->xfr_in->status > 0) {
+    printf("cycle_in status:%d, length:%d\n", ob->xfr_in->status, ob->xfr_in->actual_length);
+    if (ob->xfr_in->status == 2) {
+       cia_to += 10;
+       printf("increasing xfr_in_midi timeout by 10, now at %d\n", cia_to);
+    }
+  }
 }

 static void
@@ -477,7 +499,7 @@ prepare_cycle_in_midi (struct overbridge *ob)
 {
   libusb_fill_bulk_transfer (ob->xfr_in_midi, ob->device,
                             MIDI_IN_EP, (void *) ob->o2j_midi_data,
-                            USB_BULK_MIDI_SIZE, cb_xfr_in_midi, ob, 100);
+                            USB_BULK_MIDI_SIZE, cb_xfr_in_midi, ob, cim_to);

   int err = libusb_submit_transfer (ob->xfr_in_midi);
   if (err)
@@ -485,6 +507,13 @@ prepare_cycle_in_midi (struct overbridge *ob)
       error_print ("o2j: Error when submitting USB MIDI transfer: %s\n",
                   libusb_strerror (err));
     }
+  if (ob->xfr_in_midi->status > 0) {
+    printf("cycle_in_midi status:%d, length:%d\n", ob->xfr_in_midi->status, ob->xfr_in_midi->actual_length);
+    if (ob->xfr_in_midi->status == 2 && cim_to < 5000) {
+       cim_to += 10;
+       printf("increasing xfr_in_midi timeout by 10, now at %d\n", cim_to);
+    }
+  }
 }

 static void
@@ -492,7 +521,7 @@ prepare_cycle_out_midi (struct overbridge *ob)
 {
   libusb_fill_bulk_transfer (ob->xfr_out_midi, ob->device, MIDI_OUT_EP,
                             (void *) ob->j2o_midi_data,
-                            USB_BULK_MIDI_SIZE, cb_xfr_out_midi, ob, 100);
+                            USB_BULK_MIDI_SIZE, cb_xfr_out_midi, ob, com_to);

   int err = libusb_submit_transfer (ob->xfr_out_midi);
   if (err)
@@ -500,6 +529,13 @@ prepare_cycle_out_midi (struct overbridge *ob)
       error_print ("j2o: Error when submitting USB MIDI transfer: %s\n",
                   libusb_strerror (err));
     }
+  if (ob->xfr_out_midi->status > 0) {
+    printf("cycle_out_midi status:%d, length:%d\n", ob->xfr_out_midi->status, ob->xfr_out_midi->actual_length);
+    if (ob->xfr_out_midi->status == 2 && com_to < 1000) {
+       com_to += 10;
+       printf("increasing xfr_out_midi timeout by 10, now at %d\n", com_to);
+    }
+  }
 }

 // initialization taken from sniffed session

on my system xfr_in_midi timeout keeps increasing steadily...

dagargo commented 2 years ago

Looks like they are competing for resources?

They are actually competing for the USB bus.

So prepare_cycle_in_midi somehow seems to mess up the whole midi transfer...

This is exactly what I think is happening. But why? And why it doesn't happen to me?

dagargo commented 2 years ago

This is important

The documentation says that a timeout of 0 means it's infinite.

I don't know where I read in the documentation what I said but it is NOT true. As you said I can not find this but I guess 0 means 0 (timeout just after the first time polling the endpoint for data when there is no data).

Sorry for any misunderstanding I may have caused.

This changes things a bit.

I've tried to follow you but there are a lot of tests and I've been doing others on my own so I'm quite lost but I have something I want you to try in case you haven't. What if we set all the timeouts to 0?

After all, that would mean that if there is no data at the endpoint there will be a new timeout and a new transfer will be submitted. I guess the timeouts are used to not flood the bus but in this case this would not happen as it will occur only after polling both the audio in and out. This is consistent with having no more CPU used by the Overbridge thread.

What I'm not sure about USB is if different transfers can poll different endpoints at the same time with some interleaving. But if not, we were retrying for MIDI for too long I guess. Retrying for audio makes no sense as if there is no audio by the time we ask there will be a dropout and that can not be addressed.

lentferj commented 2 years ago

Along the way I >think< I found a solution to the j2o buffer overflows. My understanding (and that may be incorrect) is, that we are writing to a buffer that usually does not get read? So at some time point in time it is always full, no matter how huge we configure it. At least that was my finding... this patch should only deal with incoming audio on the Digitakt if the ports are actually connected. Again, just an example implementation. I don't know if my thinking is right at all diff --git a/src/jclient.c b/src/jclient.c index eb04653..b103890 100644

--- a/src/jclient.c
+++ b/src/jclient.c
@@ -190,6 +190,7 @@ jclient_j2o (struct jclient *jclient)
   long gen_frames;
   int inc;
   int frames;
+  int connected_ports = 0;
   size_t bytes;
   size_t wsj2o;
   static double j2o_acc = .0;
@@ -221,14 +222,27 @@ jclient_j2o (struct jclient *jclient)

   bytes = gen_frames * jclient->ob.j2o_frame_bytes;
   wsj2o = jack_ringbuffer_write_space (jclient->ob.j2o_rb);
-  if (bytes <= wsj2o)
+
+  for (int i = 0; i < jclient->ob.device_desc.inputs; i++)
     {
-      jack_ringbuffer_write (jclient->ob.j2o_rb,
-                            (void *) jclient->j2o_buf_out, bytes);
+      if (jack_port_connected(jclient->input_ports[i])) 
+       {
+          connected_ports +=1;
+        }
     }
-  else
+
+  if (connected_ports > 0)
     {
-      error_print ("j2o: Audio ring buffer overflow. Discarding data...\n");
+       printf("connected in ports on elektron device detected\n");
+       if (bytes <= wsj2o)
+         {
+            jack_ringbuffer_write (jclient->ob.j2o_rb,
+              (void *) jclient->j2o_buf_out, bytes);
+         }
+       else
+         {
+            error_print ("j2o: Audio ring buffer overflow. Discarding data...\n");
+         }
     }
 }

@@ -492,6 +506,10 @@ jclient_process_cb (jack_nframes_t nframes, void *arg)
   f = jclient->j2o_aux;
   for (int i = 0; i < jclient->ob.device_desc.inputs; i++)
     {
+      // printf("j2o - connected input ports: %d\n", jack_port_connected(jclient->input_ports[i]));
+      if (jack_port_connected(jclient->input_ports[i]) == 0) {
+        continue;
+      }
       buffer[i] = jack_port_get_buffer (jclient->input_ports[i], nframes);
     }
   for (int i = 0; i < nframes; i++)
lentferj commented 2 years ago

I can run my current patch set at -b10 without glitches or any MIDI timing drift so far. I will need to clean those up tomorrow and create a new PR.

dagargo commented 2 years ago

Along the way I >think< I found a solution to the j2o buffer overflows. My understanding (and that may be incorrect) is, that we are writing to a buffer that usually does not get read? So at some time point in time it is always full, no matter how huge we configure it.

It's exactly that. :100:

The way I see the problem is that if the ring buffer is full the reading ending is not reading and data is gonna be lost no matter what. That's what you said, but I just thought that dismissing the data was the easiest option. The problem is that it keeps printing that the buffer is full and that is undesirable.

However, there are 2 scenarios. If the other end is reading, discarding data will eventually solve this; if it's not, then we have the always printing issue.

During the last days I've been thinking of getting rid of this by having a static boolean variable that would be set whenever the buffer is full and cleared when writing can be performed. With this, we can test it before printing the error so we could only print it once. I think it's another approach to the same problem but I'm not sure if I understood it well.

And thank you, you're really keen on solving this. :+1: