jamulussoftware / jamulus

Jamulus enables musicians to perform real-time jam sessions over the internet.
https://jamulus.io

Jamulus on Linux vs Windows #669

Closed seanogdelaney closed 3 years ago

seanogdelaney commented 3 years ago

TL;DR: Windows and Linux perform almost exactly the same running Jamulus on the same hardware. Windows was very slightly faster on my specific hardware. Typically, the difference in audio drivers will be the main factor (in my case, the networking was faster on Windows).


Hi,

I've used jack_iodelay (Linux) and the RTL Utility (Windows) to compare latencies. I'm using the same computer (dual boot), hardware, and AWS server for both tests.

My audio interface shows the same measured latency on both Linux and Windows (7.7ms at buffer size 64). Jamulus "overall delay" reports 22ms (Linux) compared to 25ms (Windows). Measurements of the true total latency are 27ms (Linux) and 33ms (Windows). Network jitter is typically +- 1ms. Both versions of Jamulus were downloaded in the past week.

The network latency (ping 9ms), interface latency (7.7ms) and jitter buffer size (4+4, "small buffers") are constant. It seems that the Linux jitter buffers account for 10ms of latency, whereas the Windows jitter buffers seem to account for 15ms. Why is there a 50% difference here?
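For reference, here is the back-of-the-envelope arithmetic behind those jitter-buffer estimates. It assumes each jitter-buffer block holds one 64-sample frame at 48kHz, which is only my assumption about how Jamulus sizes these buffers:

```cpp
// Back-of-the-envelope sketch (not Jamulus code): if each jitter-buffer block
// holds one 64-sample frame at 48 kHz, then 4 client + 4 server blocks add
// roughly 8 * 64 * 1000 / 48000 = 10.7 ms, close to the ~10 ms inferred on Linux.
#include <cstdio>

int main()
{
    const double sampleRate   = 48000.0; // Hz
    const int    samplesBlock = 64;      // "small network buffers" frame size
    const int    totalBlocks  = 4 + 4;   // client + server jitter buffers

    const double blockMs  = samplesBlock * 1000.0 / sampleRate; // ~1.33 ms
    const double bufferMs = totalBlocks * blockMs;               // ~10.7 ms

    std::printf("per block: %.2f ms, jitter buffers total: %.1f ms\n",
                blockMs, bufferMs);
    return 0;
}
```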

My collaborators use Windows, and have slower networks, so a 5ms improvement would be great!

In any case, Jamulus developers should be very proud. Fantastic software.

Thanks.

P.S. Linux allows the user to choose 32 sample buffers, and this improves latency by an additional 3-4 ms. That does not seem to be possible on Windows with Focusrite ASIO drivers (Jamulus sets the buffer size, minimum 64).

[Dell Latitude 7390, Intel Core i7, 16GB RAM, Focusrite 4i4 3rd Gen, 300Mbps Fibre internet (wired), Ubuntu 20.04 (ALSA driver) and Windows 10 (Focusrite ASIO Driver)]

trombonepizza commented 3 years ago

Is your QjackCtl configured for "realtime"?

seanogdelaney commented 3 years ago

Yes, and I'm using the low latency kernel on Ubuntu, and I'm running the Jamulus process with elevated priority.

trombonepizza commented 3 years ago

I'm not very well-versed on the Windows kernel as it relates to realtime processing, but I suspect this is part of the answer to your 50% question.

There is a kludge where you can assign "realtime" priority to a process in Windows using the task manager, but it will default back to normal when you kill the process, so I think you have to do it each time.
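If repeating that in the Task Manager every time gets tedious, something along these lines can request the same thing programmatically. This is just a generic Win32 sketch, not anything Jamulus does itself:

```cpp
// Win32 sketch: raise the priority class of the current process.
// REALTIME_PRIORITY_CLASS requires administrator rights; without them,
// Windows silently falls back to HIGH_PRIORITY_CLASS.
#include <windows.h>
#include <cstdio>

int main()
{
    if (!SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS))
    {
        std::fprintf(stderr, "SetPriorityClass failed: %lu\n", GetLastError());
        return 1;
    }
    std::printf("priority class is now 0x%lx\n",
                GetPriorityClass(GetCurrentProcess()));
    return 0;
}
```

(A tiny launcher like this would have to start or attach to the Jamulus process itself to be useful; that wiring is left out here.)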

seanogdelaney commented 3 years ago

Thanks for the comment. I'm not so well versed with Windows either, and have been planning to look into Windows priority.

However, I wouldn't expect OS scheduling priority differences to produce a stable 5ms difference. I would expect a lot of jitter there. This effect seems to be more systematic.

corrados commented 3 years ago

I think Jamulus cannot do anything about this issue. You have USB drivers which influence the performance as well as the actual audio interface drivers. Both are completely different on Linux and Windows. With my Lexicon Omega audio interface I also have better latency on Linux compared to the Windows OS.

seanogdelaney commented 3 years ago

Thanks Volker.

I don't think this is a driver difference. My Focusrite interface has identical 7.7ms latency on both Linux and Windows (I measured it).

dzpex commented 3 years ago

The driver is important. In my test with a UMC222 audio interface, I had very bad performance on Windows, but on Linux (Manjaro, to be exact) the performance was very good. I don't remember the exact difference, but it was 15ms for sure. The problem was the driver: on Windows the minimum supported buffer size was 176 samples, whereas on Linux I could decrease it to 128 samples.

seanogdelaney commented 3 years ago

Thanks dzpex. If you wish to reproduce this issue, we will need (a) accurate measurements* of your audio hardware latency on Linux and Windows, and (b) accurate measurements of your total Jamulus latency on Linux and Windows.

In my case, there was no difference in (a) and a 5ms difference in (b). If you measure a difference in (a), you can subtract that from the difference in (b), to see if you can reproduce my result.

*It's important to measure yourself, because the figures reported by manufacturers / drivers are usually wrong. Measurement software: jack_iodelay is excellent on Linux, and the RTL Utility is excellent on Windows.

pljones commented 3 years ago

Microsoft might be able to explain it better than anyone else. Yes, it's stable - they'll probably be overjoyed at that. That it's not as low as Linux won't matter to them - they've got a mass market product where that 5ms makes absolutely no financial difference to them. (You don't, if you've any sense, use Windows on an embedded controller that needs to have "near real-time" response. However, the Linux low latency kernel is used commercially, I understand.)

50% extra can be the difference between 2 and 3 for your jitter buffers, remember. So if timing on Windows requires greater provision than on Linux, your latency goes up. Indeed, the Windows timer resolution -- according to the comments in the code -- isn't as high as on Linux. I don't know if it's different enough to show that much effect though, as I've not measured.
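If anyone wants to experiment with the timer-resolution side of this, here is a minimal Win32 sketch. It only shows how the multimedia timer resolution can be queried and raised; I'm not claiming this is what Jamulus calls internally:

```cpp
// Win32 sketch: query and raise the multimedia timer resolution.
// Link against winmm.lib. This only illustrates the timer-resolution point;
// what Jamulus itself does is documented in its own code comments.
#include <windows.h>
#include <timeapi.h>
#include <cstdio>

int main()
{
    TIMECAPS tc;
    if (timeGetDevCaps(&tc, sizeof(tc)) == MMSYSERR_NOERROR)
        std::printf("timer resolution range: %u..%u ms\n",
                    tc.wPeriodMin, tc.wPeriodMax);

    timeBeginPeriod(1); // request 1 ms scheduling granularity
    // ... run the latency-critical work here ...
    timeEndPeriod(1);   // restore the previous resolution
    return 0;
}
```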

dzpex commented 3 years ago

seanogdelaney: I do not have the hardware mentioned in my post with me, because the test was executed on a friend's PC and the UMC222 is not mine. I only use Linux; my clients are on a Raspberry Pi 4 and work very well. I'm sorry, I can't repeat the test.

seanogdelaney commented 3 years ago

Thanks pljones and dzpex.

How does Jamulus measure the ping time? Is it possible that Jamulus network packets are treated differently than ping network packets, because they are a different size or have a different header or something?

If the ping time reported by Jamulus is not the true network latency of Jamulus packets, that could account for some error. It might also suggest a way to improve the "overall delay" accuracy in Jamulus.

pljones commented 3 years ago

The "ping" is a Jamulus protocol message, sending the time it was created to the target, which simply sends it back. The sender then subtracts "now" from that to get the time. Jitter buffers actually shouldn't affect ping packet times at all, as they're not audio packets (which is what the jitter buffers handle). According to the "What's this?" help:

Overall Delay is calculated from the current Ping Time and the delay introduced by the current buffer settings.
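To make that mechanism concrete, here is a minimal sketch of this style of round-trip measurement. It is not the actual Jamulus protocol code, just the same idea in isolation:

```cpp
// Sketch of the ping mechanism described above (not the Jamulus protocol code):
// the client stamps a message with its local send time, the server echoes it
// back untouched, and the client compares the echoed stamp with "now".
#include <chrono>
#include <cstdint>

using Clock = std::chrono::steady_clock;

// Client side: build the timestamp carried by the ping message.
std::int64_t makePingTimestampMs()
{
    return std::chrono::duration_cast<std::chrono::milliseconds>(
               Clock::now().time_since_epoch()).count();
}

// Client side: called when the server has echoed the timestamp back.
std::int64_t roundTripMs(std::int64_t echoedTimestampMs)
{
    return makePingTimestampMs() - echoedTimestampMs;
}
```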

... to improve the "overall delay" accuracy in Jamulus.

The only way would be to reduce the amount of audio carried in each audio packet to include a timestamp. But what benefit would this achieve? Why do it?

seanogdelaney commented 3 years ago

Thanks Peter.

The reason for my question is as follows. My audio signal has a higher latency on Windows (5ms). Therefore, Windows itself (or Jamulus on Windows) must be buffering my signal for an extra 5ms. I have verified that the extra buffering is not in the audio interface. The ping times are the same, so the buffering is not in the networking either. (Unless Windows networking treats ping and audio packets differently - and buffers the audio for 5ms).

If the extra buffering is not in (1) the audio hardware / driver (2) the networking or (3) the Jamulus sever (presumably it doesn't treat Windows clients differently), then it must be in the Jamulus client.

Does that seem logical?

bflamig commented 3 years ago

@seanogdelaney:

Your audio device latency measurements are interesting in light of the following:

I have a Scarlett 4i4 3rd Gen and a Behringer UMC404HD. Here is what the Jamulus code says the round-trip device latencies are, as returned by the devices themselves. (That is, Jamulus queries the drivers for these numbers.)

For the Windows numbers below, I set the "buffer settings" to 64 samples in the ASIO dialogs for both devices (as well as in Jamulus):

Windows ASIO:

Scarlett 4i4: 184 samples for input, 184 samples for output = 368 samples ==> 368x1000/48000 = 7.67 ms
UMC404HD: 136 samples for input, 72 samples for output = 208 samples ==> 208x1000/48000 = 4.33 ms

(1) The Focusrite 7.67 ms reported latency is what you measured (7.7 ms).
(2) The Behringer reports 3.3 ms less latency than the Focusrite. Now, a UMC222 probably will not perform as well, especially if you don't use a good driver for it. (If there is one.)

Linux Jack (Rpi) -- using Jack settings of 64 frames/period and 2 periods

Scarlett 4i4: 64 samples for input, 128 samples for output = 192 samples ==> 192x1000/48000 = 4 ms
UMC404HD: 64 samples for input, 128 samples for output = 192 samples ==> 192x1000/48000 = 4 ms

(1) So on Linux/Jack, both devices have the same latency.
(2) The latency for the Scarlett is 3.67 ms higher on Windows than on Linux -- the Linux scenario is much better.
(3) It is interesting that your measurements report 7.7 ms for the Scarlett for both OS's, when the device itself says otherwise.
(4) The latency for the UMC404HD is 0.33 ms higher on Windows than on Linux -- basically the same.
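For reference, the conversion above is just samples x 1000 / 48000, and on Windows the per-device sample counts come back from the ASIO driver. Here is a rough sketch of both, assuming an ASIO driver is already loaded and initialised; whether this mirrors exactly how Jamulus obtains its figures is not confirmed here:

```cpp
// Sketch: convert device-reported latencies (in samples) to milliseconds and
// read them back via the Steinberg ASIO SDK. Assumes a driver is loaded and
// ASIOInit() has already succeeded; not necessarily how Jamulus does it.
#include <asio.h>   // Steinberg ASIO SDK
#include <cstdio>

double samplesToMs(long samples, double sampleRateHz = 48000.0)
{
    return samples * 1000.0 / sampleRateHz;
}

void reportDeviceLatency()
{
    long inputLatency = 0, outputLatency = 0;
    if (ASIOGetLatencies(&inputLatency, &outputLatency) == ASE_OK)
    {
        std::printf("in: %ld samples (%.2f ms), out: %ld samples (%.2f ms), "
                    "round trip: %.2f ms\n",
                    inputLatency, samplesToMs(inputLatency),
                    outputLatency, samplesToMs(outputLatency),
                    samplesToMs(inputLatency + outputLatency));
    }
}
```

With the Scarlett's 184 + 184 samples this gives 368 x 1000 / 48000 = 7.67 ms, which is the number above.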

As far as Windows vs Linux performance: In my two setups -- a Windows machine and a Rpi machine, and connected to the same remote server -- I get a ping time of 16 ms for both machines, and an overall delay of 26 ms for both machines. This is after letting the jitter buffers settle down for a while (where they go to 2+2 on both machines.) So for me, Windows performs just as well as the Rpi, at least as far as the overall latency is concerned. What is different is that the Rpi has fewer dropouts -- it's pretty darn stable. This is understandable, given it is a machine dedicated to Jamulus, as opposed to my fully loaded, general purpose Windows machine, where I haven't tried to pare down what is actually running on that machine.

pljones commented 3 years ago

Does that seem logical?

It doesn't answer the question. I asked why you need more accurate timing. It doesn't even sound like you've anything showing the timing to be inaccurate - simply different between two operating systems, which isn't unexpected as they work very differently.

seanogdelaney commented 3 years ago

Bryan, thanks for that additional information.

Peter, my interest in timing is to shed some light on the origin of the performance discrepancy I reported in my first post.

I realise that Windows and Linux work differently. However, the hardware dictates the limit of performance. If the software runs slower on Windows, it should be possible to understand why. If we can understand why, perhaps we can adjust something (e.g. windows settings / policies) to improve performance, and share that information for the benefit of users.

You may not consider this a worthwhile goal, or a high priority. No problem.

pljones commented 3 years ago

However, the hardware dictates the limit of performance.

However, the operating system dictates what can be measured reliably. And that impacts "performance" -- less reliable measurement = greater buffering = higher latency. It's that simple. And it's already documented in the code.

38github commented 3 years ago

I too have compared Linux and Windows on the same computer, and Linux with a realtime setup gives me extremely stable and very low latency compared to Windows 10 and its proprietary drivers. It really is a major difference to me on my low-end laptop.

seanogdelaney commented 3 years ago

Thanks 38github. Many audio interfaces require larger buffers on windows, depending on system load. Glad to hear you found a stable solution.

Peter, thanks, I think I'm beginning to understand your point.

Are you saying that Jamulus allocates larger buffers on Windows than the buffer settings indicate? And that's necessary to deal with timing uncertainty, arising from a low resolution clock?

Or are you saying Windows is doing the extra buffering? Which must be in the networking, I guess? But the Jamulus "ping" doesn't show the extra delay because of low time resolution?

I gather from a web search that Windows does have an API for high resolution time stamps, but I guess Jamulus doesn't use that.
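One such API is QueryPerformanceCounter; a minimal usage sketch (with no claim about whether Jamulus uses it) would be:

```cpp
// Minimal sketch of Windows' high-resolution timestamp API,
// QueryPerformanceCounter. This only shows what sub-millisecond timing
// looks like; whether Jamulus uses it internally is not claimed here.
#include <windows.h>
#include <cstdio>

int main()
{
    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq); // ticks per second

    QueryPerformanceCounter(&t0);
    Sleep(5);                         // stand-in for the work being timed
    QueryPerformanceCounter(&t1);

    const double elapsedMs =
        (t1.QuadPart - t0.QuadPart) * 1000.0 / freq.QuadPart;
    std::printf("elapsed: %.3f ms\n", elapsedMs);
    return 0;
}
```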

corrados commented 3 years ago

Just a quick cross-reference:

Measured timing jitter of my Lexicon Omega on Windows: https://github.com/corrados/jamulus/pull/529#issuecomment-679068229

Measured timing jitter of my Lexicon Omega on Linux: https://github.com/corrados/jamulus/pull/529#issuecomment-679196848

You can see there is a significant difference in jitter.

seanogdelaney commented 3 years ago

Thanks Volker.

Issue solved - this was an error I made with routing cables.

In fact Windows 10 has slightly better Jamulus latency than Ubuntu 20.04 (lowlatency kernel, realtime audio) in my case. The difference is just 2ms (approx).

The difference is in the network interface configuration, and I believe the Linux latency could be improved by adjusting various network interface options to match the Windows ones. The gaming websites have some useful tips (and many useless ones).

The error: On Windows, I used 2 audio cables to route the audio through both Jamulus and the RTL Utility. On Linux, I used just 1 cable and made an extra connection in jackd. My Windows test incurs the interface buffer latency twice, whereas the Linux one incurs it just once.