supercollider / supercollider

An audio server, programming language, and IDE for sound synthesis and algorithmic composition.
http://supercollider.github.io
GNU General Public License v3.0
5.52k stars 750 forks source link

Linux timing instability #5353

Closed jamshark70 closed 3 years ago

jamshark70 commented 3 years ago

Environment

Steps to reproduce

Lately I am seeing semi-frequent, severe (performance-destroying) timing instability in SC.

Out of nowhere, messages are late:

late 0.004942441
late 0.006770537
late 0.006770537
late 0.004719433
late 0.008523114
late 0.008523114
late 0.006435532
late 0.010200291
late 0.013946215
late 0.013946215
late 0.011805406
late 0.011805406
late 0.017672947
late 0.013338161

And then suddenly it seems to get back to normal.

I suspect something is going wrong in syncOSCOffsetWithTimeOfDay(), because when it goes wrong, it seems to correct itself 20-ish seconds later. (Though, just now, SC audio got badly out of sync with VCV Rack audio and it didn't recover -- I had to abandon a screencast recording.)

This is new behavior as of the last month or two.

It's very bad. I used to trust SC onstage but this is now happening at least once in just about every live coding practice session I attempt. This damages confidence in the project.

I'll check relevant commits and revert, see what happens.

Any other ideas?

mossheim commented 3 years ago

~did this appear after you updated SC? if not, are you sure it isn't an issue with your system?~ sorry, just realized you listed the commit in your issue report. is there anything you've noticed that correlates with these late messages?

joshpar commented 3 years ago

The thing that rang a bell with this would be commit:

619fddfc2ba3d18a15b19cadbc3621001cff4b70

This had to do with the CoreAudio timing and reverting some other timing changes - however, I don’t think this is in develop yet. @James - which branch are you running a build off of?

Josh

On Feb 14, 2021, at 14:02, Brian Heim notifications@github.com wrote:

did this appear after you updated SC? if not, are you sure it isn't an issue with your system? is there anything you've noticed that correlates with these late messages?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/supercollider/supercollider/issues/5353#issuecomment-778848813, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAANHLAEJAUZMFDH43FVXTTS7BCAVANCNFSM4XTCELSA.

mossheim commented 3 years ago

James is on Linux (Ubuntu Studio), so that patch wouldn't be relevant in any case

jamshark70 commented 3 years ago

Thinking back through the history:

I'm considering hypotheses:

Otherwise, I'm puzzled.

smrg-lm commented 3 years ago
* Or, what if the OS is now more finicky about network time protocol? In August, I was not on a network -- no glitch. In December, we were using Link, so a network was required. We used a mobile hotspot; I don't recall whether I switched off mobile data. There may have been access to network time, or not. Lately, at home, I've been online. (This one is hard to guess for sure by history, because kernel updates could have changed the situation between August and now. I could try running for a few days on an older kernel.)

That is possible, iirc the algorithm trust the computer's time will not jump much and ntp should adjust the time gradually. If the host osc offset is out of sync packages may be late.

jamshark70 commented 3 years ago

Quick update -- I ran for a few days without the SC_JACK updates and I didn't see significant time glitching. That's suggestive but not conclusive.

This work week was my best time window to blow away my old Ubuntu Studio and upgrade to 20.04 before the new term, so "try older kernels" isn't gonna happen. I'm now running on the development branch (with SC_JACK updates) but I haven't yet done a longer coding session (because of configuring the new system and installing software). Current situation is unknown then, but I'll keep an eye on it.

yaxu commented 3 years ago

In my experience, unreliable wifi, e.g. of the kind found in performance venues full of electronic equipment and people using phones as hotspots, can definitely result in noticeable timing glitches from ntp. I find it best to switch off networking / ntp syncing before a performance. I wouldn't be able to say what sc version I've experienced it with.

mossheim commented 3 years ago

since there is not much to go off of here in the way of reproducibility, and it sounds like the problem was potentially due to something in @jamshark70 's environment not under SC's direct control, i propose we close this for now. maybe we can document it if it turns out to be a system or network configuration issue that reliably causes this?

jamshark70 commented 3 years ago

Yes, I'll close it for now. It's peculiar though: the environment (home WiFi) didn't change but the behavior did.

I'll reopen if I see this behavior again under Ubuntu Studio 20.04.

jamshark70 commented 3 years ago

Seen again in 20.04:

late 0.009174350
late 0.004747230
late 0.000300156
late 0.006850033
late 0.014785992
late 0.005599350
late 0.012048257
late 0.005432189
late 0.000752822
late 0.010346177
late 0.019086226
late 0.002451267
late 0.009666874
late 0.011325342
late 0.008942350
late 0.015283505
late 0.022394148
late 0.005712400
late 0.021354672
late 0.016521287
late 0.002964377
late 0.011682074
late 0.017978572
late 0.001990747
late 0.003099064
late 0.000660520
late 0.006928063
late 0.013891006
late 0.012413079
late 0.007507930
late 0.002600887
late 0.008854052
late 0.015773424
late 0.004353843
late 0.002750865
late 0.199375379  <<-- yikes, here was a really noticeable glitch
late 0.056518379

So I'll try what I did before (revert the sc_jack patch) and run it like that for a few days. Also I might try MIDI too. I'm not sure if it's sclang+scsynth, or scsynth only.

smrg-lm commented 3 years ago

@jamshark70 Ubuntu 20.04 runs a ntp server that can be deactivated with the following command to see if it helps.

timedatectl set-ntp off  # on to restore

However, with the ntp server running you can run this command in a terminal to see how system time affect Main.elapsedTime:

watch -n 1 sudo date -s"23:10:00"  # whatever time it is for you some minutes in the past for things to be late, I think.

That will change the system time that will be immediately updated by the ntp process every second. Then in sclang, from time to time, Main.elapsedTime will give some very bad numbers (elapsed time seems to go to the past and back to the future tm), I've tested with the routine below and with more or less difference between ntp time and forced time in the past. Sometimes the time server even gives up and the routine hangs for a while and then throws everything at the same time when in sync again.

(
r = Routine({
    var prev_elapsed = 0;
    var elapsed;
    loop {
        elapsed = Main.elapsedTime();
        postln([elapsed, elapsed - prev_elapsed]);
        prev_elapsed = elapsed;
        1.wait;
    };
});

r.play();
)

It may not be the timing offset as I thought but just the time read from the system. Also, in my laptop, the desktop clock takes some time before it updates to the right time after each boot, it is like the stored time or the hardware clock doesn't match the UTC offset (-3).

Edit: I can make it fail but I'm not sure how to measure it or configure the system.

jamshark70 commented 3 years ago

Ubuntu 20.04 runs a ntp server that can be deactivated with the following command to see if it helps.

OK, I'll try that next time I run a practice session (before modifying the SC binary)... or, when I'm in the office and I can run LinkClock alongside Ableton.

I think I won't concern myself too much with the edge case of a system clock that's being tampered with. The behavior you describe (of the SC scheduling queue pausing, or "catching up" with multiple simultaneous events) is what I would expect if the timebase is being altered by large amounts.

Also, in my laptop, the desktop clock takes some time before it updates to the right time after each boot...

I see this in XFCE (Ubuntu Studio) every time I wake from sleep. I think this is just a matter of the clock widget updating infrequently -- i.e. a bug in the widget where you would expect it to update the display immediately when the session is reactivated, but (I guess) it waits for the next minute boundary.

Good tip about disabling the network time server :+1:

yaxu commented 3 years ago

That fits with my experience here https://github.com/supercollider/supercollider/issues/5353#issuecomment-782921477

smrg-lm commented 3 years ago

I think this may be the issue:

https://sourceware.org/bugzilla/show_bug.cgi?id=14717 https://bugzilla.kernel.org/show_bug.cgi?id=112521

and it affect many other languages and programs.

jamshark70 commented 1 year ago

In case anyone finds this thread later: eventually I added this to my startup.scd:

if(Platform.ideName != "none") {
    fork {
        var cond = CondVar.new;
        var ntpStatus = false;
        ShutDown.add { "timedatectl set-ntp on".unixCmd };
        "timedatectl set-ntp off".unixCmd { |exit|
            if(exit == 0) {
                ntpStatus = true;
            };
            cond.signalAll;
        };
        cond.waitFor(1.0, { ntpStatus });
        if(ntpStatus.not) {
            "set-ntp off failed".warn;
        };
    };
};

... to disable ntp when running SC interactively, and re-enable it upon quit.