dworkin / dgd

Dworkin's Game Driver, an object-oriented database management system originally used to run MUDs.
https://www.dworkin.nl/dgd/
GNU Affero General Public License v3.0
103 stars 31 forks source link

Callout timing off (as of May 4, 2020) #29

Closed shentino closed 4 years ago

shentino commented 4 years ago

Problem: Callout timing is messed up for callouts, they don't fire when they're supposed to.

I noticed my DumpD repeatedly noticing its lag spiking over its interval, and then ran a test with my ansi demo, and sure enough callout expiry times are getting hinky.

As a test, I cold booted kotaka and ran the same test, same result after a bit. Cold booting kotaka with the previous config::dsize fix version showed no problems.

It didn't crash so no stack trace is available.

To reproduce, use kotaka or another mudlib to start something in a repeating loop of callouts, cold booted to keep things simple. For kotaka I used the "ansidemo matrix" command

dworkin commented 4 years ago

As usual you are not giving us all the information. How is callout timing wrong? How does the ansidemo stuff demonstrate this? What am I looking for?

shentino commented 4 years ago

I thought I did, but I guess I suck at wording.

Ok, so what I mean is that callouts aren't expiring when they're supposed to. Expiration times are off, usually late, sometimes by several seconds.

The ansidemo thing runs a tight loop to display one frame of ansi encoded artwork every 1/60th of a second, and with the newest DGD it's glitchy. Even with a mostly unloaded machine, the framerate is erratic, and I noticed that my DumpD was off schedule by more than the configured dumping interval, which the DumpD detects when it compares its scheduled expiration time (passed as an argument after giving a delay of that minus now) with the actual time() on entry.

In contrast, the previous version of DGD usually fires callouts within milliseconds of their scheduled expiration.

Example: I set a callout to expire with a delay of 0.033, but it might expire in 1 or 2 whole seconds later. And for some reason my DumpD, which has an interval of 10 seconds, gets lagged for in excess of 40 seconds and notices that it expired way way late.

I noticed that the glitchy timing seems to have some sort of nexus to whole seconds.

shentino commented 4 years ago

Also I just noticed that reproducibility appears to be intermittent.

shentino commented 4 years ago

Ok so here's an example of a bad expiration: My dumpd callout, scheduled for 5 seconds at 4:40:20, expired 11 seconds later on 4:40:31.

I also noticed that sending I/O tends to "shake it loose" so maybe it's getting a bad timeout somewhere to supply to the select() call

shentino commented 4 years ago

Ok here's a really bad one, over a whole minute late.

Callout issued at 4:43:55 with an expiry of 5 seconds, doesn't go off until 4:45:01

dworkin commented 4 years ago

Should be fixed by https://github.com/dworkin/dgd/commit/30e976fd35b42fa50b8edf30a9308fe4c2694a06.

shentino commented 4 years ago

Fix confirmed.

shentino commented 4 years ago

I guess I can just call you Fix-it Felix now