node-dmx / dmx

DMX controller library for node.js
MIT License
297 stars 96 forks source link

Realtime animations #83

Closed chetbox closed 4 years ago

chetbox commented 5 years ago

We've been using this library for DMX animations on LED tape and it's great! That said we found some shortcomings in DMX.Animation when it comes to driving DMX output in realtime. We need the ability to write a complex animation, that may or may not run many times and to know that it's exactly synchronised to the system clock. This allows coordination with videos, music, etc. With the current state of this library this is not possible: Animations seem to go on much longer than they should and there are a number of reasons for that.

These changes allow for much more reliable realtime operation in a way that should not affect existing users of the library. Please give it a try.

Timing issus

  1. Firstly, timing is measured by counting how many times we've run an "animation step". Each step may take a different amount of time so this is not a reliable way of measuring progress in the animation.
  2. setInterval is not a reliable way to run an operation at a known frame rate. In fact Javascript doesn't have a particularly great way of running something at a precise interval at all. The use of setInterval causes animations to "drift" in time further.
  3. Adding multiple "steps" (i.e. calling animation.add() multiple times) leads to time drift because we start a new animation from the time at which we happen to be starting execution of that step, not at the time is should have started based on the time the whole animation sequence started.
  4. Running the animation in a loop with .runLoop also causes drift due to the use of setInterval and also starting the whole animation sequence from the time the loop started.
  5. .add() defaults to a duration of 1ms with associated overhead of setting up setInterval, adding to the time drift.
  6. When writing to a serial port or socket, the operating system buffers any pending any writes and sends them when the previous data has been successfully sent. We found that even with the above issues fixed, the operating system buffered the writes and therefore continued to run the animation more slowly than requested after the program has stopped, even for relatively short animations. For drivers which send data only when updated, this is made worse by the library attempting an animation frame every millisecond, which means a lot of write operations get buffered and the OS catches up later.

Improvements

  1. Time is now measuring the elapsed time from a "start time" which is at the beginning of the whole sequence. During an animation loop we may complete multiple animation steps. This is now checked during each loop, ensuring that the final state of each completed animation step is sent if has not been already.
  2. setTimeout is used recursively instead of setInterval because this guarantees we are not running multiple callbacks in "parallel". There is some overhead so the loop will run slightly slower than the rate we request.
  3. Calling .add() records when the animation should start and end relative to initial "start time".
  4. Looping is now done by calculating the new "start time" from the previous start time and total duration of the loop. You can specify the target FPS (frames per second) which prevents the loop running more often than is necessary. (For our purposes 40 FPS works well.) The default is 1000 fps as before.
  5. Calling .add({1:0}) with no duration now defaults to a duration of 0. Multiple .add(...) operations are added as separate steps but executed together, along with any previous end state of an animation step that has just finished.
  6. Each driver now has the ability to skip a write if the the previous write is not complete. The end state of each step is never skipped, to ensure the final states of each channel are correct, however intermediate states during an animation are allowed to be dropped to so that if writes are running too slowly they can be caught up. Those drivers that do not use setInterval now have an extra option to the "update" methods called skipIfBusy. Update: All drivers now use setInterval but now each update checks that there are no pending DMX writes. If the previous write has not completed the current one is skipped and a future write updates all states.

I would recommend refactoring the drivers that use setInterval to work in a similar fashion to the others to make them more robust to this change. I only have an Enttex DMX Pro and I'm not sure if the other drivers are required to work differently so I've left the logic as is, however to aid realtime communication they now will not write if there is any pending write.

Other new features

You can adjust the frequency of the animation loop by specifying a target FPS:

You can adjust the refresh rate of each DMX driver using the dmx_speed option:

const dmx = new DMX();
const universe = dmx.addUniverse('lights', 'enttec-usb-dmx-pro', '/dev/tty.usbserial-EN257964', { dmx_speed: 40 });
new DMX.Animation()
    .add(...)
    .run(universe);
// As an optimisation "run()" will automatically adjust the refresh rate
// of the animation based on the refresh rate of the driver used

You can loop the entire animation a specific number of times:

new DMX.Animation({ loop: 4 });
// or
new DMX.Animation()
    .add(...)
    .runLoop(universe, () => { /* finished */ }, 4);

Some very simple Jest tests have been added. The plan was to add many more to test a whole lot of edge cases but working with Jest timers proved to be problematic and we had to opt for pragmatism.

Fensterbank commented 5 years ago

Hi @chetbox,

thank you so much for your awesome work. Before the pull request starts to get rotten, I've had a better look at it now. I just fixed some linting errors which have been introduced in some of the commits, so the tests are passing again.

Now I have to admit that I don't have a working DMX installation currently at hand, so I cannot test the changes properly.

But all your mentioned issues are obviously correct and I like the solutions you implemented. Since we are still below 1.x and as you said there are no relevant API changes, I think we could merge this and release it as 0.3. I think it may influence existing animations people have written in the past for their solutions, but as I understood the animation would still work, just "better". But of course it can influence existing animations in an unexpected way.

But yeah, this pull request makes a lot of sense to me and if this library cannot be used for realtime animations at the moment, it should definitley be fixed.

I assume, you are using the code of your branch already in production without issues? @wiedi and @bluemaex, any thoughts on this?

chetbox commented 5 years ago

Thanks for your feedback @Fensterbank, and thanks for fixing up the linting errors!

Yes, we're using this in production with a couple of clients and it works well.

If you're interested and have the appropriate hardware, I put together an example Electron app with an audio visualiser that's designed for LED strips. It seems as responsive as you would expect.

wiedi commented 5 years ago

Sorry for taking this long. It's a huge change and I haven't gone over it all just yet.

I feel your pain in timekeeping with javascript. It has been a problem in the past, especially with cheaper hardware where the imprecision is hard to work around.

But I don't think the design of this change is the right approach just yet. I see multiple problems, for a few I have possible solutions, for some I don't.

I hope to find some time soon to look a bit deeper into your changes and to better understand them. Maybe we can come up with a refined solution.

Thanks for working on this.

chetbox commented 5 years ago

Sorry for taking this long. It's a huge change and I haven't gone over it all just yet.

No problem, thanks for taking a look. I genuinely believe this change makes this library much more usable for many cases and am happy to take on board feedback.

I feel your pain in timekeeping with javascript. It has been a problem in the past, especially with cheaper hardware where the imprecision is hard to work around.

But I don't think the design of this change is the right approach just yet. I see multiple problems, for a few I have possible solutions, for some I don't.

  • setTimeout instead of setInterval is probably fine. If you have a process that drifts because of delays it might be possible to measure how much it drifted relativ to what it should have been and account for it in the next iteration.

I see what you mean here and, while I think it's possible to measure the drift and adjust the setTimeout delay, I'm not sure we gain a huge amount from doing that with this new design. The major change here is that during each callback the DMX state is calculated based on the original start time so drift is no longer an issue.

The main reason for changing setInterval to setTimeout is only to prevent the possibility of multiple setInterval callbacks running in parallel, which can happen if the callback runs for longer than the interval period. This isn't an issue in the drivers because we have a readyToWrite flag that acts as a mutex lock.

In practice, even with any overhead, the setTimeout recursion may run marginally slower than the requested interval, and may even drift a little (although all operations on drivers should be very fast because they're non-blocking) but the actual DMX values should be correct because of the way they're now calculated based on the start time so it make very little difference. Additionally, given this tick-based driver design, by default the animation update rate is double the rate of the driver update rate to ensure drift does not result in losing any DMX state updates due to the phase of the animation interval vs. the driver interval.

  • The thing I mostly disagree here is changing all drivers to be tick based. This is a step backwards from the tickless design. Timekeeping for animation steps should be done in the animation framework, with a defined resolution. This resolution defines how long each animation step takes. This might be an option when creating the animation. Maybe the drivers should expose a property so the animation framework can query for the highest supported update frequency or maybe a good enough resolution that works for all devices can be found. I'm not sure yet what the answer should be here.

I don't really like the tick-based drivers either, but this project contains some drivers that are tick-based and some that simply send DMX data over the serial port immediately. The latter is a major flaw of the current design because the operating system buffers serial port writes while another write is in progress. The tick-based pattern doesn't suffer from this problem and is more "realtime" as long as writes are skipped if the serial port is busy. This is quite a small change to the tick-based drivers and required changing the others to the same pattern. I opted to change the drivers to the tick-based design because this seemed like the smallest change that achieved the goal of "realtime" behaviour.

I'd prefer to rewrite the drivers to simply remember any pending writes and send them when possible, relying on the checking if the serial port write is complete to naturally determine the maximum update rate, however given I don't have all the appropriate hardware and there aren't any tests I'm a little wary of changing the structure of the them too much in case I break something.

I'm happy to attempt this change and ask folks to test on the relevant hardware if we think that's worth doing, or perhaps we could do that as a separate change later unless we believe there are any regressions or design decisions I've made that make this less reliable or usable?

chetbox commented 4 years ago

Hi folks, are there any further thoughts on this? Has anyone tried taking this for a spin?

You can install it with npm install github:chetbox/dmx#realtime-animations. Existing users of this library should not need to change any code and should notice the timing of their DMX animations be more reliable.

chetbox commented 4 years ago

Hi @Fensterbank, any further thoughts or testing on this? I believe it should be backwards compatible for all folks already using the library but with more reliable timing.

I've rebased on top of master and resolved conflicts to make sure it's mergeable.

scredii commented 4 years ago

@chetbox thanks for your work, your branch fix all my problems!

chetbox commented 4 years ago

Great to hear, @scredii!

@bluemaex, @wiedi, @Fensterbank, any more thoughts on merging this?

Fensterbank commented 4 years ago

Hi @chetbox,

as you can probably imagine, none of us had time to redesign approaches and rethink things. Just like it is usual with voluntarily managed open source projects. ;)

Since several people including your clients are using your branch and it is obviously working without compatibility flaws for existing users, I just talked with @wiedi and we decided to just merge it.

A new npm release is coming tomorrow or so. Sorry for the delay and thanks again for your work! 🍪🍪🍪

chetbox commented 4 years ago

Thanks, that's much appreciated. We're using this library more and more with a few different drivers now. We will of course PR any further fixes or improvements back.