jmamma / MCL

MCL firmware for the MegaCommand MIDI Controller.
BSD 2-Clause "Simplified" License
53 stars 9 forks source link

Defer rendering. #133

Closed jmamma closed 3 years ago

jmamma commented 4 years ago

When DEFER_SEQ is enabled, the sequencer is rendered into one of two MIDI ring side_channel buffers.

On the next Seq() event the MidiUart.txRb_sidechannel is pointed to the sequencer buffer rendered on the last Seq() event. This triggers the UART to drain the buffer until it is empty, at which point the side_channel buffer is disabled and the UART reverts to transmitting data from the main MidiUart.txRb buffer.

Two side channel buffers are used such that the sequencer can render to one, whilst the other is drained. On the next Seq() run the buffers are swapped.

Why?

In theory, this should reduce latency between sequencer events.

Issues:

The side channel buffer is drained as the adjacent sequencer buffer is rendered. This may still cause latency due to context switching.

When the sequencer is stopped or paused, we must re-render the sequencer prior to start/continue.

RAM usage. 2 side channel buffers are used in addition to the main TX buffer.

jmamma commented 4 years ago

It looks like this works fine.

Compile size is 732 bytes extra when DEFER_SEQ is defined.

--

@yatli We need to profile the performance of this mode. But it means that we should no longer be worried about seq() execution time... can pack as many tracks in as we like, provided they render before the next tick.

jmamma commented 4 years ago

Sequencer completion time + jitter profiling:

https://github.com/jmamma/MIDICtrl20_MegaCommand/pull/125#issuecomment-671036024

jmamma commented 4 years ago

31250bps/ 10 = 3125 Bps 3125 * 8 = 25,000 Bps @ 8x TURBO

1/25,000 * 1000 = .04 ms to transfer 1 byte

Trig track = 3 bytes * 0.04ms = 0.12ms

Trig 16 tracks = 1.92 ms.

jmamma commented 4 years ago

Scope is measuring 1.860ms to transmit 16 tracks. 0 jitter.

Woah.

jmamma commented 4 years ago

MD crash/lockup scenario: Trig sixteen tracks then rapidly adjust Mixer levels. Can't replicate with #DEFER_SEQ disabled.

Deferred buffers are probably swapping in before complete MIDI message

jmamma commented 4 years ago

I just overhauled how MCL handles MIDI byte ordering. The MD is very sensitive to this and will crash/lockup if it receives incomplete messages.

Because the Seq side_channel Buffers are independent of the main TX buffer there is no longer the risk of the sequencer ISR inserting data mid message stream. We can now remove USE_LOCK/SET_LOCK in some of the message transmit routines.

There was one additional problem. We must not switch to the side chain buffer when the main buffer is in the middle of a message. So I added code to detect message transmission completion. This now resides in the transmit ISRs.

jmamma commented 4 years ago

Final scope measurement to trig 16 tracks:

1x TURBO: 9.6ms 2x TURBO: 7.3ms 4x TURBO: 3.68ms 8x TURBO: 1.852ms

At 128BPM
B = 1 beat is ((1/(128 / 60)) = 0.46875 seconds long
Edit: calcs were wrong.
(B / 4) / 6 = 19.5ms per MIDI_CLOCK
9.75 ms per div192.

At higher tempos and low TURBO speed, there is the possibility that the side_channel buffer does not drain completely before the swap. Will need to add in code to copy contents of one buffer into the other at start of Seq()

jmamma commented 4 years ago

Realtime vs Deferred rendering

I've been analysing seq() performance on ext_conv and defer rendering branches. The goal is determine whether this new implementation is better.

Our main concerns are latency and jitter.

Latency: The delay between the clock event, and the resulting output of the sequencer. (trig vs response) Jitter: The variation in timing between the output of adjacent sequencer events.

In this analysis we will be looking at Jitter. Specifically the jitter between 8 MIDI note messages occuring on the same sequencer step.

Test method: Trigger 8 tracks simultaneously on every step. Toggle port output and observe timing on scope.

ext_conv test:

  for (uint8_t i = 0; i < num_md_tracks; i++) {
    PORTB |= (1 << PB5)
    md_tracks[i].seq(uart)
    PORTB &= ~(1 << PB5);
  }

md_tracks[i].seq(uart) will usually complete in approximately 12us with 1us of jitter. sporadic variations in completion time were observed at 100-200+ us.

defer_rendering test:

383     if (!MidiUart.txRb_sidechannel->isEmpty_isr()) {
384       MidiUart.sendActiveSenseTimer = MidiUart.sendActiveSenseTimeout;
385       UART_WRITE_CHAR(MidiUart.txRb_sidechannel->get_h_isr());
386       //PORTB |= (1 << PB5);
387     }
388     if (MidiUart.txRb_sidechannel->isEmpty_isr()) {
389       //PORTB &= ~(1 << PB5);
390       MidiUart.txRb_sidechannel = nullptr;
391     }
392   } else {

Rendering time is not relevant here as the sequencer data is pre-rendered in to a transmit side_channel buffer.

In this test we measure the time required to drain the entire transmit side_channel buffer. The scope shows a consistent measurement, with no variation. This implies no jitter.

--

Discussion:

With the original real time sequencing method, data is rendered and then immediately written to the MidiUart.tx buffer. The buffer is drained via ISR trigger, and will continue to empty asynchronously during the remaining run of the sequencer. That is, as soon as data is written to the MidiUart buffer, it will be transmitted at a rate according to the port speed.

Jitter will only become a noticeable problem if rendering time is greater than the MIDI message transmission time. If the sequencer event can't complete before the transmission of the last MIDI messages then there will be a duration of time in which the transmit buffer is empty, leading to jitter between sequential message transmissions.

At 8x turbo it takes (0.4ms * 3) 120us to transmit a note_on message.

From oscilloscope observation it appears that the rendering time of md_tracks[i].seq(uart) is unpredictable. It can take up to 200us, which exceeds the 120us threshold.

At 1x turbo it takes 960us to transmit a note_on message. As such rendering time should have no effect on MIDI message jitter here.

Further Considerations:

Longer rendering time increases latency for transmission of Ext tracks. ExtTracks are rendered after MDTracks, and thus their transmission start time depends on the completion time of previous sequencer events.

The deferred rendering approach will drain both UART1 and UART2 side channel buffers simultaneously eliminating this latency.

jmamma commented 4 years ago

Phase shifted when syncing to A4 on UART2. I suspect the A4 is transmitting midi start + midi clock at same time. This means there is no time for the sequencer to render the first tick.

we will need to pre-render midi_stop condition.

yatli commented 4 years ago

Does the measurement reading align with what we get from the diagnostic page?

jmamma commented 4 years ago

Which measurement reading?

jmamma commented 4 years ago

Too much additional code/complexity to pre-render both MIDI_START and MIDI_CONTINUE possibilities.

Easier method is to always sequence the first tick in realtime on MIDI START.

--

Phase shift is eliminated...

yatli commented 4 years ago

Which measurement reading?

The time between two pin toggles.

jmamma commented 4 years ago

I was able to measure the non-deferred/realtime sequencer completion time vs the transmission time of the deferred buffer.

The deferred buffer transmission time is always constant, whereas the non-deferred/realtime sequence competition time fluctuates (as seen in our earlier measurements).

I then postulate that the fluctuations in realtime sequencer performance would only be relevant if they exceed the MIDI message transmission time, as dictated by the port speed.

In other words, if the sequencer code for one event is fast enough that it completes before the MIDI message is transmitted, then the effects of sequencer jitter (variations in completion time of seqeuncer events) would be negated.

yatli commented 3 years ago

I think we should change merge target to dev? piano_roll has been merged.

jmamma commented 3 years ago

done.

jmamma commented 3 years ago

project conversion is failing in this branch when debug mode is off.

new_project_master_file(char *projectname); does not execute/complete?

352 DIAG_PRINTLN() and the bug goes away

😭

jmamma commented 3 years ago

I'm now getting memory corruption after new project creation. Can replicate with DEBUG on/off

 71     DEBUG_PRINTLN("Here");
 72     DEBUG_PRINTLN(newprj);
 73     if (!new_project(newprj)) {
 74       goto again;
 75     }
 76     DEBUG_PRINTLN("There");
 77     DEBUG_PRINTLN(newprj);

WTF

13:24:49.857 -> Here
13:24:49.857 -> project264  <--- LINE 72
13:24:53.714 -> project264.mcl <--- println(newprj) before return statement in new_project().
13:24:53.714 -> There
13:24:53.714 -> �⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮  <----- LINE 77 

It's like the SP is not returning to the right place after the call to new_project()

jmamma commented 3 years ago
13:52:47.979 -> Here
13:52:47.979 -> project667
13:52:47.979 -> 8666 <- (uint16_t) SP
13:52:47.979 -> 8681 <- (uint16_t) *newprj

13:52:51.272 -> There
13:52:51.272 -> ⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮
13:52:51.272 -> 8666  <- (uint16_t) SP
13:52:51.272 -> 233  <- (uint16_t) *newprj

Compiler is changing the memory address of newprj ?????

jmamma commented 3 years ago

Tracked down to unsafe use of strcat ? and string array length being shorter than they should be.

jmamma commented 3 years ago

Both dev and defer_rendering branches have some issue with respect to chain mode.

jmamma commented 3 years ago

I am mostly confident that these issues are now resolved.

Will do more testing tomorrow.

Chainmode bugs == nofun.

jmamma commented 3 years ago

Small instability at 1x MIDI. On occasion when manual chanining, MD will change pattern due to corrupt MIDI data. Probability is in the order of 1/50 chains.

1x performance is sluggish on dev branch, but performs well in this branch otherwise.

jmamma commented 3 years ago

I couldn't find any evidence of program change message being transmitted as a result of truncated MIDI message.

However I did notice quite a few 0xFF (MIDI RESET) messages being interleaved between control channel messages which I traced to certain track's parameter locks. I also saw a single 0xFF transmitted at the end of active peering??

I've put a 0x7F filter on sendMessage and these have now cleared up.

What's strange is that my now converted projects sound different to their original. So we must have been transmitting bad lock data all this time... Need to work out how to make the lock data sound right again.

Still testing stability.

jmamma commented 3 years ago

another thing I noticed. 0xFE active sense is not sent in 1x mode. If you switch to 8x turbo and then switch down to 1x, active sense messages are still sent.

edit: fixed.

jmamma commented 3 years ago

I am mostly confident that these issues are now resolved. Bad machine transmission is still happening at higher tempos.

Unexpected program change is happening at 1x TURBO. it's likely a bug on the MD end, as the MIDI output coming from the MC is not truncated.

jmamma commented 3 years ago

What's strange is that my now converted projects sound different to their original. So we must have been transmitting bad lock data all this time... Need to work out how to make the lock data sound right again.

traced to bad lock conversion in project conversion.

jmamma commented 3 years ago

8d7ca7f probably not true.

might have to abort this branch as it is too unstable for release.

jmamma commented 3 years ago

So this becomes stable when I drop down to 4x Turbo. I'm starting to think the problem is on the MD side.

jmamma commented 3 years ago

Instability is due to chaining across the FX data.. these are sequential SYSEX messages.

jmamma commented 3 years ago

still unstable after memory slot refactor.