ncassetta / NiCMidi

A MIDI C++ library with objects for reading, writing, playing, editing and recording midi files
GNU Lesser General Public License v3.0
2 stars 1 forks source link

Hunting for a bug that causes frequent loss of Midi input messages #13

Closed goofy2k closed 2 years ago

goofy2k commented 2 years ago

I am hunting a bug that causes "loss of midi messages" or, maybe better, "unhandled midi messages". They may very well have to do with my implementation of midi input, midi_output or thru.cpp. Still I post this here as you may have useful advice or (less expected) it helps to improve your code.

High level characteristics:

Missing note-off or note-on commands on my synth board.

On the NicMidi side the fingerprints of the errors are different. Here I describe only the missing note-off case as it is easier to detect (by listening) and reproduce (in the midi-thru example that accepts key-presses over Midi In (implemented with MQTT).

MISSING NOTE-OFF

These are the very last log lines. They all come from driver.cpp.

I (52701) HardwareMsgIn (driver.cpp): !msg.IsNoOp BEFORE PROCESSOR 0 : Ch 0 NOTE OFF Note 41 Vel 0 I (52711) HardwareMsgIn (driver.cpp): !msg.IsNoOp NO action by PROCESSOR required E (52711) HardwareMsgIn (driver.cpp): added message to in_queue Got message, queue size: 2

No further logs until a next command is sent. This means that in the next calls of the thru component an empty queue is detected. Otherwise thru would create a "Message found" log and some more during handling of the message

My interpretation / analysis:

In summary, my guess is that NiCMidi is interupted AFTER starting the handler loop in MIDIThru (thus handling the queue with only one message), but BEFORE flushing the queue.

It may well be possible that when a message arrives over MQTT, the MQTT event interrupts the thru tickproc. I don't know (yet) which mechanism is used by the MQTT client and if the locking in the MIDIThru::TickProc with
proc_lock.lock(); MIDIInDriver* in_driver = MIDIManager::GetInDriver(in_port); in_driver->LockQueue();
is sufficient.

I would appreciate your comments / suggestions.

goofy2k commented 2 years ago

I studied the recursive_mutex used in driver and thru and tried to use try_lock instead of lock in driver and thru in several combinations. Without success.

Now I am thinking of the following direction for a solution:

The current handler in thru.cpp uses a for loop over the number of messages in the input queue. If my above analysis is correct, it could happen that a note-off (or any other command) is added to the cue while the for loop is running for the old number of messages in the queue (mostly 1). What if you used a while loop and add a "handled" flag to each message? The loop then could run while there are still unhandled messages in the queue. As the number of Tick components may vary, a flag for each of them should probably be used.

It still puzzles me how a message can be added to the queue when thru has blocked it with in_driver->LockQueue(). Or how MIDIThru::TickProc can be interrupted when it is blocked by proc_lock.lock().

goofy2k commented 2 years ago

I have thought about the suggested solution with flags and think that I will give it a try.

For each message I can create a flag "fresh message" when the message is inserted into the input queue. In manager, before entering the loop over tick components it is first checked if a message is fresh. If so, for the message a set of flags is created for each tick component. One of these flags is set when the message is handled by the corresponding tickproc.

In manager, messages are handled in the sequence of tickprocedures, but only if not already done. The for loop over messages (and maybe the for loop over tickprocedures) is replaced by a while unhandled loop.

Even if the root cause (and solution) is in mutexes/locking, this exercise may help me to find the root cause.

goofy2k commented 2 years ago

Which is not that easy :-(

I'll try to find out why (and if) an incoming message can be added to the in_queue while the queue is handled and should be locked .

ncassetta commented 2 years ago

I think what you say could happen. MIDIManager::TickProc() does not lock the MIDIInDriver queue and therefore a message may arrive after the end of MIDIThru::TickProc() and may be discarded at the end. A quick and dirty test could be this: 1) Use the sample file test_thru.cpp (so there is only one MIDITickComponent)

2) In the file thru.cpp: 198 add the line:

    for (unsigned int i = 0; i < in_driver->GetQueueSize(); i++) {
        std::cout << "Message found\n";
        in_driver->ReadMessage(rmsg, i);
        msg = rmsg.msg;
        if (msg.IsChannelMsg()) {
            if (in_channel == msg.GetChannel() || in_channel == -1) {
                if (out_channel != -1)
                    msg.SetChannel(out_channel);
                if (processor)
                    processor->Process(&msg);
                out_driver->OutputMessage(msg);
            }
        }
    }
    in_driver-> FlushQueue()    // ADD THIS
    in_driver->UnlockQueue();

(so the queue is flushed by the MIDIThru)

3) Comment the lines in MIDIManager::TickProc() which flush the queue:

    void MIDIManager::TickProc(tMsecs sys_time, void* p) {
        if (!init)
            return;
    proc_lock->lock();
    for (unsigned int i = 0; i < MIDITicks->size(); i++) {
        MIDITickComponent* tp = (*MIDITicks)[i];
        if (tp->IsPlaying())
            tp->GetFunc()(sys_time, tp);
    }

    //for (unsigned int i = 0; i < MIDI_ins->size(); i++)
        //if ((*MIDI_ins)[i]->IsPortOpen())
            //(*MIDI_ins)[i]->FlushQueue();
    proc_lock->unlock();

    //std::cout << "MIDIManager::TickProc" << std::endl;
}

So the queue is flushed by the thru and not by the manager. Obviously this surely gives problems if there are more than a MIDITickComponent (for example a thru and a recorder) or if the components are off

ncassetta commented 2 years ago

The right way should be to put a flag to the MIDIRawMessage class which indicates how many times it has been read. But this should take into account how many "readers" are on in the manager, and so also the MIDITickComponent and MIDIManager classes should be restructured (the MIDIThru::Start() method should tell the MIDIManager "there is a new message reader", and the Stop() "the reader has stopped". I'll try this next week

goofy2k commented 2 years ago

Thanks.

Re you very last message:

I indeed started to put flags into the MIDIRawMessage class. But then I noticed that the flags should have an identifier in it to remember which MidiTick component already processed it (so a map, rather than a vector for storing flags). The map keys remove the need to know how many and which Tick components are in the stack. Each component can then decide on it's own if it still needs to process a message. Every MidiTick component should be adapted to to set this flag when it encounters a message. A potential issue would be that, with a while loop, the manager loop could become endless when messages arrive very frequent.

Re the message before the very last: I will try this. Not sure if this is sufficient.

I was also thinking to create a specialised MidiTick component that flushes the queue. This should always be called as the last one in the stack. However: this is probably equivalent to flushing the queue in the manager as it is done now.

ncassetta commented 2 years ago

Another simple preliminary test you could do for ensuring that you have identified the problem could be to lock/unlock the queue in the MIDIManager::TickProc(): perhaps this could give lesser recording quality, but could solve the problem of lost messages:

void MIDIManager::TickProc(tMsecs sys_time, void* p) {
    if (!init)
        return;
    proc_lock->lock();
    for (unsigned int i = 0; i < MIDI_ins->size(); i++)    // ADDED THESE
        if ((*MIDI_ins)[i]->IsPortOpen())                 //
            (*MIDI_ins)[i]->LockQueue();                  //

    for (unsigned int i = 0; i < MIDITicks->size(); i++) {
        MIDITickComponent* tp = (*MIDITicks)[i];
        if (tp->IsPlaying())
            tp->GetFunc()(sys_time, tp);
    }

    for (unsigned int i = 0; i < MIDI_ins->size(); i++)
        if ((*MIDI_ins)[i]->IsPortOpen()) {
            (*MIDI_ins)[i]->FlushQueue();
            (*MIDI_ins)[i]->UnlockQueue();           // ADDED THIS
        }
    proc_lock->unlock();

    //std::cout << "MIDIManager::TickProc" << std::endl;
}

So non new message between the MIDITickComponent::TickProc() edures and the queue flushing

goofy2k commented 2 years ago

This last one certainly helps! Because I use Nodered for sending Midi messages I can control the duration of key-presses well. Without this addition I had frequent message losses when playing well-separated key-presses with 250 ms duration. At 500 ms there were losses now and then. With the addition I can now play notes with a duration of 25 ms and 10 ms !

This is all in midi thru mode, no other Tick components. I will do some testing with the advanced sequencer example and recording too. Of course there are no incoming messages with the sequencer example, but you never know where the pitfalls are ;-)

[EDIT] No problems encountered with the advanced sequencer and recording examples! I recorded notes with a set duration of 10 ms! I did not analyze how well this setpoint translates in real durations. For short durations there may be some variation in the actual duration.

goofy2k commented 2 years ago

No more problems lately!