mrrwa / LocoNet

An embedded Loconet interface library for Arduino family microcontrollers

Missed messages on Arduino Mega but not Uno. #23

Closed habazut closed 3 years ago

habazut commented 3 years ago

I am using version 1.1.4 of the library. Hardware is the Fremo LN Shield.

I am writing my own code to populate a slot table and receive messages from a Digitrax UT4. My tests indicate that I receive all LocoNet messages when I run the code on an Uno. When I move the same shield to a Mega, I seem to miss some messages, typically the ones that the UT4 appears to send directly as a reply: the move slot (set slot in use) and the write slot messages. On a LocoBuffer that logs to JMRI I can see the messages, so they are sent by the UT4. The RxError counter increases at the same time, so I think the sw-uart behaves differently between Uno and Mega. Could it be too slow to turn around from TX to RX? But why would that differ on the Mega?

Typical capture:

01:57:18.996: [BF 00 2C 6C]  Request slot for loco address 44 (short).
01:57:19.068: [E7 0E 02 03 2C 00 00 07 00 00 00 00 00 3C]  Report of slot 2 information:
    Loco 44 (short) is Not Consisted, Free, operating in 128 SS mode, and is moving Forward at speed 0,
    F0=Off, F1=Off, F2=Off, F3=Off, F4=Off, F5=Off, F6=Off, F7=Off, F8=Off
    Master supports LocoNet 1.1; Track Status: On/Running; Programming Track Status: Available; STAT2=0x00, ThrottleID=0x00 0x00 (0).
01:57:19.076: [BA 02 02 45]  Set status of slot 2 to IN_USE.
01:57:19.104: [E7 0E 02 33 2C 00 00 07 00 00 00 00 00 0C]  Report of slot 2 information:
    Loco 44 (short) is Not Consisted, In-Use, operating in 128 SS mode, and is moving Forward at speed 0,
    F0=Off, F1=Off, F2=Off, F3=Off, F4=Off, F5=Off, F6=Off, F7=Off, F8=Off
    Master supports LocoNet 1.1; Track Status: On/Running; Programming Track Status: Available; STAT2=0x00, ThrottleID=0x00 0x00 (0).
01:57:19.115: [EF 0E 02 33 2C 00 00 07 00 00 00 7F 75 0E]  Write slot 2 information:
    Loco 44 (short) is Not Consisted, In-Use, operating in 128 SS mode, and is moving Forward at speed 0,
    F0=Off, F1=Off, F2=Off, F3=Off, F4=Off, F5=Off, F6=Off, F7=Off, F8=Off
    Master supports LocoNet 1.1; Track Status: On/Running; Programming Track Status: Available; STAT2=0x00, ThrottleID=0x75 0x7F (15103).

The Mega mostly misses the "BA", and if it gets the "BA" it almost always misses the "EF" :-(
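When comparing what the Mega received against the LocoBuffer log, it can help to check each captured frame mechanically. The following is a minimal sketch, not code from the library: the helper names `lnFrameLength` and `lnChecksumOk` are made up for illustration. It assumes the usual LocoNet framing, where bits 6..5 of the opcode give the frame length and the XOR of all bytes, checksum included, is 0xFF, which holds for every frame in the capture above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: expected frame length from the opcode's length bits.
// Bits 6..5: 00 -> 2 bytes, 01 -> 4 bytes, 10 -> 6 bytes,
// 11 -> variable length, with the byte count carried in the second byte.
std::size_t lnFrameLength(const std::uint8_t *frame, std::size_t available) {
    if (available == 0) return 0;
    switch ((frame[0] >> 5) & 0x03) {
        case 0: return 2;
        case 1: return 4;
        case 2: return 6;
        default: return (available > 1) ? frame[1] : 0;  // length not known yet
    }
}

// Hypothetical helper: a frame is intact when the XOR of all its bytes,
// including the trailing checksum byte, equals 0xFF.
bool lnChecksumOk(const std::uint8_t *frame, std::size_t len) {
    std::uint8_t x = 0;
    for (std::size_t i = 0; i < len; ++i) x ^= frame[i];
    return len >= 2 && x == 0xFF;
}
```

A frame that fails either check was probably damaged by a missed or late bit sample, which would be consistent with the RxError counter going up.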

Current state of the code is https://github.com/habazut/CommandStation-EX/tree/79b12fbb7f603299f59dc0548ee931e51d3be1b9 and the relevant stuff is in LNet.cpp.

I need this to move to the Mega as Timer1 is used by something else in this project, so the tests on the Uno are just tests for the LocoNet functionality, nothing else.

Regards, Harald.

PS: While looking around I found:

#ifdef LN_INIT_COMPARATOR
  LN_INIT_COMPARATOR(); 
#else
  // First Enable the Analog Comparitor Power, 
  // Set the mode to Falling Edge
  // Enable Analog Comparator to Trigger the Input Capture unit
  // ACSR = (1<<ACI) | (1<<ACIS1) | (1<<ACIC) ;

  // Turn off the Analog Comparator
  ACSR = 1<<ACD;
  // The noise canceler is enabled by setting the Input Capture Noise Canceler (ICNCn) bit in 
  // Timer/Counter Control Register B (TCCRnB). When enabled the noise canceler introduces addi- 
  // tional four system clock cycles of delay from a change applied to the input, to the update of the 
  // ICRn Register. The noise canceler uses the system clock and is therefore not affected by the 
  // prescaler.
  TCCR1B |= (1<<ICNC1) ;                // Enable Noise Canceler 
#endif

but I think this is dead code as LN_INIT_COMPARATOR is always defined. I have no idea if the noise canceler is something one wants to enable or not but if it should be, that should be in the macro!

kiwi64ajs commented 3 years ago

Hi Harald,

Without looking hard at your code, I expect the AVR is just getting too busy with interrupts to handle the bit-bashed LocoNet as well as generating the DCC bit-stream.

My plan for many years has been to move to a LocoNet2 library that uses the hardware UARTs to relieve the CPU load and also support newer chipsets that have nice features for detecting a busy LocoNet to better avoid collisions.

Hope that helps.

Alex Shepherd

On 29/03/2021, at 1:56 AM, habazut @.***> wrote:

Does the software UART need to be adjusted somehow? Is there another version I should try? Any tips on how to debug this? (I am more fluent in software, but I own a scope. I just don't know how to attach a probe, so to say, to the sw-uart to compare its timing with the bits on the LocoNet.)

habazut commented 3 years ago

Thanks for the answer. Yes, the AVR is at the same time generating a DCC signal (DCC++EX), and yes, I think that I see missed and late interrupts. I have also been thinking about using the HW UARTs (of which the Mega has 3 extra). So far I have had success in routing the receive line through a UART, but that does not give me an interrupt for collision handling. It looks like the standard Serial library does not let me attach an interrupt to do that. Just today I found https://github.com/SlashDevin/NeoHWSerial which might help to attach an interrupt to handle the backoff when necessary.

My plan for many years has been to move to a LocoNet2 library that uses the hardware UARTs to relieve the CPU load and also support newer chipsets that have nice features for detecting a busy LocoNet to better avoid collisions.

That would be great. I don't know if I can help (I can program, but my understanding of AVR is in its infancy).

Regards, Harald.

PS: Is there any git magic that could give me Linux line endings and the Windows folks their line endings? I think I heard that there is such a thing, but I don't remember exactly what needs to be done to the repo.

kiwi64ajs commented 3 years ago

Hi Harald,

I checked NeoHWSerial, and it just lets you add a callback during the interrupt handling to do something with the newly received char. That is still after the RX is complete.

I was wondering about configuring a PinChange interrupt on the RX UART pin and see if it can handle both UART RX and PCI at the same time to detect the Start-Bit edge.

My plan for many years has been to move to a LocoNet2 library that uses the hardware UARTs to relieve the CPU load and also support newer chipsets that have nice features for detecting a busy LocoNet to better avoid collisions.

That would be great. I don't know if I can help (I can program, but my understanding of AVR is in its infancy).

Trouble is people keep adding to the old LocoNet library and the LocoNet2 library doesn’t get any progress.

Alex

habazut commented 3 years ago

True, once RX is complete it's too late. If a pin change interrupt cannot be attached to the RX pin, it should be possible to use 2 pins: one UART RX pin and one for an interrupt at the start bit, which prevents a transmission from starting. If we send with the UART as well, the RX code should hear the bytes transmitted by TX and can then compare RX and TX to determine whether there was a collision. (Just some brainstorming here.)

Trouble is people keep adding to the old LocoNet library and the LocoNet2 library doesn’t get any progress.

I did not even know LocoNet2 existed, and the main/master branch is empty, so it's difficult to find. So what is the status of LocoNet2, and should I rather try to use that?

Harald.

habazut commented 3 years ago

On the receive end I am thinking along these lines:

Have an interrupt that sets receiving=true as soon as something happens on the LocoNet which is not from us (transmitting=false). Then high level byte receive after UART:

lnMsg* LocoNetClass::receive(void)
{
  // Drain every byte the hardware UART has buffered into the LocoNet buffer
  while (Serial3.available()) {
    uint8_t c = Serial3.read();
    addByteLnBuf(&LnBuffer, c);
  }
  lnMsg *rec = recvLnMsg(&LnBuffer);  // check for complete message
  if (rec) { receiving = false; }     // if complete we are done
  return rec;
}

On the write side I'm thinking about

txBuffree = Serial3.availableForWrite();
transmitting=true;
Serial3.write(txBuf, txBufLen);

and then when Serial3.availableForWrite() is back at the value of txBuffree, all the bytes are out and one can

  1. set transmitting to false
  2. compare if we got back what we just sent.

For that I think separate rx and tx buffers would be practical. The Mega has enough RAM for that (optimize later). If equal, the tx was successful. If not, figure out the backoff or so. Am I thinking in the right direction?
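The echo comparison described above could be sketched roughly as follows. This is only an illustration with made-up names (`TxResult`, `checkEcho`), and it assumes the transmitted bytes are heard back on the RX line because TX and RX share the LocoNet. One caveat: on the Arduino cores, `availableForWrite()` reports free space in the software TX buffer, so the last byte may still be leaving the UART when that value returns to its starting point. The sketch therefore distinguishes "not all bytes echoed yet" from a real mismatch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical result type for the echo check.
enum class TxResult { Ok, Incomplete, Collision };

// Compare what we sent (tx) with what the receiver has heard so far (rx).
TxResult checkEcho(const std::uint8_t *tx, std::size_t txLen,
                   const std::uint8_t *rx, std::size_t rxLen) {
    // Any mismatch in the bytes heard so far means another device
    // was driving the bus at the same time.
    const std::size_t n = rxLen < txLen ? rxLen : txLen;
    if (std::memcmp(tx, rx, n) != 0) return TxResult::Collision;
    // Everything heard so far matches, but the echo is not complete yet.
    if (rxLen < txLen) return TxResult::Incomplete;
    return TxResult::Ok;
}
```

Checking after every received byte rather than once at the end would let the sender stop early on a collision instead of finishing a frame that is already corrupt.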

Harald.

kiwi64ajs commented 3 years ago

Yeah, that is the right kind of logic, except it all needs to be done in an interrupt handler to handle all the errors, back-off timings, priority delays and retries properly, which is probably 1/4 of the library code... ;)

Regards

Alex Shepherd

habazut commented 3 years ago

So where do we need the interrupts?

  1. At start of receive to block transmission (we do not want to transmit into an incoming char unless it's our own).
  2. At end of receive to read each char and, if we are transmitting, to compare for collision detection. At that time the char should go into the rec queue as well.
  3. For state changes and Retransmits?

I am for example not sure what should happen if code wants to transmit but the former transmit is still in the queue because it's been busy all the time. I should try to draw that state machine.
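A first attempt at that state machine might look like the sketch below. It is a host-side model only, with made-up names (`LnState`, `LnArbiter`), and it makes several assumptions that are not from the library: a single pending frame, a `requestTx()` that simply refuses a second frame while one is queued, and one fixed back-off of `kBackoffBits` bit times as a placeholder. The real carrier-detect, master-delay and priority timings, and the break that is sent after a collision, would have to come from the LocoNet specification.

```cpp
#include <cstdint>

enum class LnState : std::uint8_t { Idle, Receiving, Backoff, Transmitting };

struct LnArbiter {
    // Placeholder quiet period in bit times (assumption, not a spec value).
    static constexpr std::uint8_t kBackoffBits = 20;

    LnState state = LnState::Idle;
    std::uint8_t backoffLeft = 0;
    bool txPending = false;

    // Application wants to send. One possible policy: refuse while the
    // previous frame is still waiting, so the caller has to retry later.
    bool requestTx() {
        if (txPending) return false;
        txPending = true;
        return true;
    }

    // Interrupt 1: start-bit edge on the bus that is not our own byte.
    void onStartBit() {
        if (state != LnState::Transmitting) state = LnState::Receiving;
    }

    // Interrupt 2: a complete byte arrived from the UART.
    void onByte(bool echoMismatch) {
        if (state == LnState::Transmitting && !echoMismatch) return;  // own echo
        // Foreign byte or collision: restart the quiet period.
        // txPending stays set, so a collided frame is retried.
        enterBackoff();
    }

    // Our frame went out and was echoed back intact.
    void onTxDone() {
        txPending = false;
        enterBackoff();
    }

    // Interrupt 3: timer tick once per bit time.
    // Returns true when the caller should start transmitting now.
    bool onBitTick() {
        if (state == LnState::Backoff) {
            if (backoffLeft > 0) --backoffLeft;
            if (backoffLeft == 0) state = LnState::Idle;
        }
        if (state == LnState::Idle && txPending) {
            state = LnState::Transmitting;
            return true;
        }
        return false;
    }

private:
    void enterBackoff() {
        state = LnState::Backoff;
        backoffLeft = kBackoffBits;
    }
};
```

The three handlers correspond to the three places listed above where an interrupt is needed. Whether a refused `requestTx()` should instead queue the frame is exactly the open design question.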

Harald.

kiwi64ajs commented 3 years ago

Harald, this is an example of what others have done to support the ESP32 Hardware UART:

https://github.com/positron96/LocoNet2/blob/development/src/LocoNetESP32UART.cpp

There are some timers, interrupt handlers, a state machine and various error/timeout handling.

You can’t do this stuff in the main loop() function - that’s too far removed from the precise timing required for LocoNet TX/Rx with Collision Sensing, Prioritised TX and proper Error/Timeout handling.

It's also a lot of work to add this to the original codebase, as it's already got too many #ifdefs to handle the various additions over the years. It's time to move to a better structure that can add new functionality without the hackery required now.

HTH

Alex
