mcci-catena / arduino-lmic

LoraWAN-MAC-in-C library, adapted to run under the Arduino environment
https://forum.mcci.io/c/device-software/arduino-lmic/
MIT License

What should I do after EV_LINK_DEAD #872

Closed pierrot10 closed 2 years ago

pierrot10 commented 2 years ago

Good evening,

I have been using the LMIC library for a while. Recently, I observed that my board is looping just after the data has been sent. I log the activity to an SD card and observed the following:

237164437: EV_TXCOMPLETE (includes waiting for RX windows)
237172653: EV_LINK_DEAD

The code continues, but the next time it sends data, nothing else happens. It looks like no event is raised, or my script freezes.

What is the reason for EV_LINK_DEAD, and what does it mean?

And what should I do after an EV_LINK_DEAD? Should I do this?

os_init();
 // Reset the MAC state. Session and pending data transfers will be discarded.
LMIC_reset();

If yes, should I add the following (in void onEvent (ev_t ev))?

case EV_LINK_DEAD:
      Si.sprintln(F("EV_LINK_DEAD"),2);
      os_init();
      LMIC_reset();
      break;

Or should I simply reset my board (in software) so that it restarts and runs the setup() function again, as if I had just started it?

I am using a board based on the Adafruit M0. I am using LMIC 1.5 (I have another question about that version below). I am using The Things Network. The region is EU868 (Switzerland).

Finally, I have an extra question. I am still using the old LMIC library, but I think it's time to move to the mcci-catena library.

Can I simply remove the old LMIC library from my libraries folder, add the mcci-catena LMIC, and expect my script to work as before? Or are there some adaptations I have to make in my script? What are the major differences?

Many thanks for your help and advice. Cheers

terrillmoore commented 2 years ago

Hi @pierrot10,

EV_LINK_DEAD is related to link integrity checking. It means (generally) that the LMIC sent some number of uplinks without getting any downlinks.

If I recall correctly, it was somewhat buggy in the original LMIC; this was one of the areas that got modified substantially during compliance testing. In the current (v4.1) LMIC, it means,

If all the above conditions apply, EV_LINK_DEAD is sent.

Nothing else happens; the LMIC continues to handle uplinks. However, the LMIC keeps counting uplinks. After LINK_CHECK_UNJOIN unacknowledged uplinks (752) at the lowest permitted data rate, the LMIC will force a rejoin.

All of this can be disabled by calling LMIC_setLinkCheckMode(0).
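A minimal sketch of where that call usually goes, assuming an OTAA device. The library's OTAA sample sketch notes that link check is enabled again automatically during a join, so it repeats LMIC_setLinkCheckMode(0) in the EV_JOINED handler. The stub below only records calls so that the example compiles off-target; setupLmic and linkCheckCalls are hypothetical names, and on the board you would drop the stand-ins and include <lmic.h>.

```cpp
#include <cassert>
#include <vector>

// --- Stand-ins for the LMIC API (assumption: desktop build, no radio). ---
// On the board, delete this section and #include <lmic.h> instead.
enum ev_t { EV_JOINING, EV_JOINED, EV_TXCOMPLETE };  // values are arbitrary here
static std::vector<int> linkCheckCalls;              // records each requested mode
void LMIC_setLinkCheckMode(int enabled) { linkCheckCalls.push_back(enabled); }

// --- Application code ---
void setupLmic() {
    // On the board, os_init() and LMIC_reset() would come first.
    LMIC_setLinkCheckMode(0);  // off for the initial, pre-join state
}

void onEvent(ev_t ev) {
    switch (ev) {
    case EV_JOINED:
        // A join switches link check back on, so switch it off again.
        LMIC_setLinkCheckMode(0);
        break;
    default:
        break;
    }
}
```

If the call appears only in setup(), link check may still be active after the first successful join, which would explain seeing EV_LINK_DEAD even though the sketch "disables" it.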

Again, all this is referencing the current code. I really can't say exactly what happens with other versions of the LMIC; I changed this area quite a bit to align it with the spec. The old LMIC had a non-spec-compliant "rejoin" which would send a "rejoin" command -- not part of LoRaWAN 1.0.0..1.0.3; and basically things got a little weird. I don't think that path was ever fully tested.

So what should you do? You can just ignore it; or you can decide that you want to rejoin. In a very low-power device that's very remote, you might want to go into some kind of defensive power management mode, on the assumption that something has gone wrong. The LMIC (still, sigh) is much too aggressive about transmitting on joins, so that will drain the battery quickly; so rejoining might not be what you want to do. Depends on your use case.
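One way to picture the "defensive power management" option is a small policy object that stretches the sleep interval while the link looks dead. This is only a sketch under assumptions: PowerPolicy, its field names, and the 240-minute back-off are hypothetical, the enum is a desktop stand-in for the real ev_t, and it assumes the LMIC reports EV_LINK_ALIVE once a downlink is received again.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the LMIC event type (assumption: desktop build).
// On the board, use ev_t and the EV_* constants from <lmic.h>.
enum ev_t { EV_TXCOMPLETE, EV_LINK_DEAD, EV_LINK_ALIVE };

// Hypothetical application-side policy: sleep longer while the link is suspect.
struct PowerPolicy {
    uint32_t normalSleepMin = 15;      // normal interval between uplinks
    uint32_t defensiveSleepMin = 240;  // assumed back-off value, tune per device
    bool linkSuspect = false;

    void onEvent(ev_t ev) {
        if (ev == EV_LINK_DEAD)  linkSuspect = true;   // network stopped answering
        if (ev == EV_LINK_ALIVE) linkSuspect = false;  // a downlink arrived again
    }

    uint32_t sleepMinutes() const {
        return linkSuspect ? defensiveSleepMin : normalSleepMin;
    }
};
```

The application would forward events from its onEvent() handler to the policy and ask sleepMinutes() before going to sleep; nothing here changes what the LMIC itself does.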

The v4.1 LMIC should just drop in; you may have some behavior changes (that's why I changed the major version) but the APIs are basically the same. You should be careful to check for errors after uplinks; the sample sketches are not as careful as they should be. The documentation in the doc directory is basically up to date.
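Checking for errors after an uplink could look roughly like this, assuming the v4.x signature in which LMIC_setTxData2 returns an error code with LMIC_ERROR_SUCCESS meaning "queued". The stub, the fakeRadioBusy knob, and the queueUplink helper are hypothetical and exist only so that the sketch compiles on a desktop.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// --- Stand-ins for the LMIC API (assumption: desktop build). ---
// On the board, delete this section and #include <lmic.h> instead.
typedef int lmic_tx_error_t;
enum { LMIC_ERROR_SUCCESS = 0, LMIC_ERROR_TX_BUSY = -1 };
static bool fakeRadioBusy = false;  // test knob, not part of the LMIC
lmic_tx_error_t LMIC_setTxData2(uint8_t port, uint8_t *data, uint8_t dlen, uint8_t confirmed) {
    (void)port; (void)data; (void)dlen; (void)confirmed;
    return fakeRadioBusy ? LMIC_ERROR_TX_BUSY : LMIC_ERROR_SUCCESS;
}

// --- Application code ---
// Queue an unconfirmed uplink on port 1 and report whether it was accepted.
bool queueUplink(uint8_t *payload, uint8_t len) {
    lmic_tx_error_t rc = LMIC_setTxData2(1, payload, len, 0);
    if (rc != LMIC_ERROR_SUCCESS) {
        std::printf("uplink rejected, error %d\n", rc);
        return false;  // do not wait for EV_TXCOMPLETE in this case
    }
    return true;
}
```

A sketch that sleeps until EV_TXCOMPLETE should only do so when the uplink was actually queued; otherwise it can wait for an event that never arrives.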

Best regards, --Terry

pierrot10 commented 2 years ago

Dear @terrillmoore

Many thanks for your nice reply. I appreciate it.

So what should you do? You can just ignore it; or you can decide that you want to rejoin

I checked, and my sketch already calls LMIC_setLinkCheckMode(0);, so link check is already disabled :( So I should just ignore the event, is that right? (I am a bit afraid to replace the 1.5 library with 4.1 because I want to move the node to the field today, but I will definitely do it later, with a test device.)

So what do you mean by ignoring it?

If I check my log, I have:

58: SENDING DATA
Packet queued after: 8212ms
237164437: EV_TXCOMPLETE (includes waiting for RX windows)
237172653: EV_LINK_DEAD

Then my device goes to sleep for 15 minutes. The device wakes up, takes the measurement and sends it to TTN. I can see in my log:

59: SENDING DATA
Packet queued after: 8226ms

Just after printing 59: SENDING DATA, I call this:

LMIC_setTxData2(1, payload, strlen((char*)payload), 0); // It lets me send only the value without the '00 00 00'

Then nothing else happens. I do not know whether my device freezes or is still waiting for an event such as EV_TXCOMPLETE from onEvent.

You mentioned I could do a rejoin, which sounds like a good idea, but how can I do it without restarting my device?

Is there a function to rejoin that I could add here?

case EV_LINK_DEAD:
            Serial.println(F("EV_LINK_DEAD"));
            break;

The v4.1 LMIC should just drop in; you may have some behavior change

Sure, I will. But for today, I would prefer to fix the issue by rejoining or by disabling link check. In the next few days, I will use a device that stays at my place to test the migration to v4.1, which may solve the EV_LINK_DEAD issue.

That's funny, because it looks like I have had this issue since I migrated my nodes to TTN v3.

Many thanks for your help!!!!

terrillmoore commented 2 years ago

That's funny, because it looks like I have had this issue since I migrated my nodes to TTN v3.

Probably the root problem is that the older LMIC version doesn't really handle the MAC downlinks to change the RX window correctly. I think you are not receiving from the network at all on V3. Try printing out LMIC.rxdelay after a join. If it's 0 or 1, then you've got a broken LMIC version. The easy fix is to use the advanced features of the TTN console to change your rx1 window delay to 1 (default is 5). If you don't know how to do that, you can get help from the TTN forum users, I think. (Or google it...)
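The check described above could be sketched as follows. The struct is a desktop stand-in for the real LMIC state; the field is assumed to be spelled rxDelay in lmic.h, so check the spelling in your copy of the library. The helper names rxDelayWasUpdated and reportRxDelay are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Stand-in for the one LMIC field used here (assumption: desktop build).
// On the board, use the global LMIC from <lmic.h>; the field name rxDelay
// is an assumption and should be checked against your library version.
struct lmic_t { uint8_t rxDelay; };
static lmic_t LMIC = { 1 };

// With the network default of 5 seconds, a value of 0 or 1 after a join
// suggests that the network's RX delay setting was not applied.
bool rxDelayWasUpdated(uint8_t rxDelay) { return rxDelay > 1; }

// Call this from the EV_JOINED case of onEvent().
void reportRxDelay() {
    std::printf("rxDelay after join: %u%s\n",
                (unsigned)LMIC.rxDelay,
                rxDelayWasUpdated(LMIC.rxDelay) ? "" : " (network setting probably not applied)");
}
```

Once the RX1 delay in the TTN console has been changed to 1, a printed value of 1 is expected and no longer indicates a problem.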

pierrot10 commented 2 years ago

Dear Terry, thanks again for your reply. May I ask you one last question: how can I do a rejoin? Is there a function for that?

I wish you a nice week-end

terrillmoore commented 2 years ago

Hi @pierrot10,

Forcing a rejoin in the older LMIC may be a problem, now that I think about it. The easy way to do it, I think, is LMIC_reset() -- you have to repeat any other LMIC initialization you do during startup.

The thing that is likely to work with any version of the LMIC is: LMIC.devaddr = 0;. The next time you transmit, the LMIC will notice that devaddr is zero, and will automatically, and cleanly, trigger a join. Other approaches (LMIC_tryRejoin(), etc.) are problematic because of bugs. Maybe I'll get the time to do version 5....
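As a rough sketch of the devaddr approach, the EV_LINK_DEAD case of the event handler could clear the address so that the next transmit starts a join. The enum and struct are desktop stand-ins for the real LMIC types, and the initial address is a made-up value.

```cpp
#include <cassert>
#include <cstdint>

// --- Stand-ins for the LMIC API (assumption: desktop build). ---
// On the board, delete this section and #include <lmic.h> instead.
enum ev_t { EV_TXCOMPLETE, EV_LINK_DEAD };  // values are arbitrary here
struct lmic_t { uint32_t devaddr; };
static lmic_t LMIC = { 0x26011234 };        // made-up session address

// --- Application event handler ---
void onEvent(ev_t ev) {
    switch (ev) {
    case EV_LINK_DEAD:
        // Forget the session address; the next transmit should trigger a join.
        LMIC.devaddr = 0;
        break;
    default:
        break;
    }
}
```

Nothing is transmitted at this point; the join happens only when the application queues its next uplink.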

Best regards, --Terry

pierrot10 commented 2 years ago

Dear Terry,

Many thanks. I upgraded to 4.1 and it looks like it is working fine. At least the data are sent :) The only difference I see is an unknown event:

1391692: EV_JOINING
1725722: Unknown event
2050388: EV_JOINED
netid: 19
devaddr: 2600000
AppSKey: 00-00-00-00-00-00-00-00-00-00-00-00-00-4E-91-B2
NwkSKey: 00-00-00-00-00-00-00-00-00-00-00-00-00-72-75-5E
2054347: Unknown event
2376971: EV_TXCOMPLETE (includes waiting for RX windows)

I do not know whether I should be worried.

I will leave the node running overnight.

Many thanks

terrillmoore commented 2 years ago

Hi @pierrot10 ,

The unknown event is due to a new event being added in v3 or so. No need to be concerned.
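To find out which event it is, the default case of the handler can print the numeric code, which can then be looked up in the event list in lmic.h. A minimal sketch, with a stand-in enum whose values are arbitrary and a hypothetical describeEvent helper:

```cpp
#include <cassert>
#include <string>

// Stand-in for the LMIC event type (assumption: desktop build).
// The numeric values here are arbitrary; the real ones are in <lmic.h>.
enum ev_t { EV_JOINING = 1, EV_JOINED = 2, EV_TXCOMPLETE = 3 };

// Return a log line for an event, including the number for unknown ones.
std::string describeEvent(unsigned ev) {
    switch (ev) {
    case EV_JOINING:    return "EV_JOINING";
    case EV_JOINED:     return "EV_JOINED";
    case EV_TXCOMPLETE: return "EV_TXCOMPLETE (includes waiting for RX windows)";
    default:            return "Unknown event: " + std::to_string(ev);
    }
}
```

With the number in the log, a new case can be added for the event once it has been identified.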

Best regards, --Terry

pierrot10 commented 2 years ago

Thank you Terry