mcci-catena / arduino-lmic

LoraWAN-MAC-in-C library, adapted to run under the Arduino environment
https://forum.mcci.io/c/device-software/arduino-lmic/
MIT License

EV_TXCOMPLETE in unsupervised transmission #928

Open kalwinskidawid opened 1 year ago

kalwinskidawid commented 1 year ago

Hi, I have a rather simple question. If I send an unconfirmed uplink, should I get an EV_TXCOMPLETE after the message has been sent to the server? At the moment, when I queue an uplink with LMIC_setTxData2, I only get the EV_TXSTART event from the module, and EV_TXCOMPLETE never arrives.

Is it possible to power down the LoRa module while the board is asleep (for about 15 min)?

Board: Arduino Uno + M5Stack board with LoRa module (SX1276)
Region: EU868

terrillmoore commented 1 year ago

If you're having problems with sleep, please check the discussion on #926. We recently found a problem having to do with internal state and sleeping. The problem will be worse on EU868, because regulatory requirements force it to do exactly the kind of thing that triggers the problem (setting a re-transmit time in the future).

kalwinskidawid commented 1 year ago

Thanks @terrillmoore! I haven't noticed a problem with sleep so far. Right now I use Narcoleptic.delay(<X ms>) to put the microcontroller to sleep, and after that I restart the program so it starts again from address 0. Is it currently possible to put the LoRa module to sleep, or does it automatically go into a deep sleep state after sending a message to the server?

By the way, do you know if I should get an EV_TXCOMPLETE after sending an unconfirmed uplink?

terrillmoore commented 1 year ago

You should always get an EV_TXCOMPLETE, provided you are polling the LMIC (calling the run routine from your loop() function, directly or indirectly).

The LMIC automatically puts the SX1276 etc in the lowest power state after transmission completes.

BTW, you need to ensure that the DIO pin that signals TX complete is wired up. Depending on your board, this is not a given.

kalwinskidawid commented 1 year ago

I checked DIO[0-2]. The connections between the Arduino and the M5 board look okay. My config:

const lmic_pinmap lmic_pins = {
    .nss = 10,
    .rxtx = LMIC_UNUSED_PIN,
    .rst = LMIC_UNUSED_PIN,
    .dio = {2, 3, 4},  // DIO0, DIO1, DIO2
};

So if I queue the uplink in setup(), and loop() runs only once (it just sleeps and then resets the program, without sending anything), then I won't get EV_TXCOMPLETE?

terrillmoore commented 1 year ago

You must call the run loop continuously until you get EV_TXCOMPLETE. (Actually, until the LMIC is idle, which may take longer.) There's an API that tells you whether it's safe to sleep for a given period of time (you tell it how long you want to sleep). If it returns false, you should continue to call the run loop. Once it returns true, it's safe to skip calling the run loop for the interval specified (but see #926 for a workaround that's currently required).
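The polling pattern described above can be sketched as a small, hardware-independent helper. This is only an illustration: `pollUntilSafeToSleep`, `PollResult`, and both callables are hypothetical names, not part of the LMIC. On a device, `runOnce` would wrap `os_runloop_once()`, and `safeToSleep` would wrap the "safe to sleep for this long" query, which the thread does not name. One candidate in the MCCI LMIC is `os_queryTimeCriticalJobs()`; check its return sense in the library README before wrapping it.

```cpp
#include <cassert>
#include <cstdint>

// Outcome of one polling session.
struct PollResult {
    bool safeToSleep;     // true if the predicate reported that sleeping is safe
    uint32_t iterations;  // how many times the run loop was called
};

// Calls runOnce() repeatedly until safeToSleep() returns true or until
// maxIterations calls have been made. Both callables are hypothetical hooks
// supplied by the caller; nothing here touches the radio directly.
template <typename RunOnce, typename SafeToSleep>
PollResult pollUntilSafeToSleep(RunOnce runOnce, SafeToSleep safeToSleep,
                                uint32_t maxIterations) {
    assert(maxIterations > 0);
    PollResult result{false, 0};
    while (result.iterations < maxIterations) {
        runOnce();                 // let the stack process pending work
        ++result.iterations;
        if (safeToSleep()) {       // stack reports no work due in the interval
            result.safeToSleep = true;
            break;
        }
    }
    return result;
}
```

The iteration cap is a design choice for the sketch: it keeps a misbehaving predicate from spinning forever, and the caller can decide what to do if `safeToSleep` never became true.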

kalwinskidawid commented 1 year ago

Okay, thanks! I'll check it today. Could you tell me whether there is a function that reports the state of the radio and the current LMIC state?

kalwinskidawid commented 1 year ago

Thank you so much @terrillmoore. I added os_runloop_once() and now I get TXCOMPLETE, but now I have a bigger problem. I measured the time between the send and TXCOMPLETE, and it comes out at about 2.7 s, which is a very poor result, considering that I am sending only 24 bytes. Is this normal?

terrillmoore commented 1 year ago

Is this normal? Yes. For an unconfirmed uplink with "number of repeats" == 1, TXCOMPLETE happens after (0) any pre-transmit delay has elapsed, (1) the uplink is complete, and either (2a) the RX1 window occurs with a successful downlink, or (2b) the RX1 window times out and the RX2 window completes (with or without data). The timing of RX1 and RX2 depends on the network; the minimum is one second after the uplink completes. So 2.7 seconds looks like some pre-transmit delay, plus uplink time, plus RX1 (one second from end of uplink), plus RX2 (two seconds from end of uplink). Totally believable. If using TTN in the US, the default is actually more like 5 seconds, so the total will be 10+ seconds, minimum.
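The timing breakdown above can be written out as simple arithmetic. The following is a rough sketch, not LMIC code: `UplinkTiming` and `msUntilTxCompleteNoDownlink` are hypothetical names, and the numbers in the usage note are made up for illustration. It models only case (2b), where no downlink arrives and the RX2 window has to finish before TXCOMPLETE.

```cpp
#include <cassert>
#include <cstdint>

// All fields are illustrative inputs, in milliseconds.
struct UplinkTiming {
    uint32_t preTxDelayMs;  // any delay before the uplink starts
    uint32_t airtimeMs;     // time on air of the uplink itself
    uint32_t rx1DelayMs;    // end of uplink -> RX1 opens (set by the network)
    uint32_t rx2DelayMs;    // end of uplink -> RX2 opens (set by the network)
    uint32_t rx2WindowMs;   // how long RX2 listens before giving up
};

// Estimated time from queuing the uplink to TXCOMPLETE when no downlink
// arrives: RX1 times out, so the device must also sit through RX2.
uint32_t msUntilTxCompleteNoDownlink(const UplinkTiming& t) {
    assert(t.rx2DelayMs >= t.rx1DelayMs);  // RX2 always opens after RX1
    return t.preTxDelayMs + t.airtimeMs + t.rx2DelayMs + t.rx2WindowMs;
}
```

For example, with an assumed 500 ms pre-transmit delay, 80 ms of airtime, RX1 at 1000 ms, RX2 at 2000 ms, and a 100 ms RX2 listen time, the estimate is 2680 ms, which is in the same range as the 2.7 s measured in the thread.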

kalwinskidawid commented 1 year ago

I'm using EU868 and I thought the transmission would take less than 500 ms. I have a feeling that I'm missing something: why do we wait for RX1 / RX2 when we send an unconfirmed uplink? Before I fixed my TXCOMPLETE problem, my program went to sleep right after LMIC_setTxData2, which took much less time than it does now, and the gateway still received the packets. But I'm not sure whether that would be a good solution.

terrillmoore commented 1 year ago

"Because it's LoRaWAN". Class A devices are required to open a receive downlink window during RX1 and RX2, in case the network wants to send control messages. This is not optional, and TTN and other networks require that.

Also, because of duty-cycle management in EU868, the time can be even longer -- you have to have an available channel; and SF12 messages can be very slow indeed to transmit in the uplink direction. This is not as much of an issue in other regions.
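To see how slow SF12 is compared with SF7, here is a minimal sketch of the LoRa time-on-air formula from the Semtech SX127x datasheet. The function name `loraTimeOnAirMs` is hypothetical. The settings are assumptions chosen to resemble a typical LoRaWAN uplink: explicit header, payload CRC on, coding rate 4/5, 8 preamble symbols, and low-data-rate optimisation whenever the symbol time exceeds 16 ms.

```cpp
#include <cassert>
#include <cmath>

// LoRa time on air in milliseconds for a PHY payload of the given size.
// Assumed settings (not taken from the thread): explicit header, CRC on,
// coding rate 4/5, 8 preamble symbols.
double loraTimeOnAirMs(int sf, double bandwidthHz, int phyPayloadBytes) {
    assert(sf >= 7 && sf <= 12);
    assert(bandwidthHz > 0.0 && phyPayloadBytes >= 0);

    const int crc = 1;             // payload CRC enabled
    const int implicitHeader = 0;  // explicit header mode
    const int codingRate = 1;      // 1 means 4/5
    const int preambleSymbols = 8;

    const double symbolMs = std::pow(2.0, sf) / bandwidthHz * 1000.0;
    // Low-data-rate optimisation applies when a symbol lasts longer than 16 ms.
    const int lowDataRateOpt = (symbolMs > 16.0) ? 1 : 0;

    const double numerator = 8.0 * phyPayloadBytes - 4.0 * sf + 28.0
                             + 16.0 * crc - 20.0 * implicitHeader;
    const double denominator = 4.0 * (sf - 2 * lowDataRateOpt);
    double payloadSymbols = std::ceil(numerator / denominator) * (codingRate + 4);
    if (payloadSymbols < 0.0) payloadSymbols = 0.0;
    payloadSymbols += 8.0;

    return (preambleSymbols + 4.25 + payloadSymbols) * symbolMs;
}
```

With a 125 kHz channel and a 36-byte PHY payload (an illustrative size), this gives about 77 ms at SF7 and about 1.97 s at SF12, so the same message occupies the channel roughly 25 times longer at SF12.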

To complicate matters further: many downlink MAC messages require an acknowledgement. The LMIC will send the ack right away, but this is also a class A uplink, and so on the first message after a JOIN (or the message that causes a JOIN), you may find that you're exchanging several messages with the network, when you thought you were only sending one. You really have to use the LMIC APIs to find out whether it's busy before you sleep, and you have to let it do its thing; otherwise you'll have problems. And see #926, which is a further complication.

kalwinskidawid commented 1 year ago

Thanks @terrillmoore! I get that class A works the way it works. It seemed odd to me because I thought the device just sent unconfirmed data and didn't wait for any response from the server, but I was wrong. So in my case, when I send unconfirmed data and don't expect the server to respond to my message, can I skip waiting for RX1 / RX2? They just extend my operating time for no reason, and skipping them would prolong the sensor's operation, because it would be in the active state for a shorter time. LMIC_setTxData2 works synchronously, so after calling it I can put the whole Arduino into deep sleep, and I will save more energy than if I wait another 1-1.5 s until the exchange is properly completed with TXCOMPLETE, right?

terrillmoore commented 1 year ago

Short answer: don't do that. The battery saving is minimal (really -- I've calculated it out -- the power is dominated by TX on time on any microprocessor, by a factor of 20 to 1; and you only do this after a TX).

The social reason not to do it is that you make the network unhappy and work worse for everyone else, especially in EU. Any LoRaWAN network will try to send you parameters from time to time; and if you don't ack, it will simply retry for quite a while. This will consume downlink airtime; and since downlink shares channels with uplink in EU, this will interfere with others sending data. In addition, the gateway uses up its available duty cycle trying to talk to you, meaning it can't send messages to others who are being good citizens, trying to join, whatever.

The practical reason is that the LMIC will fail in odd and obscure ways; this is not a supported approach, and because of the LMIC's design, it's not possible to change. It will do this at random times, manifesting as strange uplink problems. The LMIC is not a clean FSM with easy-to-analyze behavior; after five years of working with it, it continues to surprise. It's not bad; but its behavior is focused on a specific use case.

So my recommendation is: either live with LoRaWAN behavior, or switch to a different library that will support your use case, because the LMIC has this assumption hard-wired throughout.

kalwinskidawid commented 1 year ago

Thank you for such a detailed answer! I have two more questions, maybe stupid ones, but I can't figure them out. With SF7 and the antenna within ~10 meters, the whole exchange including EV_TXCOMPLETE takes about 4.5 s or 2.5 s (RSSI -39, payload length: 23 bytes plus probably a CRC, which is added by default). That is also odd: one time it took around 4 s, and after rebuilding the app without any changes it took 2 s. But when I change to SF12, it suddenly takes almost 167 s to send the message. I checked air-time calculators and they all gave at most 2 s, not nearly 3 minutes. What could be the reason?

How do I get debug information? I set LMIC_DEBUG_LEVEL but I don't get any additional output. On the Arduino I use Serial.print.

#define LMIC_DEBUG_LEVEL 2
#define LMIC_PRINTF_TO Serial

terrillmoore commented 1 year ago

This is most likely due to the EU868 duty cycle; you are sending such long messages that you must wait after each one before sending the next. SF7 messages are very short (relative to these calculations).
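A back-of-the-envelope version of that duty-cycle wait is sketched below. `dutyCycleOffTimeMs` is a hypothetical helper, and the model is a simple one: after each transmission the device stays silent long enough that its on-air fraction does not exceed the limit. The 1% figure used in the usage note is an assumption (EU868 sub-bands have different limits), and the LMIC's own scheduling may differ in detail.

```cpp
#include <cassert>
#include <cstdint>

// Minimum silent time after one transmission so that the on-air fraction
// stays at or below dutyCyclePercent. Simplified model, not LMIC's code.
uint32_t dutyCycleOffTimeMs(uint32_t airtimeMs, double dutyCyclePercent) {
    assert(dutyCyclePercent > 0.0 && dutyCyclePercent <= 100.0);
    const double offTimeMs = airtimeMs * (100.0 / dutyCyclePercent - 1.0);
    return static_cast<uint32_t>(offTimeMs + 0.5);  // round to nearest ms
}
```

Under a 1% limit, a message with 2 s of airtime needs 198 s of silence before the next one, which is the same order of magnitude as the almost 167 s reported at SF12. A 77 ms message needs only about 7.6 s.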

The debug prints are a legacy of the original code. They ruin timing and so they should only be used if debugging code inside the LMIC (i.e., trying to add features). They are not intended for use by library users. The LMIC really needs a proper logging system so users could understand what's going on, but that requires restructuring it so that the LMIC itself has an idea of what's going on. Maybe in V5. I tried adding class C and had to suspend the effort because there are too many flags and odd state variables; although I am sure I could debug it, the time would be better spent making an explicit finite-state machine. The code would be smaller, too.

kalwinskidawid commented 1 year ago

You are probably right @terrillmoore. The message is quite long, but if I connect four sensors to the device, the message has to be long because I need to include information about each reading. I still wonder about this part: "the social reason not to do it is that you make the network unhappy and work worse for everyone else, especially in the EU". After all, a Class A device sends its uplink and then waits through the two RX windows (which I don't care about anyway), so I won't get any reply either way, and keeping all the sensors powered for 4 seconds instead of 0.5 s makes a big difference.