embassy-rs / embassy

Modern embedded framework, using Rust and async.
https://embassy.dev
Apache License 2.0
lora_lorawan.rs : can not join #1408

Closed davekingdoms closed 1 year ago

davekingdoms commented 1 year ago

Hi all,

I'm trying to run the lora_lorawan.rs example on a nucleo-wl55jc. For joining I use a while let statement, but device.join(...) always fails and returns an RxTimeout error. On the TTN console I see "Accept join-request" and "Forward join-accept message".

Am I doing something wrong? Maybe I've made a mistake in the TTN setup? Any suggestions?

lulf commented 1 year ago

@davekingdoms There might be some minor tuning needed. I've not run the new lora-phy stack that recently got merged yet, but I know others have been successful. For instance, you may need to configure the spreading factor, and for TTN possibly the rx1 delay.

See the old example here for how to do those things https://github.com/embassy-rs/embassy/blob/fb27594b2eb2cca2aea25dd92a7b730c185b6ecc/examples/stm32wl/src/bin/lorawan.rs#L69-L74

@ceekdee Other things to consider?

ceekdee commented 1 year ago

I assume region, deveui, appeui, and appkey are set appropriately since the join request seems to be getting through. If you were using an 8 channel TTN gateway, I would not expect the join to get through every time since there is an issue in rust-lorawan for such a gateway. But again, it seems that the join is getting through reliably. I did not find in my testing that:

was necessary for TTN, since it seemed to do that automatically, but it is something to try.

Does the TTN gateway log show any response from the gateway to the device from the service? For example, a new power setting in the join accept?

My current assumption is that receive window timings coded in rust-lorawan for the stm32wl may be the issue. But before considering that, could you run the following trace to see if anything else turns up?

for better information display.

to obtain one level of debugging information and paste the result here. Thanks!

ceekdee commented 1 year ago

Also, in my testing deveui had to be specified in the array as lsb, not msb. The TTN console for the end device allows you to get the format for both lsb and msb.

Having reviewed a description of the lsb/msb issue in the LoRaWAN group matrix chat, I now believe this may be the issue if the TTN log does not show a join accept entry. I am not sure about the appeui lsb/msb setting, since in my case it was all zeroes. appkey should be coded as shown in the TTN console.
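As an illustration of the lsb/msb point, here is a minimal sketch that converts an EUI written msb-first (as a 16-digit hex string) into an lsb-first byte array. The function name `eui_msb_hex_to_lsb` and the EUI value used below are made up for the example; neither comes from the embassy example nor from a real device.

```rust
/// Parse an EUI given as 16 hex digits in msb-first order and return
/// the 8 bytes in lsb-first order. Hypothetical helper, std only.
fn eui_msb_hex_to_lsb(hex: &str) -> Option<[u8; 8]> {
    // Reject anything that is not exactly 16 ASCII hex digits.
    if hex.len() != 16 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut bytes = [0u8; 8];
    for (i, byte) in bytes.iter_mut().enumerate() {
        // Each pair of hex digits is one byte, still in msb-first order here.
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
    }
    // Reverse the byte order: msb-first becomes lsb-first.
    bytes.reverse();
    Some(bytes)
}
```

With the made-up EUI "0011223344556677", the function returns the bytes 0x77, 0x66, ..., 0x00. Note that only the byte order is reversed; the bits inside each byte are left alone.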

davekingdoms commented 1 year ago

Thanks for the replies.

@lulf

For instance, you may need to configure the spreading factor, and for TTN possibly the rx1 delay.

See the old example here for how to do those things

https://github.com/embassy-rs/embassy/blob/fb27594b2eb2cca2aea25dd92a7b730c185b6ecc/examples/stm32wl/src/bin/lorawan.rs#L69-L74

My fault. I completely forgot to set the rx1 delay. But unfortunately nothing has changed.

@ceekdee

If you were using an 8 channel TTN gateway, I would not expect the join to get through every time since there is an issue in rust-lorawan for such a gateway.

Yep, it is a RG-186-SentriusGateway and it is an 8 channel TTN gateway.

Does the TTN gateway log show any response from the gateway to the device from the service? For example, a new power setting in the join accept?

Yep, on the gateway console there is a "Send downlink message" (Tx Power 16.15 Data rate SF12BW125).

to obtain one level of debugging information and paste the result here. Thanks!

0.000000 DEBUG rcc: Clocks { sys: Hertz(16000000), apb1: Hertz(16000000), apb1_tim: Hertz(16000000), apb2: Hertz(16000000), apb2_tim: Hertz(16000000), apb3: Hertz(16000000), ahb1: Hertz(16000000), ahb2: Hertz(16000000), ahb3: Hertz(16000000) }

0.002716 DEBUG tx power = 0

0.056732 INFO  Joining LoRaWAN network

0.064727 DEBUG sf = 12, bw = 4, cr = 1

0.065826 DEBUG tx power = 22

0.079223 DEBUG channel = 868300000

0.091705 DEBUG process_irq loop entered

1.571899 DEBUG process_irq satisfied: irq_flags = 0x1 in radio mode Transmit

1.572174 DEBUG TxDone in radio mode Transmit

6.522644 DEBUG sf = 12, bw = 4, cr = 1

6.523498 DEBUG channel = 868300000

6.536224 DEBUG process_irq loop entered

6.536682 DEBUG process_irq satisfied: irq_flags = 0x0 in radio mode Receive

6.536926 DEBUG process_irq loop entered

7.522766 DEBUG sf = 12, bw = 4, cr = 1

7.523651 DEBUG channel = 869525000

7.536346 DEBUG process_irq loop entered

8.586608 ERROR Radio error = RxTimeout

If I try to join using a while let loop, this log repeats endlessly (on every iteration the channel changes between 868300000, 868100000 and 868500000).

Having reviewed a description of the lsb/msb issue in the LoRaWAN group matrix chat, I now believe this may be the issue if the TTN log does not show a join accept entry. I am not sure about the appeui lsb/msb setting, since in my case it was all zeroes. appkey should be coded as shown in the TTN console.

Appeui, Deveui and Appkey are set up correctly. If they weren't, I wouldn't see anything on the Application console. I use .reverse() to change the byte order. I did not know that TTN lets me get the format for both lsb and msb.

Thank you very much for helping me.

ceekdee commented 1 year ago

The LoRaWAN configuration for join request and join accept is set up by rust-lorawan in this test case. The trace indicates transmission of the join request on 868300000, then for the join accept the first receive window at 868300000 and the second at 869525000, all at bandwidth 4 (125 kHz) and coding rate 1 (4/5).

This matches the TTN regional specs for EU868: https://www.thethingsnetwork.org/docs/lorawan/regional-parameters/

Unfortunately, this means the issue is deeper, and I cannot test EU868 for TTN since I am in US915. To debug this issue, it would be necessary to:

embassy-lora/Cargo.toml

lorawan-device = { version = "0.10.0", path ="../../rust-lorawan/device", default-features = false, features = ["async"] }

examples/stm32wl/Cargo.toml

lorawan-device = { version = "0.10.0", path ="../../../rust-lorawan/device", default-features = false, features = ["async", "external-lora-phy"] }
lorawan = { version = "0.7.3", path ="../../../rust-lorawan/encoding", default-features = false, features = ["default-crypto"] }

With that in place, there are three separate updates to try in rust-lorawan in device/src/async_device/lora_radio.rs.

This is a lot of help to request, and I apologize for that. I simply cannot test it myself. If none of those updates work, there are more tweaks to try, but the ones above are a good start. I can test the stm32wl55 successfully with TTN in US915, so at least there is hope.

lucasgranberg commented 1 year ago

The problem is that the RX windows are not open long enough. Time on air with SF12 for RX2 is 1.8s and the window is hard coded to ca. 1s. Hard coding rx2 to 2s makes it work but it is not a good solution. Some code needs to be added to extend the timeout if there is data coming in.
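To make the time-on-air figure concrete, here is a rough sketch of the standard LoRa time-on-air formula from the Semtech SX127x datasheets. The function name and parameter list are invented for this example. The inputs used in the note below are assumptions, not values taken from the trace: a 33-byte join accept (17 bytes plus a 16-byte CFList), an 8-symbol preamble, explicit header, and no payload CRC on the downlink.

```rust
/// Approximate LoRa time on air in milliseconds (Semtech SX127x formula).
/// `coding_rate` is 1..=4, meaning 4/5..=4/8. Hypothetical helper, std only.
fn lora_time_on_air_ms(
    sf: u32,
    bandwidth_hz: u32,
    payload_len: u32,
    coding_rate: u32,
    preamble_symbols: u32,
    crc_on: bool,
    explicit_header: bool,
) -> f64 {
    // Symbol duration: 2^SF / BW.
    let t_sym_ms = f64::from(1u32 << sf) * 1000.0 / f64::from(bandwidth_hz);
    // Low data rate optimisation is used when a symbol lasts longer than 16 ms.
    let de: i64 = if t_sym_ms > 16.0 { 1 } else { 0 };
    let (sf, pl, cr) = (i64::from(sf), i64::from(payload_len), i64::from(coding_rate));
    let crc = i64::from(crc_on);
    let ih = i64::from(!explicit_header); // 1 means implicit header
    let numerator = 8 * pl - 4 * sf + 28 + 16 * crc - 20 * ih;
    let denominator = 4 * (sf - 2 * de);
    // ceil(numerator / denominator), clamped at zero.
    let blocks = if numerator > 0 {
        (numerator + denominator - 1) / denominator
    } else {
        0
    };
    let payload_symbols = 8 + blocks * (cr + 4);
    // Preamble adds 4.25 symbols of sync overhead.
    (f64::from(preamble_symbols) + 4.25 + payload_symbols as f64) * t_sym_ms
}
```

Under those assumptions, `lora_time_on_air_ms(12, 125_000, 33, 1, 8, false, true)` gives about 1810 ms, in line with the 1.8 s quoted above. In the trace, the second window timed out roughly 1.05 s after it opened (7.536346 to 8.586608), so a frame of that length could not be fully received within it.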

lucasgranberg commented 1 year ago

I did this ugly hack inside rust-lorawan/device to make it work. It needs some Cargo.toml shenanigans to include the right stuff.

diff --git a/device/src/async_device/mod.rs b/device/src/async_device/mod.rs
index 51ea768..af3d520 100644
--- a/device/src/async_device/mod.rs
+++ b/device/src/async_device/mod.rs
@@ -521,7 +521,7 @@ where
         {
             // Prepare for RX using correct configuration
             let rx_config = self.region.get_rx_config(self.datarate, frame, &Window::_2);
-            let window_duration = self.phy.radio.get_rx_window_duration_ms();
+            let window_duration = self.phy.radio.get_rx_window_duration_ms()+2000;

             // Pass the full radio buffer slice to RX
             let rx_fut = self

ceekdee commented 1 year ago

@davekingdoms : there is significant work underway in the community for LoRaWAN using rust beyond what is evident in the issues/PRs within the rust-lorawan, embassy, and lora-phy repositories. If you would like to obtain more context regarding this work, there is an active chat on the topic. See the link at the end of the lora-phy README:

https://github.com/embassy-rs/lora-phy

davekingdoms commented 1 year ago

Over the last few days I ran some tests following your advice. With get_rx_window_duration_ms increased to 3000, it works. This is how I coded the join attempts:

    while let Err(err) = device.join(&JoinMode::OTAA { deveui, appeui, appkey }).await {
        match err {
            lorawan_device::async_device::Error::Radio(_) => defmt::error!("Join failed: Radio"),
            lorawan_device::async_device::Error::NetworkNotJoined => defmt::error!("Join failed: NetworkNotJoined"),
            lorawan_device::async_device::Error::UnableToPreparePayload(_) => defmt::error!("Join failed: UnableToPreparePayload"),
            lorawan_device::async_device::Error::InvalidDevAddr => defmt::error!("Join failed: InvalidDevAddr"),
            lorawan_device::async_device::Error::RxTimeout => defmt::error!("Join failed: RxTimeout"),
            lorawan_device::async_device::Error::SessionExpired => defmt::error!("Join failed: SessionExpired"),
            lorawan_device::async_device::Error::InvalidMic => defmt::error!("Join failed: InvalidMic"),
            lorawan_device::async_device::Error::UnableToDecodePayload(_) => defmt::error!("Join failed: UnableToDecodePayload"),
        }
        // Wait 20 s before the next join attempt.
        Timer::after(Duration::from_millis(20000)).await;
    }

It doesn't always connect on the first try. Most of the time it connects after two/three tries (≈40 seconds). In the worst case it connects after 9 minutes. It's not quite fixed, but it's definitely better than it was before.

I just saw this PR in rust-lorawan. Maybe the issue is related to this.

After the testing phase I will improve the delay between attempts, in accordance with this documentation.

For example, it will be something like this:

    let mut time = 15000;

    while let Err(err) = device.join(&JoinMode::OTAA { deveui, appeui, appkey }).await {
        .
        .
        .
        Timer::after(Duration::from_millis(time)).await;
        // Double the delay until it reaches 3840000 ms (64 min), then start over at 15 s.
        if time < 3840000 {
            time *= 2;
        } else {
            time = 15000;
        }
    }
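The retry schedule above can be pulled out into a small pure function, which makes it easy to check on a host machine without the radio. This is only a sketch of the same doubling-then-reset logic; the names `INITIAL_DELAY_MS`, `MAX_DELAY_MS`, `next_join_delay_ms` and `join_delay_schedule` are made up for the illustration.

```rust
/// First delay between join attempts, in milliseconds (value from the snippet above).
const INITIAL_DELAY_MS: u64 = 15_000;
/// Delay at which the schedule wraps around (value from the snippet above).
const MAX_DELAY_MS: u64 = 3_840_000;

/// Given the delay that was just used, return the delay for the next attempt:
/// double it until the maximum is reached, then start over.
fn next_join_delay_ms(current_ms: u64) -> u64 {
    if current_ms < MAX_DELAY_MS {
        current_ms * 2
    } else {
        INITIAL_DELAY_MS
    }
}

/// Collect the delays used for the first `attempts` failed join attempts.
fn join_delay_schedule(attempts: usize) -> Vec<u64> {
    let mut delay = INITIAL_DELAY_MS;
    (0..attempts)
        .map(|_| {
            let used = delay;
            delay = next_join_delay_ms(delay);
            used
        })
        .collect()
}
```

Since 15000 doubled eight times is exactly 3840000, the schedule runs through nine delays (15 s up to 64 min) and then wraps back to 15 s on the tenth failed attempt.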

@davekingdoms : there is significant work underway in the community for LoRaWAN using rust beyond what is evident in the issues/PRs within the rust-lorawan, embassy, and lora-phy repositories. If you would like to obtain more context regarding this work, there is an active chat on the topic. See the link at the end of the lora-phy README:

https://github.com/embassy-rs/lora-phy

Thanks @ceekdee, I'll check it out.

I think it's more useful to move the conversation to the right repositories (rust-lorawan and lora-phy).

Thank you all!

ceekdee commented 1 year ago

@davekingdoms : thank you for your observations above. This issue is being closed here since the two issues identified:

are pertinent to the LoRaWAN layer (rust-lorawan in this case).

ilya-epifanov commented 1 year ago

Fixed in https://github.com/ivajloip/rust-lorawan/pull/142 and https://github.com/embassy-rs/lora-phy/pull/33