Open chrissnow opened 3 years ago
We have confirmed that Nucleo-L152RE board + SX1262MB2xAS shield has the same problem
using mbed-os-example-lorawan
"target.macros_add": ["MBED_TICKLESS=1"],
"events.use-lowpower-timer-ticker": true,
We agree that enabling TICKLESS with STM32L1 is not recommended.
That's rather bad news for us. Is there any particular reason, and is there anything we can do to make it work better? It seems to nearly work.
@jeromecoutant I've done a bit more testing with a WL55JC1, and enabling tickless on that also breaks things: joins and downlinks become very unreliable.
I'd hope that tickless should work on a WL55? The fault seems common across multiple families.
STM32WL is tickless by default https://github.com/ARMmbed/mbed-os/blob/master/targets/targets.json#L4236
Interesting, perhaps it's
"events.use-lowpower-timer-ticker": true
causing the trouble then.
I will try without it.
Thank you for raising this detailed GitHub issue. I am now notifying our internal issue triagers. Internal Jira reference: https://jira.arm.com/browse/IOTOSM-3696
Something very odd going on here.. I thought I had the WL55 working well, but I'm not convinced anymore.
At the moment, tickless or not, I can't get the WL55 reliable. My only change to the LoRaWAN example is to add some keys and make each uplink confirmed (every 10 seconds).
Our custom target seems more reliable without tickless, but I'm not certain of it.
Still testing and will report back what I can. It may not be STM related in the end, but these are the only targets I can easily run LoRaWAN on.
It seems that #11502 is the true cause: tickless or not, the WL55 is unusable without
"lora.max-sys-rx-error": 200
Which is really rather wasteful energy wise. I can't believe that we need to give 200ms either side of the expected RX window to be able to reliably get a downlink.
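To put the 200 ms figure in perspective, the sketch below computes the LoRa symbol time, T_sym = 2^SF / BW, and expresses a timing margin as a number of symbols. The 125 kHz bandwidth used in the note underneath is an assumption (typical for EU868), and `margin_in_symbols` is a hypothetical helper, not part of the Mbed LoRaWAN stack.

```cpp
#include <cassert>
#include <cstdint>

// Duration of one LoRa symbol in milliseconds: T_sym = 2^SF / BW.
double lora_symbol_time_ms(uint8_t spreading_factor, uint32_t bandwidth_hz)
{
    assert(spreading_factor >= 5 && spreading_factor <= 12);
    assert(bandwidth_hz > 0);
    return (static_cast<double>(1UL << spreading_factor) * 1000.0) / bandwidth_hz;
}

// Hypothetical helper: how many symbol periods fit into a timing margin.
double margin_in_symbols(double margin_ms, uint8_t spreading_factor, uint32_t bandwidth_hz)
{
    return margin_ms / lora_symbol_time_ms(spreading_factor, bandwidth_hz);
}
```

For example, `lora_symbol_time_ms(7, 125000)` gives 1.024 ms, so a 200 ms margin at SF7 corresponds to roughly 195 symbol periods.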
@jeromecoutant @0xc0170 I can spend some time on this early next week but I'm not really sure how best to debug it, or exactly how it's meant to work in the first place...
I'm not sure if this is STM32 specific at the moment.
Without fixing this, LoRaWAN support in Mbed is pretty much unusable. The workaround isn't really suitable for production.
the WL55 is not unusable without "lora.max-sys-rx-error": 200
@ludoch-stm Maybe you could have some idea ? Thx
Having thought about this a bit more: if #11502 is correct about it being SF dependent, then perhaps the timing of the RX window opening is not the problem, given that it's the same for all SFs. What does differ is how long the radio is left in RX (or how long the stack waits for it to complete). Could the stack be giving up midway through successfully receiving the data?
I will try and get some timing data off a logic analyser.
@ludoch-stm Apologies for chasing. Any help would be greatly appreciated; after 2 days of debugging I have not made any progress :-(
I have made some progress in debugging the problem.
Build configurations, all with tickless + use-lowpower-timer-ticker, though building without them makes no difference:
xDot_L151CC, internal SX1272, works perfectly.
NUCLEO_L152RE + SX1272MB2xAS, works perfectly.
NUCLEO_F446RE + SX1272MB2xAS, works perfectly.
NUCLEO_L152RE + SX126xMB2xAS, SF7 rarely works, SF8 & SF9 work sometimes, SF10 is reliable; increasing max-sys-rx-error makes things worse.
NUCLEO_WL55JC1, similar to L152, but max-sys-rx-error 200 makes it reliable.
NUCLEO_F446RE + SX126xMB2xAS, works perfectly.
Based on this I think there are multiple issues. Something is different between how the two radios work, and something is wrong with the WL55.
The mbed app I used for testing is attached; you will need to add keys. The only change to the example is to send confirmed messages.
retcode = lorawan.send(MBED_CONF_LORA_APP_PORT, tx_buffer, packet_len,
MSG_CONFIRMED_FLAG);
@0xc0170 are you able to get any support from whoever is responsible for the LoRa drivers?
Hi Chris,
This rxerror parameter is very sensitive: it depends on the radio shield and affects when the RX window opens, as you said previously. To understand its effect, you can find attached the drawing concerning the Window Timeout and Window Offset definitions. If it is configured to too high a value, the RX window could overlap the Tx and/or RX2 windows, leading to unexpected behavior. I see you are using an STM32WL with US regional parameters and SF7 to SF10 configs. Could you describe your setup, in case I have missed some other configs? On Mbed OS, STM32WL has been validated with max-sys-rx-error = 5, and in the STM32CubeWL package it is validated with value = 10 in the LoRa stack. So setting this value to 200 seems really high.
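As a rough illustration of how the RX error feeds into Window Timeout and Window Offset, here is a sketch modelled on the expression used in older releases of Semtech's reference LoRaMac-node stack. Whether Mbed's LoRaPHY uses exactly this expression is an assumption, as are the struct and function names and the `min_rx_symbols` value of 6 used in the note underneath.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct RxWindowParams {
    uint32_t timeout_symbols; // how long the radio listens, in symbols
    int32_t offset_ms;        // shift applied to the nominal window start
};

// Sketch only: widen the window by rx_error_ms on each side, then centre it.
RxWindowParams compute_rx_window(double t_symbol_ms, uint8_t min_rx_symbols,
                                 uint32_t rx_error_ms, uint32_t wake_up_ms)
{
    assert(t_symbol_ms > 0.0);
    const double needed_ms = (2.0 * min_rx_symbols - 8.0) * t_symbol_ms + 2.0 * rx_error_ms;
    // Never listen for fewer symbols than the radio needs to detect a preamble.
    const double symbols = std::max(std::ceil(needed_ms / t_symbol_ms),
                                    static_cast<double>(min_rx_symbols));
    RxWindowParams p;
    p.timeout_symbols = static_cast<uint32_t>(symbols);
    p.offset_ms = static_cast<int32_t>(std::ceil(
        4.0 * t_symbol_ms - (p.timeout_symbols * t_symbol_ms) / 2.0 - wake_up_ms));
    return p;
}
```

With a 1.024 ms symbol (SF7 at 125 kHz), 6 minimum symbols and no wake-up time, an RX error of 10 gives a 24-symbol window opened 8 ms early, while an RX error of 200 gives a 395-symbol window (about 404 ms) opened 198 ms early.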
The issue could come from several causes:
Hi,
My early findings might be a bit confusing. I will try to clear a few things up, as we have been testing multiple regions, targets and radios.
Let's simplify it a bit!
WL55JC mbed-os-example-lorawan EU868 TTN as network
If I build and change the messages to always be confirmed, then once the SF lowers the downlinks are no longer received. However, max-sys-rx-error = 10 is enough to make the WL55 reliable.
So that is an easy fix for the WL55.
STM32L1
We have a custom board that is in production but waiting on a firmware release (the WL55JC didn't exist at the time; we will move to it later in the year). I will try finer increments of max-sys-rx-error and see if I can get it to work. The odd thing is that the SX1272 works perfectly, which really only leaves the SX126X driver, since the timing is done outside that.
We see a timeout IRQ even when we have a large max-sys-rx-error, which I think is because, despite the timeout in the RX command being set to forever, it will still time out after a number of symbol times?
I will get some logic traces and report back.
Thanks for that doc, explained much better than the other docs I have.
@ludoch-stm I have now narrowed the problem down further. max-sys-rx-error = 10 is enough to make the WL55 reliable on EU868. However, it is not reliable on US915. I'm just building it with 20 to see if it helps.
Have you validated US915 or just EU868?
More progress...
Seems to be related to
MBED_CONF_LORA_DOWNLINK_PREAMBLE_LENGTH
which defaults to 5. However,
"lora.downlink-preamble-length": 9
Fixes US915
8 also seems fine. I'm not entirely sure of the correct number, though.
Things I have found suspicious
I wonder if that comment is true for the SX127X but not the SX126X? I haven't seen reference to this in the datasheet.
This change hasn't broken EU868 either.
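For a sense of scale, the sketch below computes how long a LoRa preamble lasts on air, using the usual datasheet expression T_preamble = (n_preamble + 4.25) * T_sym. The function name is hypothetical. The bandwidths in the note underneath assume 125 kHz for EU868 downlinks and 500 kHz for US915 downlinks, as given in the LoRaWAN regional parameters.

```cpp
#include <cassert>
#include <cstdint>

// On-air duration of a LoRa preamble in milliseconds.
// The extra 4.25 symbols cover the sync word and start-of-frame delimiter.
double preamble_duration_ms(uint16_t preamble_symbols, uint8_t sf, uint32_t bw_hz)
{
    assert(sf >= 5 && sf <= 12);
    assert(bw_hz > 0);
    const double t_symbol_ms = (static_cast<double>(1UL << sf) * 1000.0) / bw_hz;
    return (preamble_symbols + 4.25) * t_symbol_ms;
}
```

At SF7, an 8-symbol preamble lasts about 12.5 ms at 125 kHz but only about 3.1 ms at 500 kHz, so a late-opening RX window has roughly a quarter of the slack on a 500 kHz downlink. Whether that accounts for the US915 behaviour seen here is a guess, not something confirmed in this thread.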
As the issue is present in US915 band, did you check that your configuration is in Hybrid mode?
To do so, you should configure in mbed_config.h:
Also, how many channels does your gateway have: 8 or 64?
We're using FSB2 hybrid, so channels 8-15 + 65, with an 8-channel gateway. We have an FSB mask to match that, but we use OTAA, so the NS dictates the channels after the join. The frequencies all look correct during operation with tracing enabled.
This seems to work well for us, on both the WL55 and NUCLEO_L152RE + SX126xMB2xAS:
"lora.max-sys-rx-error": 10,
"events.use-lowpower-timer-ticker": true,
"target.macros_add": ["MBED_TICKLESS=1"],
"lora.downlink-preamble-length": 9
OK, good news if it works now in your environment! Perhaps the topic of the conversation can be changed then :-)
Done,
Are you going to handle the WL55 max-sys-rx-error needing to be 10? Any thoughts on the correct preamble length?
@chrissnow @jeromecoutant can this be closed once #14481 is merged?
I would say yes...
The WL55 behaves with #14481, but other targets do not. I'm pretty sure the default preamble is wrong for all SX126X targets in US915; this really needs confirmation from someone who knows more about LoRa than me.
@chrissnow Thanks, I'm not alone with this issue. Can you confirm that on STM32WL only
"lora.downlink-preamble-length": 9
is needed, since #14481 fixes rx-error?
@hallard It's been a few years but yes I think so.
Thanks, will report back. We have deployed a lot in the EU, but this is our first try in the US, and these downlink issues on STM32WL drove us mad.
I've just done some tests and looked into the code: there are 2 preamble lengths in Mbed, one for uplink and one for downlink.
Looking at CMakeLists.txt, the values for all frequencies are as follows:
MBED_CONF_LORA_DOWNLINK_PREAMBLE_LENGTH=5
MBED_CONF_LORA_UPLINK_PREAMBLE_LENGTH=8
So uplink follows the specification with 8,
but downlink, with 5, does not.
Default compiled program for STM32WL state
So my guess, as @chrissnow stated, is that lora.downlink-preamble-length
should be aligned with uplink and set to 8.
We flashed 5 devices in the US with this new setting, and they got their downlink first time, like a charm; before, some never got it.
I can do a PR. @jeromecoutant, let me know whether I should do it for all devices or just for STM32WL in /connectivity/lorawan/mbed_lib.json
If you have verified the new DL value only with STM32WL, maybe it is safer to update only STM32WL?
Agreed, that's safer, even if I'm pretty sure it will improve downlinks for other targets. I would really like to understand why this value was set to 5 instead of 8 as in the specification; there is surely a reason that we are unaware of.
and here we go #15459
Description of defect
We are using an STM32L151CC on a custom PCB with an SX1262 radio; it's roughly based on a Nucleo-L152RE board + SX1262MB2xAS design.
Everything works reliably until we enable tickless at which point join and downlinks become unreliable, probably <50% success rate. The network is receiving and replying so it must be a timing problem, likely the RX1 and RX2 slots are not timed well enough.
I appreciate it's a bit custom but the underlying fault is within Mbed somewhere, and likely affects other targets.
I will try and reproduce it on a "normal" target.
We're already well behind schedule on the project, so any help would be greatly appreciated!
Target(s) affected by this defect ?
STM32L151CC
Toolchain(s) (name and version) displaying this defect ?
ARMC6
What version of Mbed-os are you using (tag or sha) ?
mbed-os-6.9.0, though we had the same issue with 6.5.0 too
What version(s) of tools are you using. List all that apply (E.g. mbed-cli)
mbed-cli 1.10.5
How is this defect reproduced ?
We are working out the easiest way for someone else to reproduce it, probably a Nucleo-L152RE board + SX1262MB2xAS shield.
An xDot might have the same problem but has a different radio.
We have this in our mbed_app