Closed kYc0o closed 5 years ago
Maybe @aabadie or @fjmolinas have more insight.
I think I found the reason... After erasing the whole EEPROM the test behaves more or less correctly, although I always get what looks like a lost interrupt:
2019-06-04 14:52:21,151 - INFO # �main(): This is RIOT! (Version: 2019.04-devel-1392-gfcd7f2)
2019-06-04 14:52:21,154 - INFO # [sx127x] netdev: sx127x_on_dio1: unknown state
2019-06-04 14:52:21,155 - INFO # All up, running the shell now
I'm still investigating the real cause.
@kYc0o since lobaro-lorabox is similar to im880b cpu wise, could you see if #11314 fixes your issue? If not I'll try to reproduce.
Unfortunately it doesn't solve the problem. I see some weird behaviour on this lobaro board. I tested my application rebased onto current master on a b-l072z-lrwan1 and it works correctly. There's something specific to that board which is not compatible with what was done in #11552.
@kYc0o I'll try to reproduce on IM880b tomorrow, it's the closest hardware I have to lobaro.
Also, does the application work if you don't provide eeprom?
Thanks for the advice @fjmolinas! It actually works if I disable the eeprom. I'll investigate from there.
@kYc0o I was able to reproduce your issue, although it only appears after the first loramac save.
I can fix the issue by initializing uint8_t dr = 0 at:
No idea why... I'm guessing some kind of optimization problem.
Thanks for the pointer! Indeed that solves the hardfault problem, however, I have the impression that reading and saving data into EEPROM might take some time on some "slow" devices, like the stm32l151, or I'm missing something, since the device doesn't get downlink messages.
With periph_eeprom disabled, the device behaves correctly and sends the confirmable messages without any problem, but when it's enabled the device is unable to receive the ACK:
2019-06-05 11:50:42,538 - INFO # �[semtech-loramac] initializing loramac
2019-06-05 11:50:42,539 - INFO # [semtech-loramac] reading configuration from EEPROM
2019-06-05 11:50:42,540 - INFO # [semtech-loramac] reading uplink counter: 0
2019-06-05 11:50:42,541 - INFO # [semtech-loramac] reading rx2 freq: 869525000
2019-06-05 11:50:42,542 - INFO # [semtech-loramac] reading rx2 dr: 0
2019-06-05 11:50:42,543 - INFO # [semtech-loramac] set join state ? 1
2019-06-05 11:50:42,545 - INFO # main(): This is RIOT! (Version: 2019.04-devel-1399-g8fa40-lora_cayennelpp)
2019-06-05 11:50:42,546 - INFO # [semtech-loramac] initializing loramac
2019-06-05 11:50:42,547 - INFO # [semtech-loramac] reading configuration from EEPROM
2019-06-05 11:50:42,547 - INFO # [semtech-loramac] reading uplink counter: 0
2019-06-05 11:50:42,549 - INFO # [semtech-loramac] reading rx2 freq: 869525000
2019-06-05 11:50:42,550 - INFO # [semtech-loramac] reading rx2 dr: 0
2019-06-05 11:50:42,551 - INFO # [semtech-loramac] set join state ? 1
2019-06-05 11:50:42,552 - INFO # [semtech-loramac] Starting join procedure: 0
2019-06-05 11:50:42,552 - INFO # [semtech-loramac] network is already joined
2019-06-05 11:50:42,553 - INFO # Current Time: 2019-07-03 17:56:00
2019-06-05 11:50:42,554 - INFO # Next Alarm: 2019-07-03 17:59:00
2019-06-05 11:53:41,949 - INFO # Current Alarm: 2019-07-03 17:59:00
2019-06-05 11:53:41,952 - INFO # Current time: 2019-07-03 17:59:00
2019-06-05 11:53:41,957 - INFO # Next Alarm: 2019-07-03 18:02:00
2019-06-05 11:53:41,958 - INFO # Sending temperature...
2019-06-05 11:53:42,720 - INFO # [semtech-loramac] loramac cmd msg
2019-06-05 11:53:42,723 - INFO # [semtech-loramac] send frame g
2019-06-05 11:53:42,728 - INFO # [semtech-loramac] MCPS request: confirmed TX
2019-06-05 11:53:42,735 - INFO # [semtech-loramac] MCPS request: OK
2019-06-05 11:53:43,737 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:44,055 - INFO # [semtech-loramac] Transmission completed
2019-06-05 11:53:44,744 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:44,861 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:45,522 - INFO # [semtech-loramac] RX timer timeout
2019-06-05 11:53:45,751 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:45,865 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:46,525 - INFO # [semtech-loramac] RX timer timeout
2019-06-05 11:53:46,757 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:47,658 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:47,764 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:48,771 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:49,777 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:50,784 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:51,790 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:52,797 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:53,803 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:54,810 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:55,816 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:56,823 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:57,829 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:58,836 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:53:59,842 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 11:54:00,849 - INFO # [semtech-loramac] MAC timer timeout
.
.
.
It just keeps the timeout indefinitely. Did you experience something similar?
Are you using ABP activation mode? Are you using the TTN network? If yes to both, you must set the RX2 datarate to 3. Otherwise the device might miss messages the server sends in the RX2 window.
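To illustrate why a mismatched RX2 datarate loses downlinks, here is a minimal sketch in C. It assumes the EU868 region (the logs above show an RX2 frequency of 869525000 Hz and an RX2 datarate of 0), where DR0 to DR5 map to SF12 down to SF7 at 125 kHz. The function names `eu868_dr_to_sf` and `rx2_dr_matches` are hypothetical helpers for this example and are not part of the semtech-loramac package.

```c
#include <assert.h>
#include <stdint.h>

/* EU868 data rates DR0..DR5 use 125 kHz bandwidth with SF12 down to SF7.
 * Hypothetical helper, only valid for that range. */
static uint8_t eu868_dr_to_sf(uint8_t dr)
{
    assert(dr <= 5);
    return (uint8_t)(12 - dr);
}

/* A LoRa receiver only demodulates frames sent with the spreading factor
 * it is configured for, so the RX2 window can only catch a downlink when
 * device and network agree on the data rate. Returns 1 on a match. */
static int rx2_dr_matches(uint8_t device_dr, uint8_t network_dr)
{
    return eu868_dr_to_sf(device_dr) == eu868_dr_to_sf(network_dr);
}
```

With these helpers, `rx2_dr_matches(0, 3)` is false: a device listening at DR0 (SF12) does not receive a downlink sent at DR3 (SF9), while `rx2_dr_matches(3, 3)` is true.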
No to both, I'm using my own gateway with a RPi and an iC880a concentrator. I'll try setting the RX2 datarate to 3 anyway to see what happens. What catches my attention is that this is not needed when the EEPROM operations don't take place, so it looks more like a timing problem to me. I also noticed that such operations are faster on a b-l072z-lrwan1, due to its faster processor.
@kYc0o I usually avoid using DEBUG messages here, since they can mess up the timing of the receive windows. For example, on the IM880b I'm not able to join successfully when DEBUG messages are enabled (the same ones you have): the message is delivered but the downlink isn't captured. Can you try without DEBUG?
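As a rough way to see how debug output can eat into receive-window timing, the sketch below compares the time needed to push one debug line over a UART with the duration of a single LoRa symbol. It assumes a blocking UART at 115200 baud with 8N1 framing (10 bits on the wire per character) and a 125 kHz LoRa bandwidth; both are assumptions for illustration, not values confirmed in this thread. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Time in microseconds to transmit n_chars over a UART, assuming 8N1
 * framing: 1 start bit + 8 data bits + 1 stop bit = 10 bits per char. */
static uint32_t uart_tx_time_us(uint32_t n_chars, uint32_t baud)
{
    assert(baud > 0);
    return (uint32_t)(((uint64_t)n_chars * 10U * 1000000U) / baud);
}

/* Duration in microseconds of one LoRa symbol: 2^SF / bandwidth. */
static uint32_t lora_symbol_time_us(uint8_t sf, uint32_t bw_hz)
{
    assert(sf >= 6 && sf <= 12 && bw_hz > 0);
    return (uint32_t)((((uint64_t)1 << sf) * 1000000U) / bw_hz);
}
```

Under these assumptions, a 36-character line such as "[semtech-loramac] MAC timer timeout" plus newline takes `uart_tx_time_us(36, 115200)` = 3125 us, while one SF7 symbol lasts `lora_symbol_time_us(7, 125000)` = 1024 us. A few such lines printed at the wrong moment could therefore delay the MAC by several symbol times at low spreading factors.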
This is a completed transmission with confirmable messages reaching my server:
2019-06-05 12:06:17,237 - INFO # �[semtech-loramac] initializing loramac
2019-06-05 12:06:17,238 - INFO # main(): This is RIOT! (Version: 2019.04-devel-1399-g8fa40-lora_cayennelpp)
2019-06-05 12:06:17,239 - INFO # [semtech-loramac] initializing loramac
2019-06-05 12:06:17,240 - INFO # [semtech-loramac] Starting join procedure: 0
2019-06-05 12:06:17,241 - INFO # [semtech-loramac] loramac cmd msg
2019-06-05 12:06:17,242 - INFO # [semtech-loramac] starting OTAA join
2019-06-05 12:06:17,666 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:06:18,148 - INFO # [semtech-loramac] Transmission completed
2019-06-05 12:06:18,673 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:06:19,680 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:06:20,687 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:06:21,694 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:06:22,701 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:06:22,969 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:06:23,708 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:06:23,842 - INFO # [semtech-loramac] unexpected netdev event received: 1
2019-06-05 12:06:23,972 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:06:24,496 - INFO # [semtech-loramac] MLME confirm event
2019-06-05 12:06:24,503 - INFO # [semtech-loramac] MLME confirm msg received
2019-06-05 12:06:24,504 - INFO # [semtech-loramac] join succeeded
2019-06-05 12:06:24,506 - INFO # Current Time: 2019-07-03 17:56:00
2019-06-05 12:06:24,509 - INFO # Next Alarm: 2019-07-03 17:59:07
2019-06-05 12:06:24,513 - INFO # [semtech-loramac] Starting join procedure: 0
2019-06-05 12:06:24,517 - INFO # [semtech-loramac] network is already joined
2019-06-05 12:06:24,520 - INFO # Current Time: 2019-07-03 17:56:00
2019-06-05 12:06:24,524 - INFO # Next Alarm: 2019-07-03 17:59:07
2019-06-05 12:09:24,505 - INFO # Current Alarm: 2019-07-03 17:59:07
2019-06-05 12:09:24,509 - INFO # Current time: 2019-07-03 17:59:07
2019-06-05 12:09:24,514 - INFO # Next Alarm: 2019-07-03 18:02:07
2019-06-05 12:09:24,515 - INFO # Sending temperature...
2019-06-05 12:09:25,276 - INFO # [semtech-loramac] loramac cmd msg
2019-06-05 12:09:25,279 - INFO # [semtech-loramac] send frame g
2019-06-05 12:09:25,284 - INFO # [semtech-loramac] MCPS request: confirmed TX
2019-06-05 12:09:25,291 - INFO # [semtech-loramac] MCPS request: OK
2019-06-05 12:09:26,292 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:09:26,611 - INFO # [semtech-loramac] Transmission completed
2019-06-05 12:09:27,299 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:09:27,416 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:09:28,303 - INFO # [semtech-loramac] unexpected netdev event received: 1
2019-06-05 12:09:28,306 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:09:28,419 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:09:28,795 - INFO # [semtech-loramac] MCPS confirm event
2019-06-05 12:09:28,797 - INFO # [semtech-loramac] MCPS indication event
2019-06-05 12:09:28,802 - INFO # [semtech-loramac] MCPS confirm msg received
2019-06-05 12:09:28,805 - INFO # [semtech-loramac] MCPS confirm event OK
2019-06-05 12:09:28,809 - INFO # [semtech-loramac] MCPS confirm event: CONFIRMED
2019-06-05 12:09:28,815 - INFO # [semtech-loramac] forward TX status to sender thread
2019-06-05 12:09:28,819 - INFO # [semtech-loramac] MCPS indication msg received
2019-06-05 12:09:28,822 - INFO # [semtech-loramac] MCPS indication Unconfirmed
2019-06-05 12:09:28,826 - INFO # [semtech-loramac] MCPS indication: ACK received
2019-06-05 12:09:28,833 - INFO # [semtech-loramac] received something
2019-06-05 12:09:28,834 - INFO # Received ACK from network
As you can see, things still work with DEBUG enabled. With periph_eeprom it fails with or without DEBUG. I assume it keeps looping on the timeout anyway.
@kYc0o Regarding the confirmed message problem I'm able to get confirmed messages on im880b:
> loramajoin otaa
2019-06-05 12:37:13,893 - INFO # loramac join otaa
2019-06-05 12:37:23,257 - INFO # Join procedure succeeded!
> loramac tx asdf cnf
2019-06-05 12:37:31,164 - INFO # loramac tx asdf cnf
2019-06-05 12:37:34,662 - INFO # Received ACK from network
2019-06-05 12:37:34,664 - INFO # Message sent with success
And also with debug messages:
loramac tx asdf cnf
2019-06-05 12:40:12,575 - INFO # loramac tx asdf cnf
2019-06-05 12:40:12,578 - INFO # [semtech-loramac] loramac cmd msg
2019-06-05 12:40:12,580 - INFO # [semtech-loramac] send frame asdf
2019-06-05 12:40:12,584 - INFO # [semtech-loramac] MCPS request: confirmed TX
2019-06-05 12:40:12,592 - INFO # [semtech-loramac] MCPS request: OK
2019-06-05 12:40:13,591 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:40:13,912 - INFO # [semtech-loramac] Transmission completed
2019-06-05 12:40:14,594 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:40:14,716 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:40:15,376 - INFO # [semtech-loramac] RX timer timeout
2019-06-05 12:40:15,597 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:40:15,845 - INFO # [semtech-loramac] MAC timer timeout
2019-06-05 12:40:15,999 - INFO # [semtech-loramac] unexpected netdev event received: 1
2019-06-05 12:40:16,060 - INFO # [semtech-loramac] MCPS confirm event
2019-06-05 12:40:16,064 - INFO # [semtech-loramac] MCPS indication event
2019-06-05 12:40:16,067 - INFO # [semtech-loramac] MCPS confirm msg received
2019-06-05 12:40:16,071 - INFO # [semtech-loramac] MCPS confirm event OK
2019-06-05 12:40:16,075 - INFO # [semtech-loramac] saving uplink counter: 1
2019-06-05 12:40:16,106 - INFO # [semtech-loramac] MCPS confirm event: CONFIRMED
2019-06-05 12:40:16,110 - INFO # [semtech-loramac] forward TX status to sender thread
2019-06-05 12:40:16,114 - INFO # [semtech-loramac] MCPS indication msg received
2019-06-05 12:40:16,119 - INFO # [semtech-loramac] MCPS indication Unconfirmed
2019-06-05 12:40:16,123 - INFO # [semtech-loramac] MCPS indication: ACK received
2019-06-05 12:40:16,126 - INFO # [semtech-loramac] received something
2019-06-05 12:40:16,128 - INFO # Received ACK from network
2019-06-05 12:40:16,131 - INFO # Message sent with success
Anyway, regarding the optimization issue causing the HARDFAULT, I have been trying to find the reason for it but haven't found one. Initializing the variables fixes the issue but I don't know why. Any ideas @aabadie? A fix could be to simply initialize them all.
/* Read RX2 datarate */
uint8_t dr = 0;
pos += eeprom_read(pos, &dr, 1);
DEBUG("[semtech-loramac] reading rx2 dr: %d\n", dr);
semtech_loramac_set_rx2_dr(mac, dr);
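The snippet above can be turned into a small self-contained sketch of the "initialize before reading" pattern. This does not explain the hardfault; it only shows that a pre-initialized variable keeps a defined value when a read copies nothing. The RAM-backed `mock_eeprom_read` and the helper `read_rx2_dr` are hypothetical stand-ins written for this example, and the truncating behaviour of the mock is an assumption, not a statement about RIOT's `eeprom_read`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the EEPROM peripheral: a small RAM buffer. */
#define MOCK_EEPROM_SIZE 16U
static uint8_t mock_eeprom[MOCK_EEPROM_SIZE];

/* Hypothetical reader: copies up to len bytes starting at pos and returns
 * the number of bytes copied. Reads past the end are truncated, so the
 * caller's buffer may be left partially or entirely unwritten. */
static size_t mock_eeprom_read(uint32_t pos, void *data, size_t len)
{
    assert(data != NULL);
    if (pos >= MOCK_EEPROM_SIZE) {
        return 0;
    }
    size_t avail = MOCK_EEPROM_SIZE - pos;
    size_t n = (len < avail) ? len : avail;
    memcpy(data, &mock_eeprom[pos], n);
    return n;
}

/* Read one byte with a defined fallback: dr is initialized before the
 * read, so it still holds `fallback` if nothing was copied. */
static uint8_t read_rx2_dr(uint32_t pos, uint8_t fallback)
{
    uint8_t dr = fallback;
    (void)mock_eeprom_read(pos, &dr, 1);
    return dr;
}
```

Calling `read_rx2_dr` with a position inside the buffer returns the stored byte, while a position past the end returns the fallback instead of an indeterminate value.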
@fjmolinas can you try to send the confirmable messages after a reset with settings saved to eeprom? (when the settings are saved and no join is necessary).
@kYc0o I'm getting the same behaviour as you. I'll keep looking into it.
The only thing that is stored on EEPROM in the MAC context is the uplink counter value, see here.
All other information is read once during initialization or written when the user explicitly requests it (for example when calling loramac save from the shell).
Can you try to comment out the call to _save_uplink_counter from the event loop? Then erase the information on EEPROM, join the network, save, reboot and send.
I did it and I get the same behaviour. I now suspect #11541, which completely changed the receiving mechanism.
But you said initially that it was working without the periph_eeprom feature. And your bisect doesn't point to #11541.
True, I'll need to run more tests, removing some of the newly introduced commits to check what exactly is triggering the problems.
@kYc0o were you able to isolate the problem from your new commits? Are you still facing this problem?
Some good news with this issue: I was able to reproduce the issue on im880b board (same CPU) and could fix it (hopefully).
In fact, I found several issues:
- [...] eeprom_read instead of eeprom_read_byte. It is still unclear to me why this fixes the crash.
- [...] LoRaMacMibGetRequestConfirm/LoRaMacMibSetRequestConfirm functions. I have a fix for this but I still need to see if I can do better.
- [...] tests/pkg_semtech-loramac: once by auto_init_loramac and once by the main function. I'll open a PR to remove the initialization done in the main function, this is an easy one.
I still need to do more testing before opening PRs.
@fjmolinas yes the issue is still present.
@aabadie thanks for your investigations! I'll test your fixes and try to get a bit deeper into the reasons. I'll post if I find something.
@kYc0o #11783 should solve your issue. I also opened #11777 to address the last point.
Description
Apparently, #11552 broke compatibility with the lobaro-lorabox board.
According to bisect:
Steps to reproduce the issue
Build tests/pkg_semtech-loramac on current master with the lobaro-lorabox board, and flash it:
Expected results
Actual results
Versions