Closed yegorich closed 5 years ago
@yegorich IMHO, the issue title is somehow misleading. It suggests that it is a problem of threaded applications and threaded application aren't working at all. There are a large number of threaded applications like examples/gnrc_networking
which use a lot of threads and are working without any problems. Could you find a subject that is more sepecific?
I have such a feeling that applications providing a shell don't suffer from this issue. lorawan example just ends with a return after sending the first message:
/* trigger the first send */
msg_t msg;
msg_send(&msg, sender_pid);
return 0;
Is there any official way to end the main thread in an event loop?
lorawan
example just ends with a return after sending the first message:
This shouldn't be the reason. According to your log, this return
isn't reached at all. Otherwise you would have the Starting join procedure
message in your log.
examples/lorawan
as well as tests/pkg_semtech_lorawan
initialize the SemTech SX127X in the main
thread. My impression is that any of the semtech_lorawan_*
functions before
semtech_loramac_init(&loramac);
semtech_loramac_set_deveui(&loramac, deveui);
semtech_loramac_set_appeui(&loramac, appeui);
semtech_loramac_set_appkey(&loramac, appkey);
semtech_loramac_set_dr(&loramac, LORAMAC_DR_5);
does not return. Could you identify which function does not return by inserting printf debug messages? My guess is that semtech_loramac_init
doesn't return?
As I understood, tests/pkg_semtech_lorawan
is working and you are able to call all these functions step by step from shell. Right?
Yes, tests/pkg_semtech_lorawan
is working and I am able to call all these functions step by step from shell.
lorawan initialization never goes beyond LoRaMacInitialization
in pkg/semtech-loramac/contrib/semtech_loramac.c
.
How can I add printfs to LoRaMac-node? Every time I invoke make it seems to clone this repository from scratch and all my changes will be removed:
user@ubuntuat:~/Documents/versioned/RIOT$ make BOARD=esp32-ttgo-t-beam -C examples/lorawan/ flash term
make: Entering directory '/home/user/Documents/versioned/RIOT/examples/lorawan'
Building application "lorawan" for "esp32-ttgo-t-beam" with MCU "esp32".
if [ 3bfef71b1bea135ebd7f4cb8a2c7ccd741018441 != 1cdd9ccec4c9f05b616e7112059be4a9e358c571 ] ; then \
git -C /home/user/Documents/versioned/RIOT/examples/lorawan/bin/pkg/esp32-ttgo-t-beam/semtech-loramac clean -xdff ; \
git -C /home/user/Documents/versioned/RIOT/examples/lorawan/bin/pkg/esp32-ttgo-t-beam/semtech-loramac fetch "https://github.com/Lora-net/LoRaMac-node.git" "1cdd9ccec4c9f05b616e7112059be4a9e358c571" ; \
git -C /home/user/Documents/versioned/RIOT/examples/lorawan/bin/pkg/esp32-ttgo-t-beam/semtech-loramac checkout -f 1cdd9ccec4c9f05b616e7112059be4a9e358c571 ; \
touch /home/user/Documents/versioned/RIOT/examples/lorawan/bin/pkg/esp32-ttgo-t-beam/semtech-loramac/.git-downloaded ; \
fi
Removing .git-downloaded
Removing .git-patched
Removing Makefile
Removing src/boards/mcu/Makefile
Removing src/mac/Makefile
Removing src/mac/region/Makefile
Removing src/system/crypto/Makefile
From https://github.com/Lora-net/LoRaMac-node
* branch 1cdd9ccec4c9f05b616e7112059be4a9e358c571 -> FETCH_HEAD
Warning: you are leaving 3 commits behind, not connected to
any of your branches:
3bfef71b Add RU864 LoRaWAN region
309250ed Applying: patch utilities functions
4794848f adapt to RIOT
The only way to change something in the source code of a package is to generate a patch file in pkg/semtech_lorwan/patches
, see section Packages
OK. I see. Btw adding a shell to lorawan example automagically solves my problem:
/* trigger the first send */
msg_t msg;
msg_send(&msg, sender_pid);
char line_buff[SHELL_DEFAULT_BUFSIZE];
shell_run(NULL, line_buff, SHELL_DEFAULT_BUFSIZE);
return 0;
Btw adding a shell to lorawan example automagically solves my problem
Confused :confused:
BTW, the threads that are running are exactly the same for both applications when LoRaMacInitialization
is called.
tests/pkg_semtech_loramac:
pid | name | state Q | pri | stack ( used) | base addr | current
- | isr_stack | - - | - | 2048 ( 0) | 0x3ffb0220 | 0x3ffb0a20
1 | idle | pending Q | 31 | 2048 ( 300) | 0x3ffb190c | 0x3ffb1fe0
2 | main | running Q | 15 | 3072 ( 1228) | 0x3ffb210c | 0x3ffb2a40
3 | recv_thread | bl rx _ | 14 | 2048 ( 492) | 0x3ffb2fec | 0x3ffb3600
| SUM | | | 9216 ( 2020)
examples/lorawan:
pid | name | state Q | pri | stack ( used) | base addr | current
- | isr_stack | - - | - | 2048 ( 0) | 0x3ffb0220 | 0x3ffb0a20
1 | idle | pending Q | 31 | 2048 ( 292) | 0x3ffb1804 | 0x3ffb1ee0
2 | main | running Q | 15 | 3072 ( 1124) | 0x3ffb2004 | 0x3ffb29a0
3 | recv_thread | bl rx _ | 14 | 2048 ( 484) | 0x3ffb34e4 | 0x3ffb3b00
| SUM | | | 9216 ( 1900)
Btw adding a shell to lorawan example automagically solves my problem:
I thought that it might have to do with some module dependencies when module shell
is added, but the only differences I see is the module itself :worried: So the question is, what makes the difference?
This is something I would also like to know. This happens either only to me or T-Beam. @fcgdam has ported TTGO Lora V1 to RIOT (https://github.com/fcgdam/RIOT_TTGOESP32_LORA_V1) and doesn't seem to encounter this issue.
To make the things even more confusing, just disabling DEVELHELP
makes loarwan example run without issues:
make DEVELHELP=0 BOARD=esp32-ttgo-t-beam -C examples/lorawan/ flash term
Hi: I had issues, but not board resets. My issue was that the semtech_loramac_send blocked and never returned. I've checked and I could see that the TX DONE event was triggered but was never propagated to the calling thread. I've wasted a few days with it, but since before starting work I would do a git pull on the full RIOT repository, out of the blue one day started to work.
Edit: Remember it: The board definition for the SX1276 pins did change.
Can you try this definition: `//#define SX127X_PARAM_SPI (SPI_DEV(0)) // Yes, it is commented out
.nss_pin = SX127X_PARAM_SPI_NSS, \
.reset_pin = SX127X_PARAM_RESET, \
.dio0_pin = SX127X_PARAM_DIO0, \
.dio1_pin = SX127X_PARAM_DIO1, \
.dio2_pin = SX127X_PARAM_DIO2, \
.dio3_pin = SX127X_PARAM_DIO3, \
.paselect = SX127X_PARAM_PASELECT \
}`
Hi: Can confirm. With my board definition and the latest pull from RIOT, the board also boot loops.
Thanks for the test. Good to know, I am not the only one. Have you also tried to compile with DEVELHELP=0
?
I have the same issue with an application reading GPS and trying to parse it. The basic structure is the same as tests/periph_uart
but without a shell. As soon as I get \n
, the buffer will be parsed in a printer
thread using minmea
. With DEVELHELP=1
the program provokes resets, with DEVELHELP=0
everything is working as expected.
So it is not LoRa related.
@fcgdam
With my board definition and the latest pull from RIOT, the board also boot loops.
Does this mean that it worked before but not anymore? Could you try it again with an older commit to be sure that it is not related to PR #10883?
git checkout 33a4b307a10b9cb456f85bf36ca440b5cf0ff44b
Hi! I was testing that right now! What a coincidence :) I was backtracking the commits to see what is the one that breaks things: So: 9aa8c619c1b984804b7fb8e27395b0965d1581d4 -> Works OK. (I was testing this random commit) 33a4b307a10b9cb456f85bf36ca440b5cf0ff44b -> Works OK.
With this commit: 43dedfcd2c52c15fb083e0d984c23c6f6e550cb6 it fails.
So the issue is somewhere behind this commit April the 18th.
Edit: Commit from 15 of April: 9aa8c619c1b984804b7fb8e27395b0965d1581d4 -> Works
Ok, so: The commit from 15th of April it works: 9aa8c619c1b984804b7fb8e27395b0965d1581d4 -> OK
The next commits that are even not related to the ESP fail... bbbb62b5718ce66bffdf384b3a0a3854341daab1 -> fails! Also 75d9e7892cd825ba1c50e20a68f96eda67b01b6b -> Fails! The next commit: 8c708110e53bbd5d292f9760df07e588f49866da -> Fails
And from this point everything fails until the HEAD.
Strange indeed!
Commit 9aa8c619c1b984804b7fb8e27395b0965d1581d4 is derived from a master before PR #10883 was merged. You would have to compare commit 33a4b307a10b9cb456f85bf36ca440b5cf0ff44b (the commit on which PR #10883 bases) with commit 35c617e08a25238d3c7affe3fe535c7ca2a7658d (the first commit after the merge of PR #10883).
If 33a4b307a10b9cb456f85bf36ca440b5cf0ff44b works but 35c617e08a25238d3c7affe3fe535c7ca2a7658d not, the problem might be (or better is :sunglasses:) caused by the refactoring of Xtensa vendor code with PR #10883 what would seem to be reasonable. Before PR #10883, esp8266
and esp32
were using their own implementations. Now, the implementations are merged to esp_common
.
Hi:
Yes: Commit 33a4b307a10b9cb456f85bf36ca440b5cf0ff44b works, but 35c617e08a25238d3c7affe3fe535c7ca2a7658d does not.
Thank you a lot for testing :smile: Even though these are really bad news, they make sense.
I will have to compare line by line what the differences are. Unfortunatly, I made these changes almmost a half year ago and I don't remember exactly how big the differences in case of esp32
are. As I remember, I had to change the esp8266
code much more than the esp32
code.
@yegorich Unfortunately, I'm quite busy with my business and don't know when I will have enough time to check the code. Would it be a big problem if it takes 2 or 3 weeks. I will do it as soon as I can.
I could figure out the cause of the problem. With commit 28d9599d52cf07d7282da1931cb240c027fea1f6, I introduced the following "fix", at least I thought it would be a fix.
--- a/cpu/esp32/thread_arch.c
+++ b/cpu/esp32/thread_arch.c
@@ -186,1 +186,1 @@ char* thread_stack_init(thread_task_func_t task_func, void *arg, void *stack_sta
- p = (uint32_t *)(((uint32_t) top_of_stack+1 - XT_CP_SIZE));
+ p = (uint32_t *)(((uint32_t)(top_of_stack + 1) - XT_CP_SIZE) & ~0xf);
But in fact, it was a bug :dark_sunglasses: Since top_of_stack
isn't aligned to 16 byte address and top_of_stack
as well ass XT_CP_SIZE
are used unaligned in cpu/esp_common/vendor/xtensa/portasm.S
, this alignment down to the next 16 byte address stores a wrong pointer to the coprocessor save area in the initial thread context :grimacing:
Reverting the change seems to work. Even though I have no working LoRaWAN gateway, I could see the following outputs with my Heltec WiFi LoRa 32 V2.
Starting RIOT kernel on PRO cpu
I (247) [main_trampoline]: main(): This is RIOT! (Version: 2019.07-devel-141-g203c-boards/esp32-heltec-lora32)
LoRaWAN Class A low-power application
=====================================
[semtech-loramac] initializing loramac
Starting join procedure
[semtech-loramac] Starting join procedure: 0
[semtech-loramac] loramac cmd msg
[semtech-loramac] starting OTAA join
[semtech-loramac] Transmission completed
[semtech-loramac] MAC timer timeout
[semtech-loramac] MAC timer timeout
[semtech-loramac] MAC timer timeout
[semtech-loramac] MAC timer timeout
[semtech-loramac] MAC timer timeout
[semtech-loramac] MAC timer timeout
[semtech-loramac] MAC timer timeout
[semtech-loramac] MAC timer timeout
[semtech-loramac] MAC timer timeout
[semtech-loramac] MAC timer timeout
[semtech-loramac] RX timer timeout
[semtech-loramac] MAC timer timeout
[semtech-loramac] MLME confirm event
[semtech-loramac] join not successful
[semtech-loramac] loramac join notification msg
Join procedure failed
I will provide a PR which fixes the problem. BTW, esp8266
has same problem but since it doesn't have a coprocessor it doesn't matter there.
Hi thanks:.
No issue whatsoever waiting for a fix for me. I'm also quite busy with other things :)
Not sure about @yegorich.
@gschorcht great news! That was fast. I can test your fix next week as soon as I'm in my office. Btw. I'm using N-Fuse card for LoRaWAN tests.
@fcgdam have you thought about upstreaming your board definition?
Hi!
Did the change manually (removed the & ~0xf) and it works now! So the issue seems to be this. At home I only have a single channel gateway, so I use that to do some basic tests, it seems to work now. For full tests only at my office (I also have a N-Fuse based gateway).
@yegorich Regarding up-streaming my board definition not quite sure if its at a good RIOT-OS standard (Missing Arduino pins, and so on), so that was the main reason why I didn't create a PR, and so let users find it out the board definition by themselves :)
@fcgdam I think Adruino pins are optional. I'm going to drop them. If someone really needs this functionality, it can be added later.
Btw. you don't have to specify #define SX127X_PARAMS
as they are defined in drivers/sx127x/include/sx127x_params.h
.
I'm porting TTGO T-Beam board to RIOT (#11418). Executing
examples/lorawan
produces following output:tests/pkg_semtech-loramac
is working without an issue and I can join a LoRaWAN network and send/receive packets.@gschorcht suggested to enable syscall debug messages to see when
system_wdt_feed
is called the last time. The console output shows that it will be called before loramac init and 6 seconds later WDT resets the system.Steps to reproduce the issue
Flash examples/lorawan on a TTGO T-Beam board.
Expected results
LoRaWAN example starts and tries to join the network and sends packets.
Actual results
The system will be reset via WDT.
Versions
Riotdocker was used to build the application.