Closed RTK221B closed 2 years ago
I'm seeing similar issues appears to have manifested this past week. Here is a problem description associated with the RAK/wisblock mappers. Join REQ_ACC looping.pdf
Could we know what is the hotspot in use? Could user also try threw a different hotspot just in case?
I just performed a Console upgrade, was previously on v2.2.14 and my device worked fine (albeit there were many grpc crash logs in the router). However, now my device is looping and not being accepted when joining:
Update: After about 40 failures, it finally connected.
Tried disconnecting and re-connecting and it is back to looping!
12:52:41.514 -> <AS923> Using TX Frequency 923400000
12:52:41.514 -> <LM> OnRadioTxDone
12:52:41.514 -> <LM> OnRadioTxDone => RX Windows #1 5002 #2 6002
12:52:41.514 -> <LM> OnRadioTxDone => TX was Join Request
12:52:43.809 -> <RADIO> RX window timeout = 3000
12:52:56.456 -> <RADIO> RadioIrqProcess => IRQ_RX_TX_TIMEOUT
12:52:56.456 -> <LM> OnRadioRxTimeout
12:52:56.456 -> <LM> Join network failed 57 time(s)
This just happened to me for the first time. I've just onboarded 8 same devices, and it only happened to one. event-debug (27).json.zip
UPDATE: out of 20 same devices, one gets stuck in this loop and never joins no matter what I do. I tried deleting the device and re-adding it to the console. UPDATE 2: deleting and re-adding did not work - changing the device EUI on both device & console did work.
i have this problem since months. on more than 50 devices it happens for around 20% (could be much higher, i cannot verify that however). Different hotspots, differen locations, sometimes device next to gateway.
Just wanted to confirm we are facing the same issue with the Dragino LPS8. I can steadily reproduce the issue with the Dragino LPS8. We tested both EU868 and US915 versions of the LPS8 to see if that had any impact but both got stuck in the Join loop. It seems that the LPS8 doesn't actually send the downlink. Official Helium gateway seems to work fine.
We sometimes see this in the log. But it actually has the region parameters.
Wed Jul 27 08:44:13 2022 daemon.info helium_gateway[9253]: WARN ignoring downlink, no region params, module: gateway
We will try reverting to older FW of the Dragino to see if that solves anything.
Just wanted to confirm we are facing the same issue with the Dragino LPS8. I can steadily reproduce the issue with the Dragino LPS8. We tested both EU868 and US915 versions of the LPS8 to see if that had any impact but both got stuck in the Join loop. It seems that the LPS8 doesn't actually send the downlink. Official Helium gateway seems to work fine.
We sometimes see this in the log. But it actually has the region parameters.
Wed Jul 27 08:44:13 2022 daemon.info helium_gateway[9253]: WARN ignoring downlink, no region params, module: gateway
We will try reverting to older FW of the Dragino to see if that solves anything.
Try asserting a location for the hotspot with that error? Downlinks need to know the region parameters to regulate maximum tx power. This requires the hotspot to be location asserted
Just wanted to confirm we are facing the same issue with the Dragino LPS8. I can steadily reproduce the issue with the Dragino LPS8. We tested both EU868 and US915 versions of the LPS8 to see if that had any impact but both got stuck in the Join loop. It seems that the LPS8 doesn't actually send the downlink. Official Helium gateway seems to work fine. We sometimes see this in the log. But it actually has the region parameters.
Wed Jul 27 08:44:13 2022 daemon.info helium_gateway[9253]: WARN ignoring downlink, no region params, module: gateway
We will try reverting to older FW of the Dragino to see if that solves anything.Try asserting a location for the hotspot with that error? Downlinks need to know the region parameters to regulate maximum tx power. This requires the hotspot to be location asserted
I did what you suggested the gateway shows asserted in my wallet hotspot list. But now I get this error when a JOIN REQ is received. I'm on "helium_gateway 1.0.0-alpha.29" and Dragino LPS8 firmware "lgw-5.4.1655274924"
Fri Jul 29 07:53:53 2022 daemon.info fwd[10936]: PKTUP~ [server] JSON: {"rxpk":[{"jver":1,"tmst":368878588,"chan":3,"rfch":0,"freq":904.500000,"mid":0,"stat":1,"modu":"LORA","datr":"SF10BW125","codr":"4/5","rssis":-40,"lsnr":8.8,"foff":0,"rssi":-40,"size":23,"data":"APxFDBfAYxGkVgAAAMNjEaQBBAyS0rw="}]} Fri Jul 29 07:53:53 2022 daemon.info fwd[10936]: INFO~ [server2-up] received packages from mote: 00000000 (fcnt=0) Fri Jul 29 07:53:53 2022 daemon.info fwd[10936]: PKTUP~ [server2] JSON: {"rxpk":[{"jver":1,"tmst":368878588,"chan":3,"rfch":0,"freq":904.500000,"mid":0,"stat":1,"modu":"LORA","datr":"SF10BW125","codr":"4/5","rssis":-40,"lsnr":8.8,"foff":0,"rssi":-40,"size":23,"data":"APxFDBfAYxGkVgAAAMNjEaQBBAyS0rw="}]} Fri Jul 29 07:53:53 2022 daemon.info helium_gateway[11006]: INFO uplink @368878588 us, 904.50 MHz, Ok(DataRate(SF10, BW125)), snr: 8.8, rssi: -40, len: 23 from A8:40:41:20:39:24:41:50, module: gateway Fri Jul 29 07:53:53 2022 daemon.info fwd[10936]: INFO~ [server-up] PUSH_ACK received in 7 ms Fri Jul 29 07:53:53 2022 daemon.info fwd[10936]: JOIN_REQ:{"Size":23, "Rssi":-40, "snr":9, "AppEUI":"A41163C0170C45FC", "DevEUI":"A41163C300000056"} Fri Jul 29 07:53:54 2022 daemon.info fwd[10936]: INFO~ [server-down] PULL_ACK received in 0 ms Fri Jul 29 07:53:54 2022 daemon.info helium_gateway[11006]: WARN ignoring failed uplink Service(Rpc(Status { code: Unknown, message: "no response", metadata: MetadataMap { headers: {"user-agent": "grpc-erlang/0.1.0", "content-type": "application/grpc+proto", "grpc-encoding": "identity"} }, source: None })), oui: 12, uri: http://54.219.236.122:8080/, pubkey: 112ewJNEUfSg3Jvo276tMjzFC2JzmmZcJJ32CWz2fzYqbyCMMTe1, module: router
@RTK221B We are going to test with a new release of Router that will send errors back to gateway RS
@RTK221B making progress, validator component included in release, and now team's working on gateway-rs update.
@RTK221B latest gateway-rs release was released to manufacturers yesterday: https://engineering.helium.com/2022/08/18/miner-hotspot-release.html.
specifically this pr: https://github.com/helium/gateway-rs/pull/285
It typically takes a few days for manufacturers to push out the update, next week please start to reach back out folks with join loop concerns to confirm issues addresed.
Assume latest release addressed this...
still having that same problem
i changed the rx1delay to 3 seconds (was 5) and it seems to work better now. will need to test this further however
Changing RX windows won’t change joining. The Join windows are fixed in the spec.
yeah it seems after it has joined a few times it works directly after resetting the device. however if i try it the next day it first runs into the loop issue again before joining after 1 minute or so
I can visually see the device sending the join request (indicated by a rgb led), then helium sending the accept call back and ~ 2 seconds later the join window 1 closes on the device (indicated by a rgb led).
btw i am using heltec cubecell dev board ab-01 with latest firmware. device spec for 868 region would be this: https://github.com/HelTecAutomation/CubeCell-Arduino/blob/master/libraries/LoraWan102/src/loramac/region/RegionEU868.h
i am using a public hotspot so i don't know what specs that has. however this is its url: https://explorer.helium.com/hotspots/112QZTkUTzE3btS8rJ4zGrY4nDDvyWeZ19i3DPcF7HUGCpoSCAqS
LoRaWAN EU868 Class A start!
trying to join-start
joining...TX on freq 868500000 Hz at DR 5
TX on freq 868500000 Hz at DR 5
Event : Tx Done
RX on freq 868500000 Hz at DR 5
Event : Rx Timeout
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868300000 Hz at DR 5
TX on freq 868300000 Hz at DR 5
Event : Tx Done
RX on freq 868300000 Hz at DR 5
Event : Rx Error
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868500000 Hz at DR 5
TX on freq 868500000 Hz at DR 5
Event : Tx Done
RX on freq 868500000 Hz at DR 5
Event : Rx Timeout
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868500000 Hz at DR 5
TX on freq 868500000 Hz at DR 5
Event : Tx Done
RX on freq 868500000 Hz at DR 5
Event : Rx Timeout
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868100000 Hz at DR 5
TX on freq 868100000 Hz at DR 5
Event : Tx Done
RX on freq 868100000 Hz at DR 5
Event : Rx Timeout
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868100000 Hz at DR 5
TX on freq 868100000 Hz at DR 5
Event : Tx Done
RX on freq 868100000 Hz at DR 5
Event : Rx Timeout
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868500000 Hz at DR 5
TX on freq 868500000 Hz at DR 5
Event : Tx Done
RX on freq 868500000 Hz at DR 5
Event : Rx Timeout
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868100000 Hz at DR 4
TX on freq 868100000 Hz at DR 4
Event : Tx Done
RX on freq 868100000 Hz at DR 4
Event : Rx Error
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868500000 Hz at DR 5
TX on freq 868500000 Hz at DR 5
Event : Tx Done
RX on freq 868500000 Hz at DR 5
Event : Rx Timeout
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868100000 Hz at DR 5
TX on freq 868100000 Hz at DR 5
Event : Tx Done
RX on freq 868100000 Hz at DR 5
Event : Rx Timeout
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868500000 Hz at DR 5
TX on freq 868500000 Hz at DR 5
Event : Tx Done
RX on freq 868500000 Hz at DR 5
Event : Rx Timeout
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868500000 Hz at DR 5
TX on freq 868500000 Hz at DR 5
Event : Tx Done
RX on freq 868500000 Hz at DR 5
Event : Rx Timeout
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868300000 Hz at DR 5
TX on freq 868300000 Hz at DR 5
Event : Tx Done
RX on freq 868300000 Hz at DR 5
Event : Rx Timeout
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868300000 Hz at DR 5
TX on freq 868300000 Hz at DR 5
Event : Tx Done
RX on freq 868300000 Hz at DR 5
Event : Rx Timeout
RX on freq 869525000 Hz at DR 0
Event : Rx Timeout
TX on freq 868300000 Hz at DR 5
TX on freq 868300000 Hz at DR 5
Event : Tx Done
RX on freq 868300000 Hz at DR 5
Event : Rx Done
setup
joined
now testing whether activating the CF-List option helps for region 868 https://github.com/helium/router/issues/858 need to test tomorrow, as i said, once the device joined a few times it works. i need it to rest and try tomorrow again. should it directly join it is a strong indicator that the underlying problem is actually the problem described in the ticket
I can visually see the device sending the join request (indicated by a rgb led), then helium sending the accept call back and ~ 2 seconds later the join window 1 closes on the device (indicated by a rgb led).
This should not be the case. JOIN_ACCEPT_DELAY1 is always 5s, and JOIN_ACCEPT_DELAY2 is always 6s.
From these logs
TX on freq 868300000 Hz at DR 5 TX on freq 868300000 Hz at DR 5 Event : Tx Done RX on freq 868300000 Hz at DR 5 Event : Rx Error
It appears the device is sending 2 join request on the same frequency 1 right after the other.
This should not be the case. JOIN_ACCEPT_DELAY1 is always 5s, and JOIN_ACCEPT_DELAY2 is always 6s. Honestly i can be wrong, i just mention what i see on the device. I'm just using the quite popular heltec cubecell board. So you say that in total there SHOULD be 11 seconds at least between join attempts? I also don't see that behaviour on other lora nodes.
The device only is sending one join request, the logs are just showing it two times. i checked and it is always like this for the heltec logs (also checked all logs public available on the internet)
unfortunately the "bug" is still there, even after activating the CF-List. i will continue to inspect but fear there is not much i can do anymore.
Devices not able to join the network
Some devices just go on a long JOIN_REQ and JOIN_ACCEPT loop before they actually join the network. The loop can last a few minutes to many hours. Happens to new devices joining the network as well as to old pre-registered devices. Here are some examples: