helium / router

router combines a LoRaWAN Network Server with an API for console, and provides a proxy to the Helium blockchain
Apache License 2.0

SX126x devices can't join to the network #833

Open tomasz-grabowski-netbulls opened 1 year ago

tomasz-grabowski-netbulls commented 1 year ago

I cannot join our devices with SX1262 chips to the Helium network. The same devices join TTN and ChirpStack without issues. The same devices configured for the EU868 region also join Helium without issues. Logs on the gateway and in the console indicate that there is some communication, but in the device logs I don't see any return communication from the gateway.

We have our own router and console - OUI 38 (but we also cannot join via the Helium-hosted console). Devices connect through two data-only gateways (Tektelic Kona Micro):

Our device firmware uses the Mbed OS stack and the sx126x library. I would also add that devices with SX1261 chips and firmware based on the same sx126x library join the Helium network in the EU/US/AU regions without issues.

Device log: device_log

Gateway log: gateway_log

Console: console_screenshot

tomasz-grabowski-netbulls commented 1 year ago

After further investigation, it looks like we have a problem with all devices based on SX126x chips, including in the EU868 region. Devices that worked just a few weeks ago now cannot join. We didn't change anything in our firmware.

Maybe it is related somehow to this issue https://github.com/helium/router/issues/818?

jdgemm commented 1 year ago

Thanks for providing this information. We are able to replicate the issue and are working on a fix. This is high priority for the team.

jdgemm commented 1 year ago

A new hotspot firmware (2022.08.02.0) was released yesterday along with a fix for this issue. Once your hotspot firmware is updated, please check whether it resolves your issue. Thanks!

tomasz-grabowski-netbulls commented 1 year ago

No, still no luck.

I've tested with two EU868 hotspots:

Bobcat - genuine-orchid-penguin
DIY alpha - flaky-tortilla-skunk - x86 + Tektelic Kona Micro GW

In the Console I see that elegant-pine-wombat sometimes hears our devices, but it doesn't change anything. They still can't join.

jdgemm commented 1 year ago

Did you verify they were running the latest firmware? It takes time for manufacturers to upgrade their fleet.

tomasz-grabowski-netbulls commented 1 year ago

Yes. Both miners have the latest software (2022.08.02.0_GA).

yosensi commented 1 year ago

I just wanted to bump this issue. Our nodes still have big problems joining the network. For the last 2 days we were testing many nodes (US915) with a data-only hotspot running the newest gateway-rs release (Bent Shadow Barbel - 13DbLDgruSUwYtj197FG8HJW3LeDDpSSioNHbjYSkFwM8t5evJo). Sometimes they join, but very rarely, and only after many hours of a Join Request/Join Accept loop.

At the same time, there is no problem joining the TTN and ChirpStack network servers.

ke6jjj commented 1 year ago

From the logs you've posted, @grabtom, it looks like you're using neither the default Semtech nor the Helium lora_pkt_fwd service to connect to your concentrator. Is that correct? If so, can you tell me what you are using instead?

Also, can you provide more detail about the two log pictures you've posted? For each picture:

tomasz-grabowski-netbulls commented 1 year ago

Hi. Just to be clear, the message from @yosensi was from me (I just forgot to relog on github).

ke6jjj commented 1 year ago

Again, @grabtom, please answer one at a time.

The picture labelled "gateway.log": please tell me all of the above for this picture.

The picture labelled "device.log": please tell me all of the above for this picture.

You state that you were doing US915 testing, but your gateway logs show that your gateway is passing uplinks from frequencies that are not a part of the Helium US915 set. We need to understand how you've configured the packet forwarder -- the channels don't look correct to me.

tomasz-grabowski-netbulls commented 1 year ago

Okay, sorry. I will try to explain. When I reported the issue a month ago, we were testing our devices on both US915 and AU915 frequencies. I gave the addresses of both gateways in the message.

Log images in the first message were taken while testing AU915 frequency.

gateway.log: The log comes from the data-only hotspot Melted Quartz Meerkat (12zDAeurUkXQA2avG3bL7rY34DWN8GVSfxGso4o5wrsPn1rDWp9). This is a Tektelic Kona Micro gateway with factory firmware and the gateway-rs package from GitHub.

Packet forwarder configuration:

{
    "SX1301_array_conf":[
        {
        "board_freq_band": "AU915",
        "SX1301_conf":[
        {
            "chip_enable": true,
            "chan_multiSF_0": { "chan_rx_freq": 916800000, "spread_factor": "7-10" },
            "chan_multiSF_1": { "chan_rx_freq": 917000000, "spread_factor": "7-10" },
            "chan_multiSF_2": { "chan_rx_freq": 917200000, "spread_factor": "7-10" },
            "chan_multiSF_3": { "chan_rx_freq": 917400000, "spread_factor": "7-10" },
            "chan_multiSF_4": { "chan_rx_freq": 917600000, "spread_factor": "7-10" },
            "chan_multiSF_5": { "chan_rx_freq": 917800000, "spread_factor": "7-10" },
            "chan_multiSF_6": { "chan_rx_freq": 918000000, "spread_factor": "7-10" },
            "chan_multiSF_7": { "chan_rx_freq": 918200000, "spread_factor": "7-10" },
            "chan_LoRa_std" : { "chan_rx_freq": 917500000, "bandwidth": 500000, "spread_factor": 8 },
            "chan_FSK"      : { "chan_rx_freq": 917500000, "bandwidth": 250000, "bit_rate": 100000 }
        }
        ],
        "lbt_enabled":false,
        "lbt_threshold":-80,
        "loramac_public":true
        }
    ],
    "gateway_conf": {
        "gateway_ID": "647fdafffe00833f",
        "server_address": "localhost",
        "serv_port_up": 1680,
        "serv_port_down": 1680,
        "keepalive_interval": 10,
        "stat_interval": 30,
        "push_timeout_ms": 100,
        "forward_crc_valid": true,
        "forward_crc_error": false,
        "forward_crc_disabled": false
    }
}

device.log: The log comes from our device configured for the AU915 frequencies. It seems to me that the frequencies in both pictures match.
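
For reference, the receive frequencies in the Tektelic configuration above can be mapped to AU915 uplink channel numbers. The following is only a minimal Python sketch and is not from the thread: it assumes the AU915 uplink plan from the LoRaWAN Regional Parameters (125 kHz channels 0-63 starting at 915.2 MHz in 200 kHz steps, 500 kHz channels 64-71 starting at 915.9 MHz in 1.6 MHz steps), and the function name `au915_uplink_channel` is hypothetical.

```python
from typing import Optional


def au915_uplink_channel(freq_hz: int, bandwidth_hz: int = 125_000) -> Optional[int]:
    """Return the AU915 uplink channel index for a frequency, or None if it is off-plan.

    Assumed plan (LoRaWAN Regional Parameters, AU915-928):
      channels 0-63:  125 kHz, 915.2 MHz + 0.2 MHz * n
      channels 64-71: 500 kHz, 915.9 MHz + 1.6 MHz * (n - 64)
    """
    if bandwidth_hz == 125_000:
        base, step, first, count = 915_200_000, 200_000, 0, 64
    elif bandwidth_hz == 500_000:
        base, step, first, count = 915_900_000, 1_600_000, 64, 8
    else:
        return None
    offset = freq_hz - base
    if offset < 0 or offset % step != 0:
        return None
    index = offset // step
    return first + index if index < count else None


# chan_multiSF_0..7 and chan_LoRa_std from the configuration above (Hz).
multi_sf_freqs = [916_800_000 + 200_000 * i for i in range(8)]
multi_sf_channels = [au915_uplink_channel(f) for f in multi_sf_freqs]
std_channel = au915_uplink_channel(917_500_000, bandwidth_hz=500_000)

print(multi_sf_channels)  # [8, 9, 10, 11, 12, 13, 14, 15]
print(std_channel)        # 65
```

Under these assumptions the configured receive channels are 8-15 plus 65, so the uplink side of the configuration lines up with a single AU915 sub-band. The sketch says nothing about transmit (downlink) settings.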

ke6jjj commented 1 year ago

Thank you! This is very helpful.

ke6jjj commented 1 year ago

Considering the Hotspot Melted Quartz Meerkat (12zDAeurUkXQA2avG3bL7rY34DWN8GVSfxGso4o5wrsPn1rDWp9), which is running gateway-rs, and its Tektelic Kona Micro gateway, the packet forwarder configuration you've given appears to be incomplete.

I can't find anything that tells the gateway which transmit frequencies to use. Is the JSON you've attached the entire JSON? I'm wondering if your gateway can't send out the join ACCEPT response because it can't transmit at all. I say this because I notice that gateway.log has this error in it:

WARN ignoring rx1 downlink error: Ack(InvalidTransmitFrequency), module: gateway

tomasz-grabowski-netbulls commented 1 year ago

This is the whole packet forwarder configuration, and it works when I use it with ChirpStack, so I assume it is not a gateway problem. I wrote in later messages that we also have problems with EU868 and US915. We have regular EU868 hotspots here, so I don't think it's a problem with incorrect gateway configuration.

If you want, I can also run tests on US915 frequencies using RAK2287, Raspberry PI, helium packet forwarder and collect logs again.

ke6jjj commented 1 year ago

There's clearly some sort of interaction that's incompatible here, not necessarily the fault of the gateway itself. I'd like to get to the bottom of that. So, to that aim, is there more to the configuration of the gateway?

tomasz-grabowski-netbulls commented 1 year ago

No, there isn't. It is the only packet forwarder configuration file on that gateway.

ke6jjj commented 1 year ago

Alright. Is there any other Hotspot you have access to that can give you detailed logs like the one you used for "gateway.log"? It would be incredibly helpful to see the output of lora_pkt_fwd from an sx1301/1302 concentrator and the gateway-rs logs at the same time.

tomasz-grabowski-netbulls commented 1 year ago

Hi there. I'm sending a bunch of logs.

console-event-debug.json.txt - debug JSON from our Console
device.log - log from the LoRa device
helium_gateway.log - gateway-rs (https://github.com/helium/gateway-rs)
lora_pkt_fwd.log - LoRa packet forwarder (https://github.com/helium/sx1302_hal/)

The packet forwarder and gateway-rs are running on a Raspberry Pi 3 with a RAK2287 concentrator. Packet forwarder config:

{
    "SX130x_conf": {
        "spidev_path": "/dev/spidev0.0",
        "lorawan_public": true,
        "clksrc": 0,
        "antenna_gain": 0, /* antenna gain, in dBi */
        "full_duplex": false,
        "precision_timestamp": {
            "enable": false,
            "max_ts_metrics": 255,
            "nb_symbols": 1
        },
        "radio_0": {
            "enable": true,
            "type": "SX1250",
            "freq": 904300000,
            "rssi_offset": -215.4,
            "rssi_tcomp": {"coeff_a": 0, "coeff_b": 0, "coeff_c": 20.41, "coeff_d": 2162.56, "coeff_e": 0},
            "tx_enable": true,
            "tx_freq_min": 902000000,
            "tx_freq_max": 928000000,
            "tx_gain_lut":[
                {"rf_power": 12, "pa_gain": 1, "pwr_idx": 4},
                {"rf_power": 13, "pa_gain": 1, "pwr_idx": 5},
                {"rf_power": 14, "pa_gain": 1, "pwr_idx": 6},
                {"rf_power": 15, "pa_gain": 1, "pwr_idx": 7},
                {"rf_power": 16, "pa_gain": 1, "pwr_idx": 8},
                {"rf_power": 17, "pa_gain": 1, "pwr_idx": 9},
                {"rf_power": 18, "pa_gain": 1, "pwr_idx": 10},
                {"rf_power": 19, "pa_gain": 1, "pwr_idx": 11},
                {"rf_power": 20, "pa_gain": 1, "pwr_idx": 12},
                {"rf_power": 21, "pa_gain": 1, "pwr_idx": 13},
                {"rf_power": 22, "pa_gain": 1, "pwr_idx": 14},
                {"rf_power": 23, "pa_gain": 1, "pwr_idx": 15},
                {"rf_power": 24, "pa_gain": 1, "pwr_idx": 16},
                {"rf_power": 25, "pa_gain": 1, "pwr_idx": 17},
                {"rf_power": 26, "pa_gain": 1, "pwr_idx": 19},
                {"rf_power": 27, "pa_gain": 1, "pwr_idx": 20}
            ]
        },
        "radio_1": {
            "enable": true,
            "type": "SX1250",
            "freq": 905000000,
            "rssi_offset": -215.4,
            "rssi_tcomp": {"coeff_a": 0, "coeff_b": 0, "coeff_c": 20.41, "coeff_d": 2162.56, "coeff_e": 0},
            "tx_enable": false
        },
        "chan_multiSF_0": {
            /* Channel 8, 903.900 Mhz */
            "enable": true,
            "radio": 0,
            "if": -400000
        },
        "chan_multiSF_1": {
            /* Channel 9, 904.100 Mhz */
            "enable": true,
            "radio": 0,
            "if": -200000
        },
        "chan_multiSF_2": {
            /* Channel 10, 904.300 Mhz */
            "enable": true,
            "radio": 0,
            "if": 0
        },
        "chan_multiSF_3": {
            /* Channel 11, 904.500 Mhz */
            "enable": true,
            "radio": 0,
            "if": 200000
        },
        "chan_multiSF_4": {
            /* Channel 12, 904.700 Mhz */
            "enable": true,
            "radio": 1,
            "if": -300000
        },
        "chan_multiSF_5": {
            /* Channel 13, 904.900 Mhz */
            "enable": true,
            "radio": 1,
            "if": -100000
        },
        "chan_multiSF_6": {
            /* Channel 14, 905.100 Mhz */
            "enable": true,
            "radio": 1,
            "if": 100000
        },
        "chan_multiSF_7": {
            /* Channel 15, 905.300 Mhz */
            "enable": true,
            "radio": 1,
            "if": 300000
        },
        "chan_Lora_std": {
            /* Channel 65 (fat channel), 912.6 Mhz */
            "enable": true,
            "radio": 0,
            "if": 300000,
            "bandwidth": 500000,
            "spread_factor": 8
        },
        "chan_FSK": {
            /* disabled */
            "enable": false,
            "radio": 0,
            "if": 300000,
            "bandwidth": 250000,
            "datarate": 100000
        }
    },
    "gateway_conf": {
        "gateway_ID": "AA555A00000000AA",
        /* change with default server address/ports */
        "server_address": "127.0.0.1",
        "serv_port_up": 1680,
        "serv_port_down": 1680,
        /* adjust the following parameters for your network */
        "keepalive_interval": 10,
        "stat_interval": 10,
        "push_timeout_ms": 100,
        /* forward only valid packets */
        "forward_crc_valid": true,
        "forward_crc_error": false,
        "forward_crc_disabled": false,
        /* GPS configuration */
        "gps_tty_path": "/dev/ttyAMA0",
        /* GPS reference coordinates */
        "ref_latitude": 0.0,
        "ref_longitude": 0.0,
        "ref_altitude": 0
    }
}

ke6jjj commented 1 year ago

@grabtom this is a really good start. Thank you! Is console-event-debug.json.txt sorted in reverse chronological order?

ke6jjj commented 1 year ago

I think we may be getting closer to the heart of the matter. In at least one instance, the console is not sending the JOIN_ACCEPT to your device at the same frequency it appears to be listening on.

In the example below, we see that your device sent an initial JOIN on channel 65 (frequency 904.6 MHz), using data rate 4 (SF8BW500), and then proceeded to listen for RX1 on 923.900 MHz. The console logs, however, show that the console transmitted the JOIN ACCEPT on 925.100 MHz.

Device device.log:15

[2022-09-01 14:28:54]  [DBG ][LMAC]: TX: Channel=65, TX DR=4, RX1 DR=13
[2022-09-01 14:28:56]  [DBG ][LMAC]: RX1 slot open, Freq = 923900000
[2022-09-01 14:28:59]  [DBG ][LMAC]: RX2 slot open, Freq = 923300000
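
The frequencies in the device log excerpt above are consistent with the standard US915 channel plan. The following is a minimal Python sketch, not code from the router or the device firmware. It assumes the US915 plan from the LoRaWAN Regional Parameters: 125 kHz uplink channels 0-63 from 902.3 MHz in 200 kHz steps, 500 kHz uplink channels 64-71 from 903.0 MHz in 1.6 MHz steps, eight downlink channels from 923.3 MHz in 600 kHz steps, and RX1 on downlink channel (uplink channel mod 8). The function names are hypothetical.

```python
def us915_uplink_freq_hz(channel: int) -> int:
    """Uplink center frequency for a US915 channel index (assumed standard plan)."""
    if 0 <= channel <= 63:
        return 902_300_000 + 200_000 * channel          # 125 kHz channels
    if 64 <= channel <= 71:
        return 903_000_000 + 1_600_000 * (channel - 64)  # 500 kHz channels
    raise ValueError(f"not a US915 uplink channel: {channel}")


def us915_rx1_freq_hz(uplink_channel: int) -> int:
    """RX1 downlink frequency: downlink channel index is uplink channel mod 8."""
    if not 0 <= uplink_channel <= 71:
        raise ValueError(f"not a US915 uplink channel: {uplink_channel}")
    return 923_300_000 + 600_000 * (uplink_channel % 8)


uplink_hz = us915_uplink_freq_hz(65)
rx1_hz = us915_rx1_freq_hz(65)

print(uplink_hz)  # 904600000
print(rx1_hz)     # 923900000

# Downlink channel index implied by the 925.1 MHz seen in the console logs.
console_downlink_index = (925_100_000 - 923_300_000) // 600_000
print(console_downlink_index)  # 3
```

Under these assumptions, an uplink on channel 65 is answered in RX1 on downlink channel 1 (923.9 MHz), which is where the device listened, while 925.1 MHz corresponds to downlink channel 3.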

Console console-event-debug.json.txt:2246

{
    "category": "join_accept",
    "data": {
      "devaddr": "38090048",
      "fcnt": 0,
      "hotspot": {
        "channel": 8,
        "frequency": 925.1,
        "id": "13DbLDgruSUwYtj197FG8HJW3LeDDpSSioNHbjYSkFwM8t5evJo",
        "lat": 42.4738454813747,
        "long": -73.81504609626082,
        "name": "bent-shadow-barbel",
        "rssi": 30,
        "snr": 0,
        "spreading": "SF7BW500"
      },

tomasz-grabowski-netbulls commented 1 year ago

Good point. Any ideas why the Console is sending the JOIN ACCEPT at the wrong frequency? Anyway, I see that every second JOIN ACCEPT matches the frequency the device is listening on.

Any thoughts?

mikev commented 1 year ago

We do have a correction pending in our LoRaWAN library for US915 which will likely fix this issue. Link to the modified US915: https://github.com/helium/erlang-lorawan/pull/21

The config for AU915 appears correct. Can somebody confirm whether this same issue exists for AU915 or not?

This is the comment which led to the issue - https://github.com/helium/sx1302_hal/pull/31

The US915 500 kHz channel is actually 904.6 MHz, as determined by examining the firmware.

The math is: radio_0 "freq" + chan_Lora_std "if" = 904300000 Hz + 300000 Hz = 904600000 Hz (904.6 MHz)
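
The same arithmetic can be applied to every channel in the RAK2287 packet forwarder config posted earlier in this thread. This is a minimal Python sketch with hypothetical names (`RADIO_FREQ_HZ`, `CHANNEL_IF_HZ`, `channel_freq_hz`); the numbers are copied from that config, and the only assumption is that a channel's center frequency is its radio's "freq" plus the channel's "if" offset.

```python
# Radio center frequencies (Hz) from "radio_0" and "radio_1" in the posted config.
RADIO_FREQ_HZ = {0: 904_300_000, 1: 905_000_000}

# (radio index, "if" offset in Hz) for each enabled channel in the posted config.
CHANNEL_IF_HZ = {
    "chan_multiSF_0": (0, -400_000),
    "chan_multiSF_1": (0, -200_000),
    "chan_multiSF_2": (0, 0),
    "chan_multiSF_3": (0, 200_000),
    "chan_multiSF_4": (1, -300_000),
    "chan_multiSF_5": (1, -100_000),
    "chan_multiSF_6": (1, 100_000),
    "chan_multiSF_7": (1, 300_000),
    "chan_Lora_std": (0, 300_000),
}


def channel_freq_hz(radio: int, if_hz: int) -> int:
    """Channel center frequency = radio center frequency + IF offset."""
    return RADIO_FREQ_HZ[radio] + if_hz


channel_freqs = {
    name: channel_freq_hz(radio, if_hz)
    for name, (radio, if_hz) in CHANNEL_IF_HZ.items()
}

for name, freq in channel_freqs.items():
    print(f"{name}: {freq / 1e6:.1f} MHz")
```

The multi-SF channels come out at 903.9 to 905.3 MHz, matching the comments in the config, while chan_Lora_std comes out at 904.6 MHz rather than the 912.6 MHz stated in its comment.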

mikev commented 1 year ago

Reference to Library update PR https://github.com/helium/router/pull/862

lketchersid commented 1 year ago

We seem to be having a similar problem with two new sensors - by similar I mean the JOIN_ACCEPT frequency doesn't match the JOIN_REQUEST frequency. I've attached the JSON from the console (VIP console v2.2.23): JoinAcceptSoilevent-debug.json.txt, JoinRequestSoilevent-debug.json.txt

mikev commented 1 year ago

@lketchersid - For most plans, but especially AU915 and US915, we would never expect the Join Request frequency to match the Join Accept frequency. We would expect them to be different.