sparkfun / SparkFun_LoRaSerial

A simple to use radio modem for long distances using LoRa.
https://docs.sparkfun.com/SparkFun_LoRaSerial/

Comm link too unstable for RTK use. #587

Open tonycanike opened 8 months ago

tonycanike commented 8 months ago

My LoRaSerials are unusable for RTK work. The latency (age) of my RTK solution, as reported by the Facets, is very unstable when using the LoRaSerial radios. It will work for a few minutes, then the comm link appears to go down, the latency climbs, and I lose RTK fix. Then the comm link returns, I get an RTK fixed solution, things are good for a while, and then it happens all over again.

I do not have this problem with my Holybros or my RFD900x radios. They work perfectly with my two Facets in base rover RTK mode.

Here is my full problem description: https://forum.sparkfun.com/viewtopic.php?f=116&t=60898

Multiple other users seem to be reporting this issue also.

viewtopic.php?f=116&t=60609 viewtopic.php?f=117&t=60671 viewtopic.php?f=117&t=60673 viewtopic.php?f=117&t=60872

nseidle commented 8 months ago

Tony's configuration:

Base/Transmitter/Server

ATR
AT-AirSpeed=0
AT-AutoTune=0
AT-Bandwidth=500.00
AT-ClientFindPartnerRetryInterval=3
AT-CodingRate=7
AT-DataScrambling=0
AT-EnableCRC16=1
AT-EncryptData=1
AT-EncryptionKey=xxx
AT-FramesToYield=3
AT-FrequencyHop=1
AT-FrequencyMax=928.000
AT-FrequencyMin=902.000
AT-HeartBeatTimeout=5000
AT-MaxDwellTime=400
AT-MaxResends=0
AT-NetID=192
AT-NumberOfChannels=50
AT-OperatingMode=0
AT-OverHeadtime=10
AT-PreambleLength=8
AT-SelectLedUse=4
AT-Server=1
AT-SpreadFactor=8
AT-SyncWord=18
AT-TrainingKey=xxx
AT-TrainingTimeout=1
AT-TxPower=30
AT-TxToRxUsec=280
AT-VerifyRxNetID=1

ATS
AT-CopySerial=0
AT-Echo=0
AT-FlowControl=0
AT-InvertCts=0
AT-InvertRts=0
AT-RTSOffBytes=32
AT-RTSOnBytes=256
AT-SerialDelay=50
AT-SerialSpeed=57600
AT-UsbSerialWait=0
OK

Receiver/Rover

ATR
AT-AirSpeed=0
AT-AutoTune=0
AT-Bandwidth=500.00
AT-ClientFindPartnerRetryInterval=3
AT-CodingRate=7
AT-DataScrambling=0
AT-EnableCRC16=1
AT-EncryptData=1
AT-EncryptionKey=xxxx
AT-FramesToYield=3
AT-FrequencyHop=1
AT-FrequencyMax=928.000
AT-FrequencyMin=902.000
AT-HeartBeatTimeout=5000
AT-MaxDwellTime=400
AT-MaxResends=0
AT-NetID=192
AT-NumberOfChannels=50
AT-OperatingMode=0
AT-OverHeadtime=10
AT-PreambleLength=8
AT-SelectLedUse=4
AT-Server=0
AT-SpreadFactor=8
AT-SyncWord=18
AT-TrainingKey=xxxx
AT-TrainingTimeout=1
AT-TxPower=30
AT-TxToRxUsec=280
AT-VerifyRxNetID=1
OK

ats
ATS
AT-CopySerial=0
AT-Echo=0
AT-FlowControl=0
AT-InvertCts=0
AT-InvertRts=0
AT-RTSOffBytes=32
AT-RTSOnBytes=256
AT-SerialDelay=50
AT-SerialSpeed=57600
AT-UsbSerialWait=0

My radios were in multipoint mode (this makes the most sense for the continuous stream of updated RTK data).

nseidle commented 8 months ago

I had SerialTest send a 100-character string every 500ms to my "server" LoRaSerial radio, and I monitored the output of the other radio with either Teraterm or another instance of SerialTest. The 4 green signal strength LEDs were illuminated on the receiving radio, I think 3 or 4 on the server/transmitting radio.

The yellow LED on the server radio would flash with every string transmitted (every 500ms), and the blue LED on the other radios would flash with every string received (every 500ms). Sometimes, often after 4 or so minutes, the reception would stop, the data wasn't displayed in TeraTerm, the blue LED stopped flashing. Nothing would happen for about 30 seconds. Then the green LEDs would blink on the receiving radio, and everything would start working fine again. Sometimes this would happen again every 3-4 minutes, and sometimes it would not.

It's like there's some bug and the receiving radio loses sync with the transmitting radio, and it has to restart/resync. The transmitting radio shows no apparent anomalous behavior - that yellow LED just keeps blinking every 500ms.
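
One way to put numbers on dropouts like these is to log the arrival time of each received string and flag any gap longer than a threshold. The sketch below is only an illustration of that idea; the `Outage` struct, the `findOutages` name, and the millisecond timestamp log are assumptions, not part of SerialTest, TeraTerm, or the LoRaSerial firmware.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One detected outage (hypothetical type for this sketch).
struct Outage {
    uint32_t startMs;     // arrival time of the last frame before the gap
    uint32_t durationMs;  // time until the next frame arrived
};

// Flag every gap between consecutive arrivals that exceeds thresholdMs.
// arrivalsMs is assumed to be sorted ascending, one entry per received string.
std::vector<Outage> findOutages(const std::vector<uint32_t>& arrivalsMs,
                                uint32_t thresholdMs) {
    std::vector<Outage> outages;
    for (std::size_t i = 1; i < arrivalsMs.size(); ++i) {
        uint32_t gap = arrivalsMs[i] - arrivalsMs[i - 1];
        if (gap > thresholdMs) {
            outages.push_back({arrivalsMs[i - 1], gap});
        }
    }
    return outages;
}
```

With a 500 ms send interval, a threshold of a couple of seconds would ignore an occasional lost string but still catch the roughly 30-second stalls described above.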

nseidle commented 8 months ago

Thanks for reporting!

after 4 or so minutes, the reception would stop,

It sounds as if the units are getting out of sync. In MP mode, the server is transmitting a clock sync but if the client misses the clock sync multiple times, it will eventually get off frequency and the link will go down. We have the time delay of the client realizing it's truly out of sync, and the time delay where the server has to come back around through the hop table. If the client misses the clock sync, it has another wait for the server to come around again. We have a few mitigations in place to reduce this time: the client will enter a discover_scan mode where it actively pings the hop table but this can come back negative if the server is actively transmitting when the client is pinging.

P2P doesn't have this time delay. Because they are expecting to regularly hear from each other, the desync and sync times are much shorter.

Point-To-Point and Multipoint are very different beasts. Are you seeing similar issues with Point-To-Point?

cturvey commented 8 months ago

Point-To-Point and Multipoint are very different beasts.

Indeed, skimming thru this there doesn't look to be a method where one unit establishes itself as a master station, so they all potentially throw DATAGRAM_SYNC_CLOCKS packets at each other, or back-n-forth

cturvey commented 8 months ago

In the One-to-Many situation, we really need to establish one unit as the primary station, driving the hop time and pattern. The primary should be the only one broadcasting DATAGRAM_SYNC_CLOCKS, and ideally this should communicate the time to next hop, and where it's going to go. The rest of the stations need to synchronize to this, and not be sending their own DATAGRAM_SYNC_CLOCKS, as this will just result in chaos: there's no indication of who's in charge, some of the stations will not be in range of each other, and they are apt to be synchronized by the periodicity of the messaging from the GPS receivers.

Perhaps some way to identify who is sending sync messages, and some level of precedence. Data from GPS/GNSS should allow for time domain synchronization around multiple units

tonycanike commented 8 months ago

@nseidle Nathan, thanks for jumping on this.

I don't have a solid answer on your P2P vs. MP question. I'm geographically distant from my equipment right now and hope to work on this more mid-February.

I tried Point-to-Point once when I first got the LoRaSerial radios, but I've been focusing on MP as I believe it makes better sense for the RTK use case. If data is lost forgetaboutit, as updated data will be sent in the next second.

I experienced the MP issues with the two radios within 5 feet of each other on my workbench.

@cturvey I do configure the one radio at my RTK base to be the "server", and the docs say only the server is transmitting the sync heartbeats. I haven't looked at the code though.
https://docs.sparkfun.com/SparkFun_LoRaSerial/operating_modes/

Tony.

cturvey commented 8 months ago

Mostly just skimmed the source doing a quick static-analysis as to what looks to be going on, and where the sync packets are transmitted and where received. Unpacked the hopping somewhat, but suppose if it misses a hop it's going to have to wait until it cycles around. I'd need to get some units to do dynamic analysis and review debug output side-by-side, or back-port onto the DISCO / Murata platform

nseidle commented 8 months ago

Hi @cturvey - I welcome the analysis and help. I can send you hardware if desired, just say the word and I'll PM you for an address.

cturvey commented 8 months ago

TBH the logic as I'm unpacking it suggests that both ends schedule transmission, the server doesn't act on reception

https://github.com/sparkfun/SparkFun_LoRaSerial/blob/62aa9cc391a47dd5f0f9b08da599cfd0d7820ac1/Firmware/LoRaSerial/States.ino#L175

j-w-bullfrog commented 8 months ago

Yes, Point-to-Point has the same issues. I haven't seen any performance difference between the two modes. I also tested with the radios 1000 ft, 10 ft, and 3 ft apart so I could watch the LEDs.

cturvey commented 8 months ago

@nseidle Nathan, thanks for the units, which arrived today. Battled the IDE and have it building; needed to regress from RadioLib 6.4.2 back to 5.1.2, but do have closure now. Need to find the LoRaSerial driver now. Will dig in.

Made a .INF to pull in USBSER.SYS https://github.com/cturvey/RandomNinjaChef/blob/main/sparkfun_loraserial.inf

cturvey commented 8 months ago

Not sure how you'd like to do support on this. I've got the build process working on two boxes, one with IDE 1.8.19 and the other with IDE 2.2.1. Everything builds and downloads using the GitHub code (LoRaSerial v2.0?). When I go into AT-Server=1, AT-OperatingMode=0, it seems to trap out very quickly, either a power-on reset or perhaps a watchdog. Do I need to power these externally? Currently I'm just running off USB on an older laptop / powered hub. The initial firmware shipped with the unit didn't reset (USB ding-dong) like this.

cturvey commented 8 months ago

Reverting to the original image shared in the repo. So it's something in the build/library; I'm least confident in WDTZero. Noting the method to push in the original image using the Arduino 2.2.1 IDE / Arduino SAMD Board Package:

"C:\Users\xx\AppData\Local\Arduino15\packages\arduino\tools\bossac\1.7.0-arduino3/bossac.exe" -i -d --port=COM8 -U true -i -e -w -v "C:\SparkFun\LoRaSerial\SparkFun_LoRaSerial_v2_0.bin" -R

cturvey commented 8 months ago

Turning off the watchdog, the last message is "State: MP: Waiting for TX done", then it hangs. Using RadioLib 5.1.2. Using SAMD_TimerInterrupt 1.10.1, will try 1.9.0 ... OK, 1.9.0 is happier, so making some progress.

cturvey commented 8 months ago

@nseidle Likely not the source of the issue, but the logic here is broken allowing corrupt packets to be processed down-stream. Should be || (OR) not && (AND)

https://github.com/sparkfun/SparkFun_LoRaSerial/blob/main/Firmware/LoRaSerial/Radio.ino#L1902

    if ((incomingBuffer[rxDataBytes - 2] != (crc >> 8))
        && (incomingBuffer[rxDataBytes - 1] != (crc & 0xff)))
    {
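
To make the difference concrete, here is a minimal standalone sketch of the two conditions, written outside the firmware. The function names are hypothetical; only the byte layout (high CRC byte, then low CRC byte, at the end of the buffer) is taken from the excerpt above.

```cpp
#include <cstddef>
#include <cstdint>

// Condition as quoted above: the frame is rejected only when BOTH trailing
// bytes differ from the computed CRC, so a single-byte mismatch slips through.
// Callers must pass len >= 2.
bool crcRejectsWithAnd(const uint8_t* buf, std::size_t len, uint16_t crc) {
    return (buf[len - 2] != (crc >> 8)) && (buf[len - 1] != (crc & 0xff));
}

// Suggested condition: reject when EITHER trailing byte differs.
// Callers must pass len >= 2.
bool crcRejectsWithOr(const uint8_t* buf, std::size_t len, uint16_t crc) {
    return (buf[len - 2] != (crc >> 8)) || (buf[len - 1] != (crc & 0xff));
}
```

A frame ending in 0xCA 0x00 checked against a computed CRC of 0xCABE is accepted by the `&&` form, because the high byte happens to match, while the `||` form rejects it.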

I can compile and run code, so walking, adding/enabling instrumentation, and doing some dynamic analysis. Right now not seeing the link dying, functional 10 Hours to this point. Occasional loss/recovery, with the 10 seconds of Blue LED fast flashing.

HEARTBEAT sent on a consistent/repetitive basis (with server ms time-stamping), SYNC_CLOCKS (with channel, in multi-point) on demand, reporting ACK-1 in response to FIND_PARTNER.

Unpacking the to-and-fro of the protocol in my head.

tonycanike commented 8 months ago

Right now not seeing the link dying, functional 10 Hours to this point. Occasional loss/recovery, with the 10 seconds of Blue LED fast flashing

I wonder why that occasional loss/recovery is happening. And if that's the behavior that's causing me problems.

I'll assume a scenario where we're using the well-chosen defaults of the base transmitting data every second and the F9P dropping its RTK solution when it ages out at 60 seconds.

My experience is that in challenging situations it can take 2-4 minutes to reestablish an RTK solution once lost.

And if the radio link is down for 20-30ish seconds, the F9P often doesn't calculate a new RTK solution before the 60 second age out.

If there's a loss of data to the rover for 20-30ish seconds every 4-5 minutes, the system usually doesn't have an RTK fixed solution and it is not usable.

To get back to my question up top, if "occasional loss/recovery" is every 5-10 minutes and "10 seconds" is closer to 20 seconds, the Facets generally won't have a stable RTK solution and they will not be usable.

It's also been my experience that I can not trust RTK Fixed solutions that are not stable. Bad fixes do happen.

This is all based on my memory of experiences back in November and December, so take it with a grain of salt. I won't be geographically able to retest and confirm this for a month or so.

When I am able to test, I'll be very happy to help and test new LoRaSerial firmware. Testing the whole system (Facets and LoRaSerial radios) end-to-end inside on a workbench is not practical for me, so I need to go out in the field.

cturvey commented 8 months ago

Most of my observations are anecdotal at the moment, but the dropping looks to be precipitated by a packet drop-out (CRC) and HEARTBEAT Timeout. The rapid FIND_PARTNER strobing across channels seems ineffective, and it then recovers via the natural cycling of the channel hopping, and a success for a TX: FIND_PARTNER / RX: ACK-1

Somehow I think it should be possible for the Stations to be more predictive of the Server's Hop Channel in the time-domain. In the one-to-many sense I don't think we want to be strafing the band with FIND_PARTNER requests, and instead should be "Scanning for servers" in a more passive sense, either listening for data packets or HEARTBEATs, which are going to be occurring on a somewhat continuous basis.

The most practical way to do RTK is for the Server to just keep broadcasting, perhaps having two LoRaSerial 1W's with perhaps different bands, channels or spreading strategies. The stations could then be less powerful devices.

Or broadcasting a GPS-only RTCM3 subset once every 30 seconds at a prescribed channel. Knowing GPS ToW, one could perhaps align at both ends, and modulo into the channel hop table if paranoid.

@tonycanike in your MULTIPOINT use case are you looking to back-haul position information back to the Server unit?

You can also push the RTK time-out on the ZEDs out. As long as you're not losing carrier lock, some maintained RTK FIXED/FLOAT solution is going to be significantly better than dumping to GNSS/DGNSS.

cturvey commented 8 months ago

Misses HEARTBEAT, recovery via DISCOVERY fails, and waits for HEARTBEAT to cycle

RX: HEARTBEAT
Case #3, 0 Hops, 186 Nxt Hop - 44 (TX + RX) = 142 mSec
State: MP: Wait for TX or RX LinkUptime:     0:02:25
RX: HEARTBEAT
Case #3, 0 Hops, 51 Nxt Hop - 46 (TX + RX) = 5 mSec
State: MP: Wait for TX or RX LinkUptime:     0:02:30
HEARTBEAT Timeout
Lcl: 398, Rmt: 54 - 44 = 10 + 0 = 10 msToNextHop
Lcl: 238, Rmt: 212 - 44 = 168 + 0 = 168 msToNextHop
Lcl: 68, Rmt: 382 - 44 = 338 + 0 = 338 msToNextHop
Lcl: 192, Rmt: 261 - 44 = 217 + 0 = 217 msToNextHop
Lcl: 208, Rmt: 247 - 44 = 203 + 0 = 203 msToNextHop
Lcl: 216, Rmt: 237 - 44 = 193 + 0 = 193 msToNextHop
Lcl: 346, Rmt: 105 - 44 = 61 + 0 = 61 msToNextHop
Lcl: 70, Rmt: 382 - 44 = 338 + 0 = 338 msToNextHop
Lcl: 259, Rmt: 191 - 44 = 147 + 0 = 147 msToNextHop
Lcl: 290, Rmt: 163 - 44 = 119 + 0 = 119 msToNextHop
Lcl: 99, Rmt: 353 - 44 = 309 + 0 = 309 msToNextHop
Lcl: 96, Rmt: 358 - 44 = 314 + 0 = 314 msToNextHop
Lcl: 286, Rmt: 165 - 44 = 121 + 0 = 121 msToNextHop
Lcl: 68, Rmt: 382 - 44 = 338 + 0 = 338 msToNextHop
Lcl: 269, Rmt: 186 - 44 = 142 + 0 = 142 msToNextHop
Lcl: 5, Rmt: 51 - 46 = 5 + 0 = 5 msToNextHop
State: Disc: Setup for scanning
Start scanning
State: Disc: Scanning for servers
MP: SYNC_CLOCKS Timeout
TX: FIND_PARTNER
State: Disc: Wait for FIND_PARTNER to xmit
State: Disc: Scanning for servers
MP: SYNC_CLOCKS Timeout
TX: FIND_PARTNER
State: Disc: Wait for FIND_PARTNER to xmit
State: Disc: Scanning for servers
MP: SYNC_CLOCKS Timeout
TX: FIND_PARTNER
State: Disc: Wait for FIND_PARTNER to xmit
State: Disc: Scanning for servers
MP: SYNC_CLOCKS Timeout
TX: FIND_PARTNER
State: Disc: Wait for FIND_PARTNER to xmit
State: Disc: Scanning for servers
...
State: Disc: Wait for FIND_PARTNER to xmit
State: Disc: Scanning for servers
MP: SYNC_CLOCKS Timeout
TX: FIND_PARTNER
State: Disc: Wait for FIND_PARTNER to xmit
State: Disc: Scanning for servers
MP: SYNC_CLOCKS Timeout
State: Disc: Wait for Server HB
RX: HEARTBEAT
Case #3, 0 Hops, 382 Nxt Hop - 54 (TX + RX) = 328 mSec
    Channel Number: 0
Received HB, leaving DISCOVER standby
State: MP: Wait for TX or RX LinkUptime:     0:00:14
RX: HEARTBEAT
Case #3, 0 Hops, 285 Nxt Hop - 44 (TX + RX) = 241 mSec
State: MP: Wait for TX or RX LinkUptime:     0:00:18
RX: HEARTBEAT
Case #3, 0 Hops, 203 Nxt Hop - 44 (TX + RX) = 159 mSec
State: MP: Wait for TX or RX LinkUptime:     0:00:21
RX: HEARTBEAT
Case #3, 0 Hops, 131 Nxt Hop - 44 (TX + RX) = 87 mSec
State: MP: Wait for TX or RX LinkUptime:     0:00:26
RX: HEARTBEAT

Data CRC, HEARTBEAT timeout, successful quick recovery

Case #3, 0 Hops, 195 Nxt Hop - 44 (TX + RX) = 151 mSec
State: MP: Wait for TX or RX LinkUptime:     0:02:05
RX: HEARTBEAT
Case #2, 1 Hops, 1 Nxt Hop - 44 (TX + RX) + 400 Adj = 357 mSec
State: MP: Wait for TX or RX LinkUptime:     0:02:10
RX: Bad CRC-16, received 0x5204 expected 0xCABE
HEARTBEAT Timeout
Lcl: 338, Rmt: 110 - 44 = 66 + 0 = 66 msToNextHop
Lcl: 373, Rmt: 79 - 44 = 35 + 0 = 35 msToNextHop
Lcl: 350, Rmt: 104 - 44 = 60 + 0 = 60 msToNextHop
Lcl: 114, Rmt: 338 - 44 = 294 + 0 = 294 msToNextHop
Lcl: 95, Rmt: 355 - 44 = 311 + 0 = 311 msToNextHop
Lcl: 66, Rmt: 382 - 44 = 338 + 0 = 338 msToNextHop
Lcl: 276, Rmt: 174 - 44 = 130 + 0 = 130 msToNextHop
Lcl: 321, Rmt: 130 - 44 = 86 + 0 = 86 msToNextHop
Lcl: 55, Rmt: 395 - 44 = 351 + 0 = 351 msToNextHop
Lcl: 276, Rmt: 173 - 44 = 129 + 0 = 129 msToNextHop
Lcl: 295, Rmt: 159 - 44 = 115 + 0 = 115 msToNextHop
Lcl: 70, Rmt: 382 - 44 = 338 + 0 = 338 msToNextHop
Lcl: 189, Rmt: 260 - 44 = 216 + 0 = 216 msToNextHop
Lcl: 18, Rmt: 35 - 44 = -9 + 400 = 391 msToNextHop
Lcl: 258, Rmt: 195 - 44 = 151 + 0 = 151 msToNextHop
Lcl: 54, Rmt: 1 - 44 = -43 + 400 = 357 msToNextHop, timeToHop: 0, Hops: 1
State: Disc: Setup for scanning
Start scanning
State: Disc: Scanning for servers
MP: SYNC_CLOCKS Timeout
TX: FIND_PARTNER
...
State: Disc: Wait for FIND_PARTNER to xmit
State: Disc: Scanning for servers
RX: ACK-1
Case #3, 0 Hops, 290 Nxt Hop - 53 (TX + RX) = 237 mSec
    Channel Number: 26
State: MP: Wait for TX or RX LinkUptime:     0:00:00
RX: HEARTBEAT
Case #3, 0 Hops, 277 Nxt Hop - 44 (TX + RX) = 233 mSec
State: MP: Wait for TX or RX LinkUptime:     0:00:01
RX: HEARTBEAT
Case #3, 0 Hops, 207 Nxt Hop - 44 (TX + RX) = 163 mSec
State: MP: Wait for TX or RX LinkUptime:     0:00:04
RX: HEARTBEAT
Case #3, 0 Hops, 89 Nxt Hop - 44 (TX + RX) = 45 mSec
State: MP: Wait for TX or RX LinkUptime:     0:00:07
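
The "msToNextHop" lines in these logs appear to follow simple arithmetic: take the remote unit's time to its next hop, subtract the TX + RX transit time, and if the result is negative add one dwell period and count a hop. The sketch below reproduces that arithmetic as inferred from the log lines only; it is not taken from the firmware source, and the names `HopEstimate` and `estimateNextHop` are hypothetical.

```cpp
#include <cstdint>

// Result of the estimate (hypothetical type for this sketch).
struct HopEstimate {
    int32_t msToNextHop;  // local time remaining until the next hop
    uint32_t hops;        // dwell periods that elapsed while the frame was in flight
};

// remoteMsToHop: time to next hop reported by the sender ("Rmt" / "Nxt Hop")
// txRxMs:        transmit + receive time for the frame (44 to 54 in the logs)
// dwellMs:       channel dwell time (400 in the logs, matching AT-MaxDwellTime)
HopEstimate estimateNextHop(int32_t remoteMsToHop, int32_t txRxMs, int32_t dwellMs) {
    HopEstimate e = {remoteMsToHop - txRxMs, 0};
    if (dwellMs <= 0) {
        return e;  // guard against a non-terminating loop on bad input
    }
    while (e.msToNextHop < 0) {
        e.msToNextHop += dwellMs;  // the sender hopped while the frame was in flight
        ++e.hops;
    }
    return e;
}
```

For the logged "Case #3" line with 186 and 44 this gives 142 ms and no hops; for the "Case #2" line with 1 and 44 it gives 357 ms and one hop, matching the output above.
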

j-w-bullfrog commented 8 months ago

I can bench test, i.e. watch both radios' LEDs with them sitting next to each other, since my base RTK Express unit sits in the basement attached by a 10 m signal line to a fixed antenna outside. Actually, I also have a similar line for the radio antenna at the same point. About 20 ft away I have a daylight window where I can put the RTK Express rover GPS antenna, and enough cable to set the radios side by side so that I can watch the LEDs on both. I also have enough radio antennas that I don't have to use the one mounted 15 ft in the air.

tonycanike commented 8 months ago

@cturvey I am not looking to backhaul the rover position. As far as I know, no data is flowing from the rover to the base.

@j-w-bullfrog That bench testing would be awesome. Much harder to do with my Facets and the integrated antenna.

tonycanike commented 8 months ago

@cturvey Your thought about two radios at the base reminded me of a desire for a repeater mode. I opened a new issue, number #588.

j-w-bullfrog commented 8 months ago

If you get me test firmware I can flash, I can do functional testing. If not, I'll probably need some pointers to get it to compile. Both rtk express of course have sd cards, and I do have u-center & m-center.

cturvey commented 8 months ago

@j-w-bullfrog I'm currently looking at the dynamic interactions of the stand-alone LoRaSerial units, with a focus on MP (Multi-Point, One-to-Many). I'm bench testing here and getting the debug / telemetry to where I need it to be before connecting it to my existing base infrastructure.

The build process is described here. https://docs.sparkfun.com/SparkFun_LoRaSerial/firmware_build/ I've had it successfully build on IDE 1.8.x and IDE 2.x.x platforms Building and Running are two different hurdles, but the Library Versions are important. Some have leeway, others don't.

The issue with the timer is particularly difficult: if you ATO or ATW, the USB will disconnect, or repeatedly connect-disconnect, and I had to open the unit to recover.

Probably going to drop the spreading factor and coding rate to reduce the latency and air-time, and to more closely match my current deployment strategy.
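
For a rough sense of what those two settings buy, the nominal LoRa bit rate can be computed as SF × (BW / 2^SF) × (4 / coding-rate denominator). The sketch below assumes that AT-CodingRate=7 means a 4/7 code rate and that AT-Bandwidth is in kHz; neither is confirmed in this thread, and the function name is hypothetical.

```cpp
#include <cstdint>

// Nominal LoRa bit rate in bits per second, ignoring preamble, header,
// and protocol overhead.
// spreadFactor:   LoRa spreading factor (7..12)
// bandwidthHz:    channel bandwidth in Hz
// crDenominator:  5..8, for code rates 4/5 .. 4/8 (assumed meaning of AT-CodingRate)
double loraBitRate(unsigned spreadFactor, double bandwidthHz, unsigned crDenominator) {
    double chipsPerSymbol = static_cast<double>(1UL << spreadFactor);
    double symbolRate = bandwidthHz / chipsPerSymbol;
    return spreadFactor * symbolRate * 4.0 / crDenominator;
}
```

Under those assumptions, the settings posted earlier (SF 8, 500 kHz, 4/7) come to roughly 8.9 kbit/s, while SF 7 with 4/5 coding would be about 21.9 kbit/s, which is the kind of air-time reduction being described.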

cturvey commented 8 months ago

I don't like how the watchdog is kicked at all.

I really want to make one subroutine that kicks it and checks the channel hop. At the moment it does this same thing in hundreds of places. The hop should be around 400 ms, and the watchdog around 2 seconds. And there are places where it is done fractions of a millisecond apart. We could check whether millis() has even advanced. There are several places where the kick is implied to take several milliseconds; that seems longer than the 32 kHz clock should take to sync, but perhaps prescalers also expand that time.

The WDT also has to sync with the 32.768 kHz clock, as the WDT is on a much slower clock domain, or the MCU will introduce a bus stall (basically stuffing wait states, and going out to lunch) https://github.com/javos65/WDTZero/blob/master/src/WDTZero.cpp#L89C20-L89C39

You can pretest the SYNCBUSY so the write will fall straight through

https://hackaday.io/project/20647-mightywatt-r3-70w-electronic-load-for-arduino/log/56143-found-and-fixed-a-bug-in-sketch-for-arduino-zero

cturvey commented 8 months ago

Nevermind, there is some mitigation of this https://github.com/sparkfun/SparkFun_LoRaSerial/blob/main/Firmware/LoRaSerial/Begin.ino#L74

Could still do something a bit different so it doesn't stall at all. Disappearing for 4-5 ms in this context is far from ideal, it'll break millis() I would think.

Still I think I'm going to make a CheckHopKickWatchdog() function
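
A host-side sketch of what such a combined routine might look like is below. It is only an illustration: the class name, the callback hooks, and the minimum kick spacing are assumptions, and on the radio the hooks would wrap whatever the firmware already uses to hop channels and clear the watchdog.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Services the channel hop and the watchdog from one place, and skips the
// work when called again before either interval has elapsed.
class HopWatchdogService {
public:
    HopWatchdogService(uint32_t hopIntervalMs, uint32_t kickIntervalMs,
                       std::function<void()> hop, std::function<void()> kick)
        : hopIntervalMs_(hopIntervalMs), kickIntervalMs_(kickIntervalMs),
          lastHopMs_(0), lastKickMs_(0),
          hop_(std::move(hop)), kick_(std::move(kick)) {}

    // Call freely from the main loop with the current millis() value.
    // Unsigned subtraction keeps the comparisons valid across millis() rollover.
    void service(uint32_t nowMs) {
        if (nowMs - lastHopMs_ >= hopIntervalMs_) {
            lastHopMs_ = nowMs;
            hop_();   // hypothetical hook: move to the next channel
        }
        if (nowMs - lastKickMs_ >= kickIntervalMs_) {
            lastKickMs_ = nowMs;
            kick_();  // hypothetical hook: clear the watchdog
        }
    }

private:
    uint32_t hopIntervalMs_;
    uint32_t kickIntervalMs_;
    uint32_t lastHopMs_;
    uint32_t lastKickMs_;
    std::function<void()> hop_;
    std::function<void()> kick_;
};
```

With a 400 ms hop and a 2-second watchdog, a kick spacing of a few hundred milliseconds would leave plenty of margin while avoiding back-to-back register writes, each of which can stall on the slow clock domain.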

j-w-bullfrog commented 8 months ago

Interesting. I've seen where an ESP / Arduino (1.8?) will ignore a hardware-interrupt-connected subroutine to finish the current instruction. In this case it was a character write to an LED screen. However, the interrupt was generated by a zero crossing on the AC power line feeding an up to 2 kW heater, so you could see the missed cycle. While not the same thing, I've seen quirks like this before.

That said, if other radios can perform using supposedly the same base software with the same chipset, what makes this different?

cturvey commented 8 months ago

I can't speak to the ESP32, I'm working with the SAMD21 based radios, the v1.3 version of this guy in the plastic housing. https://learn.sparkfun.com/tutorials/loraserial-hookup-guide/all https://www.sparkfun.com/products/20029 I think Tony and I are looking at this from an RTCM3 Broadcaster, and potentially Re-Broadcaster, variant on MULTI-POINT

j-w-bullfrog commented 8 months ago

Yep, my bad on that. Those are the 1 w radios that I have; I just inherently fall into my own system perspective that data comes from the ZED, to the esp and sent to a rover esp zed, and the radios are just replacing the wifi or cellular to transmit that same data. Our app is rural and so wifi, and cellular are not reliable solutions for a 2 mile radius.

archielowen commented 8 months ago

Hey guys, I was brought here because I have the same issue, but I'm a newbie and I have no idea how to produce these reports you produce, program or anything you guys were discussing. I do field work regularly and can produce data if you want to test it. I'll keep an eye on this post and if you need a guinea pig just let me know.

cturvey commented 8 months ago

No worries, I've just enabled a bunch of the AT-Debug settings. Mostly

These are going to be unhelpful for normal operation, but I want to understand the interplay at failure.

I'm not entirely sold on the need for the END-POINT (AT-Server=0) devices needing/wanting to squawk at the SERVER across all channels. I see that as unnecessarily disruptive to the eco-system, and you'd need to be on the right band at the right time to get a response.

With LoRa, if you're not listening you're going to miss packets.

cturvey commented 8 months ago

Multi-Point Server that only sends (no RSSI LEDs) and Multi-Point Client that just listens, waits for the Heart Beat to loop back, and doesn't send partner requests https://github.com/cturvey/SparkFun_LoRaSerial/tree/tinker/Firmware/LoRaSerial

tonycanike commented 7 months ago

I'll try to bench test it this week. Just the radios, no RTK. My bench is in the basement and external GNSS antennas on Facets are not practical for me.

I imagine there are people with use cases for bi-directional multipoint configurations. If this is successful, perhaps this variant on multipoint could be a new distinct configuration option.

nseidle commented 7 months ago

Same here - I'm away from hardware this week but should be able to pick this up next week.

j-w-bullfrog commented 7 months ago

I'll try to bench test it this week. Just the radios, no RTK. My bench is in the basement and external GNSS antennas on Facets are not practical for me.

How would you bench test just the radios w/o rtk? I have a bench setup, but other than watching the led's and examining the u-blox file from the sd, I'm clueless, and it doesn't point to what might be the trouble, just a problem.

tonycanike commented 7 months ago

@j-w-bullfrog

How would you bench test just the radios w/o rtk?

I documented my bench test in the forum:

https://forum.sparkfun.com/viewtopic.php?f=116&t=60898

j-w-bullfrog commented 7 months ago

Tony, I've re-read your test procedure and it produces the same results that I got from using RTK in the setup, and what you have seen (weeks ago). I used both point to point, multipoint, both allowable speeds, and encryption disabled. It does help the radios to supply them with separate power.

cturvey commented 7 months ago

I haven't had much chance to work on this over the last week. I need to perhaps integrate some RTCM3 diagnostic output.

In terms of resyncing / recovery the channels could be reduced so it cycles quicker on 400 ms hops. Not sure of the feelings about cycling vs FCC compliance. LoRa already sweeps.

AT-NumberOfChannels=50

nseidle commented 7 months ago

50 channel min required for FCC compliance. 400ms could be lowered but not increased.
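
For scale, if the hop table visits every channel once per cycle (an assumption, not confirmed here), the time for the server to come back around to a given channel is just channels × dwell. The function name below is hypothetical.

```cpp
#include <cstdint>

// Time, in milliseconds, for the hop sequence to return to the same channel,
// assuming each channel is visited exactly once per cycle.
constexpr uint32_t hopCycleMs(uint32_t numberOfChannels, uint32_t dwellMs) {
    return numberOfChannels * dwellMs;
}
```

With 50 channels at 400 ms that is 20 seconds, which is in the same range as the outages reported earlier in the thread. With the channel count fixed at 50, halving the dwell to 200 ms would halve the cycle to 10 seconds.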

j-w-bullfrog commented 7 months ago

I'm going to demonstrate my lack of hardware/software understanding by asking why we can't take the Holybro firmware and run it on the one-watt LoRa radios. My understanding is that it's the same chipset, probably with different pinouts (config), and not inherently paired, but all I need is MORE power. So, how completely wrong am I? BTW, I do get the data I need perfectly from the Holybros.

HighSpeedLowDrag1 commented 7 months ago

Multi-Point Server that only sends (no RSSI LEDs) and Multi-Point Client that just listens, waits for the Heart Beat to loop back, and doesn't send partner requests https://github.com/cturvey/SparkFun_LoRaSerial/tree/tinker/Firmware/LoRaSerial

Hi guys! I am very interested in what you are doing here and would like to help if I can. @cturvey Would there be a limit to the number of clients in this case? I have a need that requires more than 32 clients.

j-w-bullfrog commented 6 months ago

Any solutions to getting a working radio set? Thanks All.

nseidle commented 6 months ago

Tony's original issue is with Multipoint. We are still working on a more robust solution for multipoint. Use the radios in P2P mode for now.

j-w-bullfrog commented 6 months ago

As I stated previously, they don't work any better or differently in the other mode.

On Tue, Apr 2, 2024, 3:17 PM Nathan Seidle @.***> wrote:

Tony's original issue is with Multipoint. We are still working on a more robust solution for multipoint. Use the radios in P2P mode for now.


tonycanike commented 4 months ago

For RTK work, I really want simplex one-way base-to-rover transmissions. The rover NEVER transmits any RF. The Base just transmits and sends data. The rover just receives. No ACKs. No retries. No handshaking. Keep it simple. Here's why:

  1. POWER - the base radio can transmit at a higher power with an external battery, while the rover can be powered off the Facet and just receive. We know the 1w radios can not be fully powered by the Facet radio port. The Facet radio port is (understandably) voltage and current limited. It's not hard to add a battery and make a custom power cable to fully power the base radio. I find it surprisingly challenging to fully power the rover radio and not create a nightmare for myself.

I spent 4 hours on my hands and knees crawling through dense autumn olive the other day with my Facet rover, pole, radio, and a weak JST connector cable. Understory brush, briars, multi-flora, poison ivy, barberry, hawthorns, autumn olive, and all the other crap out there is grabby, pokey, prickly, itchy, and totally annoying. The brush grabs at cables and external batteries and wants to eat them. This is the reality of surveying. Those pretty pictures of nicely-dressed happy smiling people with a rover on a manhole cover are total bs...well at least they don't correspond to my experience of surveying!

Having dangly cables, external batteries, and other choss on the rover pole is simply not a workable solution. The rover setup needs to be robust, clean, simple, and lightweight.

Being able to fully power the base radio with an external battery and minimally power the rover radios off the Facet would be a great solution. I tape up the JST cable to keep it in place. Of course, this means the base radio probably will not be able to receive transmissions from the rover radio, hence this post.

  2. DUTY CYCLE & BANDWIDTH - as others have pointed out above, if you're receiving you can't transmit. If you're transmitting you can't receive. It's just a waste of time for the base radio to receive. It's just a waste of time for the rover radio to transmit, and the rover probably then misses data from the base.

  3. HIGHER PROBABILITY OF WORKING - if the radios aren't relying on handshakes, some data is more likely to get from the base to the rover. If the rover misses a few packets, so what? No one cares. Next second new data comes down the pike. The old data is worthless now, stop trying to resend it. Retries, handshakes, and acknowledgments are a non-value-add "feature" in a base-rover RTK radio link.

I didn't invent the above - this is how the 450MHz band radios that most surveyors use work.

I, for one, have sadly put the SparkFun LoRaSerial radios into the "stuff that doesn't work" box. I really wanted to like them. I have some of the above power problems with the RFD900x radios I use, and I hate the "hard to see" LEDs on the RFD900x radios. The SparkFun radios have great LEDs.

Raulricardo23 commented 1 month ago

Is there any news on keeping the multipoint signal from being lost?