sparkfun / SparkFun_RTK_Firmware

Centimeter precision GPS/GNSS using L1/L2 signals broadcast over Bluetooth SPP (using the ESP32) in an easy to use enclosure.
https://docs.sparkfun.com/SparkFun_RTK_Firmware/
Other
81 stars 46 forks source link

Log file corruption #345

Closed tonycanike closed 1 year ago

tonycanike commented 1 year ago

Subject of the issue

I'm seeing some log file corruption again. Four hour files gathered with V2.5 firmware, all constellations, 1 Hz.

Most of the corruption was early in the log file, probably during the time I was using WiFi AP to configure the Facet, but UBX_Integrity_Checker.py reported a few resyncs much later in the log file.

I did a quick check with the 10/1/22 RC firmware, see comment for details.

Your workbench

V2.5 Facets

Steps to reproduce

Tell us how to reproduce this issue. Please post any log files from serial output that may have been generated.

Expected behavior

Tell us what should happen

Actual behavior

Tell us what happens instead

tonycanike commented 1 year ago

Results using the 10/1/22 firmware. Looks like issues were early in the file.

C:\Users\canike\Carlson Projects\Jobs\P1014 Facet PPP\2022-10-03 Roost Table>python UBX_Integrity_Checker.py SFE_Facet_221003_131234.ubx UBX Integrity Checker

Could this file contain any NMEA messages? (Y/n): Do you want to repair the file? (y/N):

Processing SFE_Facet_221003_131234.ubx

NMEA $ found at byte 0. Attempting to process the message Panic!! UBX checksum error! Sync lost at byte 7719. Attemting to re-sync. Sync has been lost. Currently processing byte 7719. Rewinding to byte 7677 NMEA $ found at byte 7691. Attempting to process the message Sync successfully re-established at byte 7742. The NMEA message started at byte 7691

Panic!! NMEA checksum error! Sync lost at byte 7893. Attemting to re-sync Sync has been lost. Currently processing byte 7893. Rewinding to byte 7860 NMEA $ found at byte 7864. Attempting to process the message Sync successfully re-established at byte 7895. The NMEA message started at byte 7864

Panic!! NMEA checksum error! Sync lost at byte 12145. Attemting to re-sync Sync has been lost. Currently processing byte 12145. Rewinding to byte 12061 UBX Sync Char 1 (0xB5) found at byte 12109. Checking for Sync Char 2 Sync successfully re-established at byte 12156. The UBX message started at byte 12109

Panic!! UBX checksum error! Sync lost at byte 13448. Attemting to re-sync. Sync has been lost. Currently processing byte 13448. Rewinding to byte 12214 NMEA $ found at byte 12217. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 12346. Attemting to re-sync NMEA $ found at byte 12417. Attempting to process the message Panic!! NMEA checksum error! Sync lost at byte 12526. Attemting to re-sync NMEA $ found at byte 12697. Attempting to process the message Panic!! NMEA checksum error! Sync lost at byte 12761. Attemting to re-sync UBX Sync Char 1 (0xB5) found at byte 12787. Checking for Sync Char 2 Panic!! Was expecting Sync Char 2 (0x62) but did not receive one! Sync lost at byte 12788. Attemting to re-sync NMEA $ found at byte 12947. Attempting to process the message Sync successfully re-established at byte 12990. The NMEA message started at byte 12947

NMEA $ found at byte 13075. Attempting to process the message Panic!! NMEA checksum error! Sync lost at byte 215552. Attemting to re-sync Sync has been lost. Currently processing byte 215552. Rewinding to byte 215492 NMEA $ found at byte 215636. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 215765. Attemting to re-sync NMEA $ found at byte 215868. Attempting to process the message Sync successfully re-established at byte 215898. The NMEA message started at byte 215868

Processed 2407768 bytes File size was 2407768 Longest valid UBX message was 1240 bytes Longest valid NMEA message was 83 characters Message types and totals were: Message type: GNTXT Total: 10 Message type: 0x02 0x15 Total: 1556 Message type: GNRMC Total: 1556 Message type: GNGGA Total: 1555 Message type: GNGSA Total: 4687 Message type: GPGSV Total: 2493 Message type: GLGSV Total: 1726 Message type: GAGSV Total: 31 Message type: GBGSV Total: 22 Message type: GQGSV Total: 392 Message type: GNGST Total: 1556 Message type: 0x02 0x13 Total: 9506 Number of successful resyncs: 5

Bye!

C:\Users\canike\Carlson Projects\Jobs\P1014 Facet PPP\2022-10-03 Roost Table>

tonycanike commented 1 year ago

Today I got what appears to be a 4-hour uncorrupted log file.

I did this by setting the measurement rate to 1 Hz, only collecting GPS and GLONASS data, and not using the WiFi AP config.

I'll paste in the config file and UBX Integrity Checker output below.

-----------------------------------------

UBX Integrity Checker

Could this file contain any NMEA messages? (Y/n): Do you want to repair the file? (y/N):

Processing C:\Users\canike\Carlson Projects\Jobs\P1017 Eckle Static\20221003 Eckle Logging Clearing\SFE_Facet_221003_145712.ubx

NMEA $ found at byte 0. Attempting to process the message

Processed 32382018 bytes File size was 32382018 Longest valid UBX message was 1272 bytes Longest valid NMEA message was 83 characters Message types and totals were: Message type: GNTXT Total: 3 Message type: 0x02 0x15 Total: 16198 Message type: GNRMC Total: 16198 Message type: GNGGA Total: 16198 Message type: GNGSA Total: 48594 Message type: GPGSV Total: 24377 Message type: GLGSV Total: 22532 Message type: GQGSV Total: 4050 Message type: GNGST Total: 16198 Message type: 0x02 0x13 Total: 153078

Bye!

--------------------------------------------------

sizeOfSettings=4520 rtkIdentifier=38 rtkFirmwareVersion=v2.6-Oct 1 2022 zedFirmwareVersion=HPG 1.32 printDebugMessages=0 enableSD=1 enableDisplay=1 maxLogTime_minutes=1440 maxLogLength_minutes=1440 observationSeconds=60 observationPositionAccuracy=5.00 fixedBase=0 fixedBaseCoordinateType=0 fixedEcefX=-1280206.568 fixedEcefY=-4716804.403 fixedEcefZ=4086665.484 fixedLat=40.090294790 fixedLong=-105.185057610 fixedAltitude=1560.0890 dataPortBaud=460800 radioPortBaud=57600 surveyInStartingAccuracy=1.0 measurementRate=1000 navigationRate=1 enableI2Cdebug=0 enableHeapReport=0 enableTaskReports=0 dataPortChannel=0 spiFrequency=16 sppRxQueueSize=2048 sppTxQueueSize=512 dynamicModel=0 lastState=0 enableSensorFusion=0 autoIMUmountAlignment=1 enableResetDisplay=0 enableExternalPulse=1 externalPulseTimeBetweenPulse_us=900000 externalPulseLength_us=100000 externalPulsePolarity=1 enableExternalHardwareEventLogging=0 profileName=Profile1 enableNtripServer=0 ntripServer_StartAtSurveyIn=0 ntripServer_CasterHost=rtk2go.com ntripServer_CasterPort=2101 ntripServer_CasterUser= ntripServer_CasterUserPW= ntripServer_MountPoint=bldr_dwntwn2 ntripServer_MountPointPW=WR5wRo4H ntripServer_wifiSSID=TRex ntripServer_wifiPW=parachutes enableNtripClient=0 ntripClient_CasterHost=rtk2go.com ntripClient_CasterPort=2101 ntripClient_CasterUser= ntripClient_CasterUserPW= ntripClient_MountPoint=bldr_SparkFun1 ntripClient_MountPointPW=undefined ntripClient_wifiSSID=TRex ntripClient_wifiPW=parachutes ntripClient_TransmitGGA=1 serialTimeoutGNSS=1 pointPerfectDeviceProfileToken= enablePointPerfectCorrections=1 home_wifiSSID= home_wifiPW= autoKeyRenewal=1 pointPerfectClientID= pointPerfectBrokerHost= pointPerfectLBandTopic= pointPerfectCurrentKey= pointPerfectCurrentKeyDuration=0 pointPerfectCurrentKeyStart=0 pointPerfectNextKey= pointPerfectNextKeyDuration=0 pointPerfectNextKeyStart=0 lastKeyAttempt=0 updateZEDSettings=0 LBandFreq=1556290000 enableLogging=1 timeZoneHours=0 timeZoneMinutes=0 timeZoneSeconds=0 enablePrintWifiIpAddress=1 enablePrintState=0 enablePrintWifiState=0 enablePrintNtripClientState=0 enablePrintNtripServerState=0 enablePrintPosition=0 enableMarksFile=0 enablePrintBatteryMessages=1 enablePrintRoverAccuracy=1 enablePrintBadMessages=0 enablePrintLogFileMessages=0 enablePrintLogFileStatus=1 enablePrintRingBufferOffsets=0 enablePrintNtripServerRtcm=0 enablePrintNtripClientRtcm=0 enablePrintStates=1 enablePrintDuplicateStates=0 radioType=0 espnowPeerCount=0 enableRtcmMessageChecking=0 bluetoothRadioType=0 enableNmeaClient=0 enableNmeaServer=0 espnowBroadcast=1 antennaHeight=0 antennaReferencePoint=0.00 echoUserInput=1 constellation.GPS=1 constellation.SBAS=1 constellation.Galileo=0 constellation.BeiDou=0 constellation.QZSS=1 constellation.GLONASS=1 message.UBX_NMEA_DTM.msgRate=0 message.UBX_NMEA_GBS.msgRate=0 message.UBX_NMEA_GGA.msgRate=1 message.UBX_NMEA_GLL.msgRate=0 message.UBX_NMEA_GNS.msgRate=0 message.UBX_NMEA_GRS.msgRate=0 message.UBX_NMEA_GSA.msgRate=1 message.UBX_NMEA_GST.msgRate=1 message.UBX_NMEA_GSV.msgRate=4 message.UBX_NMEA_RMC.msgRate=1 message.UBX_NMEA_VLW.msgRate=0 message.UBX_NMEA_VTG.msgRate=0 message.UBX_NMEA_ZDA.msgRate=0 message.UBX_NAV_ATT.msgRate=0 message.UBX_NAV_CLOCK.msgRate=0 message.UBX_NAV_DOP.msgRate=0 message.UBX_NAV_EOE.msgRate=0 message.UBX_NAV_GEOFENCE.msgRate=0 message.UBX_NAV_HPPOSECEF.msgRate=0 message.UBX_NAV_HPPOSLLH.msgRate=0 message.UBX_NAV_ODO.msgRate=0 message.UBX_NAV_ORB.msgRate=0 message.UBX_NAV_POSECEF.msgRate=0 message.UBX_NAV_POSLLH.msgRate=0 message.UBX_NAV_PVT.msgRate=0 message.UBX_NAV_RELPOSNED.msgRate=0 message.UBX_NAV_SAT.msgRate=0 message.UBX_NAV_SIG.msgRate=0 message.UBX_NAV_STATUS.msgRate=0 message.UBX_NAV_SVIN.msgRate=0 message.UBX_NAV_TIMEBDS.msgRate=0 message.UBX_NAV_TIMEGAL.msgRate=0 message.UBX_NAV_TIMEGLO.msgRate=0 message.UBX_NAV_TIMEGPS.msgRate=0 message.UBX_NAV_TIMELS.msgRate=0 message.UBX_NAV_TIMEUTC.msgRate=0 message.UBX_NAV_VELECEF.msgRate=0 message.UBX_NAV_VELNED.msgRate=0 message.UBX_RXM_MEASX.msgRate=0 message.UBX_RXM_RAWX.msgRate=1 message.UBX_RXM_RLM.msgRate=0 message.UBX_RXM_RTCM.msgRate=0 message.UBX_RXM_SFRBX.msgRate=1 message.UBX_MON_COMMS.msgRate=0 message.UBX_MON_HW2.msgRate=0 message.UBX_MON_HW3.msgRate=0 message.UBX_MON_HW.msgRate=0 message.UBX_MON_IO.msgRate=0 message.UBX_MON_MSGPP.msgRate=0 message.UBX_MON_RF.msgRate=0 message.UBX_MON_RXBUF.msgRate=0 message.UBX_MON_RXR.msgRate=0 message.UBX_MON_TXBUF.msgRate=0 message.UBX_TIM_TM2.msgRate=0 message.UBX_TIM_TP.msgRate=0 message.UBX_TIM_VRFY.msgRate=0 message.UBX_RTCM_1005.msgRate=0 message.UBX_RTCM_1074.msgRate=0 message.UBX_RTCM_1077.msgRate=0 message.UBX_RTCM_1084.msgRate=0 message.UBX_RTCM_1087.msgRate=0 message.UBX_RTCM_1094.msgRate=0 message.UBX_RTCM_1097.msgRate=0 message.UBX_RTCM_1124.msgRate=0 message.UBX_RTCM_1127.msgRate=0 message.UBX_RTCM_1230.msgRate=0 message.UBX_RTCM_4072_0.msgRate=0 message.UBX_RTCM_4072_1.msgRate=0 message.UBX_ESF_MEAS.msgRate=0 message.UBX_ESF_RAW.msgRate=0 message.UBX_ESF_STATUS.msgRate=0 message.UBX_ESF_ALG.msgRate=0 message.UBX_ESF_INS.msgRate=0

-------------------------------------

Eric-FR commented 1 year ago

I have the same issue with the RTK Express. And checking back my files, it seems not being so new. v1.11-Mar 14 2022 No problem v1.11-Apr 5 2022 Problem v2.2 Problem v2.5 Problem (although I may have changed acquisition parameters at the same time...).

Attached file faced 78 resync with UBX Integrity Checker:

SFE_Express_220730_135955.zip

nseidle commented 1 year ago

@tonycanike - As always, thanks for reporting

@Eric-FR - This is very good feedback as it helps us pinpoint, via the changelog, what we broke. The only relevant change I see is a change to serialTimeoutGNSS from 0 to 1ms in v1.12.

There are many variables that we need to test so I welcome help from any reader. This is not the first time this issue has been reported. @pyrog - I know you've been reporting this type of error over the months and releases. We (at SparkFun) have compiled a tremendous amount of testing data and have insights but I worry my preconceived notions will only hinder finding what's causing this issue.

How to run a test:

Tests that need to be done (I welcome others):

I believe if a corruption is going to happen, it happens within ~15 minutes so I recommend running short tests before running an overnight test.

My office test setup is an RTK Express with an SMA extension running to a fixed antenna on the roof. If you have a Facet, my apologies but this will be tedious (at least you'll get outdoors!). If anyone needs an roof antenna or SMA extension, please email me - I'm happy to provide equipment.

Eric-FR commented 1 year ago

I never buy memory cards online but always go to physical well known shops near me, to avoid fake cards.

I will start some testing this week-end.

pyrog commented 1 year ago

Workbench

File Size (bytes) Resync
SFE_Surveyor_221016_075333.ubx 2349025 3
SFE_Surveyor_221016_092833.ubx 13913100 75
SFE_Surveyor_221016_112701.ubx 3532900 22
SFE_Surveyor_221016_114843.ubx 12873467 73
pyrog commented 1 year ago
File Size (bytes) Resync
SFE_Surveyor_221016_164048.ubx 3945046 23
tonycanike commented 1 year ago

I ran a couple tests with the 10/13/ RC

The first test was about 4.5 hours. Receiver running autonomously (no RTCM.) Just turned on the Facet and let it log. Got two resyncs very early on in the UBX file.

The second test was about 5.5 hours. RTK rover getting NTRIP from NY state DOT RTN servers via SurvPC (which was configured to treat the Facet like a UBLOX device, not simply a generic NMEA receiver.) Usually fixed solution but sometimes had to restart the NTRIP feed.

Got five resyncs and a number of messages well throughout the file. Perhaps receiving the NTRIP feed consumed extra resources.

first test (autonomous)

Start of log file: $GNTXT,01,01,00,ESP_RST_POWERON1C $GNTXT,01,01,03,v2.6-Oct 1 202268 $GNTXT,01,01,04,HPG 1.32*34

Analysis output:

C:....>python UBX_Integrity_Checker.py SFE_Facet_221015_172738.ubx UBX Integrity Checker

Could this file contain any NMEA messages? (Y/n): Do you want to repair the file? (y/N):

Processing SFE_Facet_221015_172738.ubx

NMEA $ found at byte 0. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 43107. Attemting to re-sync Sync has been lost. Currently processing byte 43108. Rewinding to byte 42979 NMEA $ found at byte 43415. Attempting to process the message Sync successfully re-established at byte 43445. The NMEA message started at byte 43415

Panic!! UBX checksum error! Sync lost at byte 96257. Attemting to re-sync. Sync has been lost. Currently processing byte 96257. Rewinding to byte 96215 NMEA $ found at byte 96310. Attempting to process the message Sync successfully re-established at byte 96340. The NMEA message started at byte 96310

Processed 52667387 bytes File size was 52667387 Longest valid UBX message was 1944 bytes Longest valid NMEA message was 83 characters Message types and totals were: Message type: GNTXT Total: 23 Message type: 0x02 0x15 Total: 18584 Message type: GNRMC Total: 18584 Message type: GNGGA Total: 18584 Message type: GNGSA Total: 92850 Message type: GNGST Total: 18583 Message type: 0x02 0x13 Total: 257000 Message type: GPGSV Total: 27021 Message type: GLGSV Total: 22674 Message type: GQGSV Total: 4645 Message type: GAGSV Total: 25247 Message type: GBGSV Total: 17377 Number of successful resyncs: 2

Bye!

second test (RTK rover)

Start of log file: $GNTXT,01,01,00,ESP_RST_POWERON1C $GNTXT,01,01,03,v2.6-Oct 13 20227B $GNTXT,01,01,04,HPG 1.3234 $GNTXT,01,01,07,4DAE22

Analysis output:

C:....>python UBX_Integrity_Checker.py SFE_Facet_221017_125247.ubx UBX Integrity Checker

Could this file contain any NMEA messages? (Y/n): Do you want to repair the file? (y/N):

Processing SFE_Facet_221017_125247.ubx

NMEA $ found at byte 0. Attempting to process the message NMEA $ found at byte 642936. Attempting to process the message NMEA $ found at byte 652389. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 652518. Attemting to re-sync Sync has been lost. Currently processing byte 652519. Rewinding to byte 652390 NMEA $ found at byte 652485. Attempting to process the message Panic!! NMEA checksum error! Sync lost at byte 652557. Attemting to re-sync UBX Sync Char 1 (0xB5) found at byte 652652. Checking for Sync Char 2 Panic!! Was expecting Sync Char 2 (0x62) but did not receive one! Sync lost at byte 652653. Attemting to re-sync NMEA $ found at byte 652832. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 652961. Attemting to re-sync NMEA $ found at byte 653056. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 653185. Attemting to re-sync UBX Sync Char 1 (0xB5) found at byte 653389. Checking for Sync Char 2 Sync successfully re-established at byte 653398. The UBX message started at byte 653389

UBX Sync Char 1 (0xB5) found at byte 664432. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 667515. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 680250. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 799941. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 813004. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 847028. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 854353. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 932348. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 937840. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 949298. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 1024086. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 1196374. Checking for Sync Char 2 NMEA $ found at byte 1201139. Attempting to process the message Panic!! NMEA checksum error! Sync lost at byte 1201201. Attemting to re-sync Sync has been lost. Currently processing byte 1201201. Rewinding to byte 1201140 NMEA $ found at byte 1201211. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 1201340. Attemting to re-sync UBX Sync Char 1 (0xB5) found at byte 1201523. Checking for Sync Char 2 Sync successfully re-established at byte 1201548. The UBX message started at byte 1201523

UBX Sync Char 1 (0xB5) found at byte 1217017. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 1234149. Checking for Sync Char 2 Panic!! UBX checksum error! Sync lost at byte 1275922. Attemting to re-sync. Sync has been lost. Currently processing byte 1275922. Rewinding to byte 1275432 NMEA $ found at byte 1275623. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 1275752. Attemting to re-sync UBX Sync Char 1 (0xB5) found at byte 1275772. Checking for Sync Char 2 Sync successfully re-established at byte 1275787. The UBX message started at byte 1275772

UBX Sync Char 1 (0xB5) found at byte 1294479. Checking for Sync Char 2 NMEA $ found at byte 1336852. Attempting to process the message Panic!! NMEA checksum error! Sync lost at byte 1336905. Attemting to re-sync Sync has been lost. Currently processing byte 1336905. Rewinding to byte 1336853 UBX Sync Char 1 (0xB5) found at byte 1336853. Checking for Sync Char 2 Sync successfully re-established at byte 1336868. The UBX message started at byte 1336853

UBX Sync Char 1 (0xB5) found at byte 1358448. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 1482487. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 1576053. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 2155790. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 2169259. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 2174035. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 2856681. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 2927084. Checking for Sync Char 2 Panic!! UBX checksum error! Sync lost at byte 2998399. Attemting to re-sync. Sync has been lost. Currently processing byte 2998399. Rewinding to byte 2998373 UBX Sync Char 1 (0xB5) found at byte 2998395. Checking for Sync Char 2 Sync successfully re-established at byte 2998426. The UBX message started at byte 2998395

UBX Sync Char 1 (0xB5) found at byte 3000616. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 3276548. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 3277047. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 3281735. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 3283207. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 3283691. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 3285672. Checking for Sync Char 2

Processed 15854564 bytes File size was 15854564 Longest valid UBX message was 1400 bytes Longest valid NMEA message was 79 characters Message types and totals were: Message type: GNTXT Total: 5 Message type: 0x02 0x13 Total: 328092 Message type: 0x02 0x15 Total: 261 Message type: GNRMC Total: 261 Message type: GNGGA Total: 261 Message type: GNGSA Total: 1305 Message type: GPGSV Total: 431 Message type: GLGSV Total: 305 Message type: GAGSV Total: 319 Message type: GBGSV Total: 149 Message type: GQGSV Total: 66 Message type: GNGST Total: 262 Message type: 0x06 0x13 Total: 13 Message type: 0x05 0x01 Total: 679 Message type: 0x0A 0x06 Total: 2 Message type: 0x05 0x00 Total: 30 Message type: 0x06 0x8B Total: 64 Message type: 0x06 0x00 Total: 16 Message type: 0x0A 0x04 Total: 2 Message type: 0x06 0x71 Total: 2 Message type: 0x01 0x04 Total: 2241 Message type: 0x01 0x07 Total: 2235 Message type: 0x01 0x14 Total: 2210 Message type: 0x01 0x35 Total: 1119 Message type: 0x02 0x32 Total: 562 Number of successful resyncs: 5

Bye!

Eric-FR commented 1 year ago

Summary: v2.6 has errors with 8 messages enabled

Yesterday, I made a mountain bike trip of 4 hours with the RTK Express and last firmware update : $GNTXT,01,01,03,v2.6-Oct 13 2022*7B

I got a lot of errors:

murier_integrity

Processing SFE_Express_221016_102210.ubx

NMEA $ found at byte 0. Attempting to process the message
Panic!! UBX checksum error!
Sync lost at byte 185544. Attemting to re-sync.
Sync has been lost. Currently processing byte 185544. Rewinding to byte 184910
UBX Sync Char 1 (0xB5) found at byte 185103. Checking for Sync Char 2
Panic!! Was expecting Sync Char 2 (0x62) but did not receive one!
Sync lost at byte 185104. Attemting to re-sync
NMEA $ found at byte 185220. Attempting to process the message
Panic!! NMEA checksum error!
Sync lost at byte 185319. Attemting to re-sync
UBX Sync Char 1 (0xB5) found at byte 185331. Checking for Sync Char 2
Panic!! Was expecting Sync Char 2 (0x62) but did not receive one!
Sync lost at byte 185332. Attemting to re-sync
UBX Sync Char 1 (0xB5) found at byte 185504. Checking for Sync Char 2
Sync successfully re-established at byte 185559. The UBX message started at byte 185504
[...]
Panic! Maximum rewind attempts reached! Aborting...

Processed 3066688 bytes
File size was 69441916
FILE SIZE MISMATCH!!
Longest valid UBX message was 1816 bytes
Longest valid NMEA message was 86 characters
Message types and totals were:
Message type: GNTXT   Total: 4
Message type: 0x02 0x15   Total: 622
Message type: 0x01 0x07   Total: 652
Message type: 0x01 0x02   Total: 654
Message type: GNRMC   Total: 654
Message type: GNGGA   Total: 653
Message type: GNGSA   Total: 3269
Message type: GPGSV   Total: 4367
Message type: GLGSV   Total: 3207
Message type: GAGSV   Total: 3197
Message type: GBGSV   Total: 3412
Message type: GQGSV   Total: 656
Message type: GNGST   Total: 654
Message type: 0x01 0x35   Total: 640
Message type: 0x02 0x13   Total: 8216
Number of successful resyncs: 101

Bye!

Settings and full report attached:

SFE_Express_Settings_2.txt murier_integrity.txt

nseidle commented 1 year ago

Thanks all! There is no need to copy paste the output from the UBX checker; all we need to know is what test you attempted (ex: v1.11 firmware with 7 messages enabled shows errors) and a summary of the integrity checker (success or number or resyncs) will suffice.

From my own notes:

Currently I don't see a firmware that doesn't have errors. v1.11 produces errors similar to v2.6. However, reducing to 1Hz allows error free logging (a few others have seen this as well). But I have a few more ideas to try to get fewer (hopefully zero) errors.

pyrog commented 1 year ago

Cause of the log corruption seem data race for me.

Does reading and writing to F9P buffer are atomic? Same for writing to SD card?

Some advices in this post about FreeRTOS mutex and critical section 😃

pyrog commented 1 year ago

Note: This issue is related to log corruption, regardless of firmware version and message format.

Some examples of corrupted NMEA sentences (regex search ^(?!\$G)):

4GPRMC,115344.50,A,4558.4524745,N,00621.0063101,E,1.992,257.45,161022,,,R,V*0D
29,23,255,46,32,48,087,50,1*7D
GBGSV,2,2,07,07,02,075,,10,01,087,,25,05,031,,0*48
,12,0C,121022.50,A,4558.2663623,N,00620.5156245,E,1.555,39.83,161022,,,R,V*39
.65,MC,121237.75,A,4558.2147219,N,00620.5075217,E,0.084,,161022,,,R,V*14
4,12MC,122127.75,A,4558.0947462,N,00620.4406795,E,1.582,200.87,161022,,,R,V*0A
49,PGSV,2,2,05,26,00,021,,0*57
6
4,12MC,125658.75,A,4557.9857028,N,00619.9455904,E,2.110,337.59,161022,,,R,V*03
tonycanike commented 1 year ago

Note: This issue is related to log corruption, regardless of firmware version and message format.

Agreed, I updated the issue title.

nseidle commented 1 year ago

@pyrog - Thanks! You're asking a nerd about his code, I'd love to share! :) I appreciate your thoughts. Have a look at the older version of the code (as it is a lot simpler to follow):

//Attempt to write to file system. This avoids collisions with file writing from other functions like recordSystemSettingsToFile()
if (xSemaphoreTake(sdCardSemaphore, fatSemaphore_shortWait_ms) == pdPASS)
{
  ubxFile.write(rBuffer, s);

  xSemaphoreGive(sdCardSemaphore);
} //End sdCardSemaphore
else
{
  log_d("sdCardSemaphore failed to yield, %s line %d\r\n", __FILE__, __LINE__);
}

Any access to the SD card is through a semaphore meaning only one thing can talk to the SD card at a time, thus I believe all writes are atomic. The ESP32 is dual core and runs FreeRTOS (real time operating system). In short, this means various tasks can run simultaneously and we don't know ahead if time (really) what task will be accessing which piece of hardware. The semaphore prevents other tasks from having the SPI/SD card open at the same time we trying to record log data.

The RTOS comes to a screeching halt when multiple things try to access the same hardware so we are confident the semaphore is working as expected.

The current version of the code that writes to the SD card is more complex because it implements a circular buffer to catch all incoming data from the ZED-F9P incase the semaphore cannot be unlocked in the given amount of time.

pyrog commented 1 year ago

@nseidle (just after posting my previous message) I saw only one mutex, for gaining access to the SD card 😄 When reading several articles about data race, I understood that the issue could also be located at the input buffer. The buffer containing the F9P message could be overwrite (by another task or an interruption) when the buffer is currently parsed.

So even with a Semaphore on the SD Card, this "corrupted" buffer could be "safely" written in the log file. I don't dig into your code, but I don't see another semaphore to protect the input buffer. Even with a real time OS and a single core processor, data race could happened.

But maybe the issue in the new code is that the circular buffer is too small (or the code for logging is too slow) ?

pyrog commented 1 year ago

Does accessing i.e. online.logging is atomic ? What happened if wifiNmeaData() or bluetoothWriteBytes() are too long ?

ButtonCheckTask as the same priority than F9PSerialReadTask or F9PSerialWriteTask (1. 3 being the highest, and 0 being the lowest) Does F9PSerialReadTask could have a higher priority than ButtonCheckTask?

Eric-FR commented 1 year ago

RTK Express $GNTXT,01,01,03,v2.6-Oct 13 20227B $GNTXT,01,01,04,HPG 1.3234 $GNTXT,01,01,07,181E*2B

New µSD card. Factory reset.

5 default NMEA messages (4 Hz)

30 mn recording with RTK correction through BT: more than 100 rewind. 15 mn recording without RTK: integrity OK. Again 15 mn recording with RTK/BT: more than 100 rewind. 15 mn recording without RTK: integrity OK.

Eric-FR commented 1 year ago

New test this morning suggested by pyrog. Same as previous. BT connexion with SW_Maps to get the position on the smartphone but without sending RTK corrections to the Express. 15 mn recording: integrity OK.

pyrog commented 1 year ago

Recording 14 mn of NMEA at 4 Hz.

Internal NTRIP client on. Wifi AP connected but smartphone disconnected from internet: the device can't receive RTCM.

File Size (bytes) Resync
SFE_Surveyor_221022_161511.ubx 2504304 1

The SWAIR project NMEA Analyser report:

2 invalid lines
  1 bad header and termination
  1 invalid checksums

Bad header and termination: 44,44,10,29,053,46,3*76
Invalid Checksum: $GLGSV,2,2,06,79,38,041,49,88,09,028,29*4C

Recording 18 mn of NMEA at 4 Hz.

Internal NTRIP client on. Wifi AP connected and smartphone connected to internet: the device receive RTCM.

File Size (bytes) Resync
SFE_Surveyor_221024_091005.ubx 3413937 18

The SWAIR project NMEA Analyser report:

20 invalid lines
  2 bad header and termination
  18 invalid checksums

Bad header and termination: .65,584091140.25,A,4623.5987283,N,00547.5779868,E,0.007,,241022,,,F,V*02
Invalid Checksum: $GNGSA,A,1,,,,,,,,,,,,,99.9,99.99,99.99,4*36
Invalid Checksum: $GPGGA,091140.25,4623.5987283,N,00547.5779868,E,5,12,0.65,584915,M,47.230,M,2.3,0002*7C
Invalid Checksum: $GPGSV,1,1,02,0739,,177,,11,06,314,,0*60
Invalid Checksum: $GLGSV,2,2,08,78,12,098,49,83,39,0,41,84,63,295,29,85,09,335,24,1*76
Invalid Checksum: $GPGSV,2,1,07,01,24,145,42,03,48,079,48,04,81,109,47,06,39,3SV,34,6*6C
Invalid Checksum: $GLGSV2,1,08,67,11,056,40,68,57,031,40,69,57,271,38,77,17,043,23,1*7D
Bad header and termination: .54,MC,091755.75,A,4623.5988755,N,00547.5784830,E,0.017,,241022,,,F,V*0A
Invalid Checksum: $GPGGA,091755.75,4623.5988755,N,00547.5784830,E,5,12,0.54,86.077,M,47.230,M,0.8,0002*71
Invalid Checksum: $GLGSV,2,1,08,67,09,057,39,68,55,033,31,69,58,275,38,77,17,041,27,3*73
Invalid Checksum: $GLGSV,1,1,02,70,02,248,,76,02,35,0*7D
Invalid Checksum: $GPGSA,A,3,78,77769,84,83,85,68,,,,,,1.06,0.54,0.91,2*16
Invalid Checksum: $GPGSA,A,3,28,06,14,24,25,09,2,,,,,1.06,0.54,0.91,4*10
Invalid Checksum: $GPGSA,A,3,78,777,,,,,,83,85,68,,,,,,1.06,0.54,0.91,2*16
Invalid Checksum: $GPGSA,A,3,28,06,14,24,25,09,27,,,,,.06,0.54,0.91,4*10
Invalid Checksum: $GPGGA,092335.75,4623.5993197,N,00547.5772854,3,85,680.54,585.495,M,47.230,M,1.8,0002*73
Invalid Checksum: $GPGSA,A,3,78,77,69,,,,1.,85,68,,,,,,1.14,0.60,0.96,2*15
Invalid Checksum: $GPGSA,A,3,28,06,14,24,25,09,27,,,,,,1.,0.60,0.96,4*13
Invalid Checksum: $GAGSV,2,1,06,47,21,054,43,10,27,315,26,12,51,303,44,19,57,229,36,2*77
Invalid Checksum: $GAGSV,1,1,04,07,21,054,4,12,51,303,35,26,17,152,47,33,63,181,37,7*7B

Note: UBX_Integrity_Checker don't seem to detect Invalid NMEA Checksum.

pyrog commented 1 year ago

Log simultaneously F9P USB output with a laptop and RTK Surveyor log to SD Card.

Duration 5 mn.

FP9 USB output

SFE_Surveyor_221024_120758.nmea The SWAIR project NMEA Analyser report:

1 invalid lines
  1 bad header and termination
  0 invalid checksums
  0 corrupted messages
16988 detected messages
  601 GGA
  2404 GSA
  12181 GSV
  601 RMC
  600 GLL
  0 ZDA
  601 VTG
  0 Not supported
16989 TOTAL LINES in log

The corrupted line is the last line (the device was powered off).

UBX_Integrity_Checker can't check the file.

Panic!! NMEA CR not found!
Sync lost at byte 1023714. Attemting to re-sync
NMEA $ found at byte 1023715. Attempting to process the message

Processed 1023764 bytes
File size was 1023764
Longest valid UBX message was 0 bytes
Longest valid NMEA message was 0 characters

SD Card log file

SFE_Surveyor_221024_120758.ubx The SWAIR project NMEA Analyser report:

0 invalid lines
  0 bad header and termination
  0 invalid checksums
  0 corrupted messages
14889 detected messages
  1228 GGA
  4912 GSA
  6289 GSV
  1228 RMC
  0 GLL
  0 ZDA
  0 VTG
  1232 Not supported
14890 TOTAL LINES in log

UBX_Integrity_Checker

Processed 936789 bytes
File size was 936789
Longest valid UBX message was 0 bytes
Longest valid NMEA message was 85 characters
Message types and totals were:
Message type: GNTXT   Total: 4
Message type: GNRMC   Total: 13
Message type: GNGGA   Total: 13
Message type: GNGSA   Total: 52
Message type: GNGST   Total: 13
Message type: GPGSV   Total: 1988
Message type: GLGSV   Total: 1535
Message type: GAGSV   Total: 1538
Message type: GBGSV   Total: 1228
Message type: GPRMC   Total: 1215
Message type: GPGGA   Total: 1215
Message type: GPGSA   Total: 4860
Message type: GPGST   Total: 1215

Visual difference beetween 2 files

Diff 2 Diff

When the Surveyor compute and log GNGST messages, following messages aren't logged to the SD Card: GNGLL GPGSV GLGSV GAGSV GBGSV GNVTG

Note: This is not systematic

nseidle commented 1 year ago

Just an update: We are working a lot on this issue. There's a lot learned and discovered but we don't have a flawless solution yet.

@pyrog - Thank you for the comparisons!

Note: UBX_Integrity_Checker don't seem to detect Invalid NMEA Checksum.

I believe it does.

following messages aren't logged to the SD Card

Good! You are doing a lot of what we are: comparing the ZED's USB output to the SD card. They should be identical, but only if you have configured the USB interface on the ZED properly. This is not trivial. You can use u-center but you need to modify a lot of options. Below is an example: the GSV message is decimated to once per second on the UART interface, but is 4Hz on the USB interface.

image

This is complicated enough we wrote a helper function in the debug menu. 17 will duplicated the UART settings to the USB interface:

image

Only after this has been set should you compare USB output from the ZED to the SD logs. There's still discrepancies but you will see far fewer:

image

pyrog commented 1 year ago

Note: UBX_Integrity_Checker don't seem to detect Invalid NMEA Checksum.

Oops! it doesn't detect "bad header and termination" (see test with SFE_Surveyor_221024_091005.ubx) This is probably normal because this tool is used to parsed primarily UBX, not NMEA.

They should be identical, but only if you have configured the USB interface on the ZED properly.

You're right. I thought that the device configured automatically both F9P outputs. At least for me, one interest of Sparkfun RTK device is to avoid to use u-Center for setting the F9P.

This is not trivial.

👍

You can use u-center but you need to modify a lot of options. Below is an example: the GSV message is decimated to once per second on the UART interface, but is 4Hz on the USB interface.

Manual settings with U-Center are useful for fine tuning the f9p, but with your product I thing we shouldn't use it.

Anyway, the Sparkfun device should log to the SD card exactly what it receive from the f9p.

This is complicated enough we wrote a helper function in the debug menu. 17 will duplicated the UART settings to the USB interface: Only after this has been set should you compare USB output from the ZED to the SD logs

Okay, I will use this option in next tests 😃

nseidle commented 1 year ago

tldr; Please load the new Oct26 RC, format your SD card using SD Formatter, and try logging at 4Hz with 7 messages on (or whatever test you'd like to run). It should work much better. Please report back any log corruptions or issues.

For those who are interested, turn on buffer overrun printing: main menu-> 's' -> 'd' -> 33

image

Longer story:

The ESP32 uses a 127 byte UART FIFO. As this buffer crosses 120 bytes, an interrupt occurs and the FIFO contents are transferred to the 'UART receive' buffer. This 'UART receive' buffer was previously defined to be 4096 bytes. We used a RTOS task called F9PSerialReadTask() that would read in data from the UART and then write it to the SD card. The root cause of the log corruption was because the SD write times vary based on all sorts of things(*) out of our control, but in general, vary from 15ms to ~250ms (or more). The baud rate of UART2 was 460,800bps or 46,080 bytes per second or a byte every 21.7 microseconds. If the SD blocks for 250ms that's 11,520 bytes coming in. The UART buffer of 4096 bytes would quickly overflow and data was lost. Note, most SD writes are 5-10ms (~460bytes) so we didn't see corruption often, only when the SD card needed a minute to breathe. Additionally, the FIFO buffer is limited to 127 bytes as was by default being serviced only after 120 bytes was received. This worked fine until other interrupts needed servicing and the FIFO would overrun (see fix 2).

Users who are logging at a 1Hz fix rate (1/4 of 11520 = 2880 bytes per second) or with only NMEA sentences enabled (100s of bytes per second) would rarely, if ever, see a log corruption because of the lower data rates begin used. Here is a list of the changes we've made:

Fix 1: Reduce baud rate to 230,400bps. This slows incoming data from the ZED into the ESP32 so now we only have to deal with a max of 5760 bytes coming in during a 250ms block. Note: The downside to this is that higher fix rates (ie 10Hz) will need to operate at a higher baud rate, and thus are more likely to run into logging errors. I don't know of any users operating at >4Hz but we can cross that bridge when it's requested.

Fix 2: Reduce the FIFO threshold. The default is 120 out of 127 bytes. This means the UART ISR must be attended to within ~300us before data is lost. Decreasing this threshold to 50 bytes allows other, higher priority interrupts to occur, then the UART ISR before the FIFO overruns.

Fix 3: Split the UART harvesting into its own task. The SD write task is now separate from the UART read task. So if SD decides to write for a long period of time, the UART buffer can still be moved into GNSS handling buffer, uninterrupted. The downside to this is an increase in RAM (every new task requires at least 2000 bytes on the stack).

Why not have one big UART buffer? Because we are dolling out bytes from the GNSS buffer to TCP, Bluetooth, and SD. We need a circular buffer with a tail for each of these services. The UART buffer doesn't allow us to have this granularity so we operate the GNSS buffer to pull data from the UART buffer.

()Fix 4: Report overruns and write times to detect problem SD cards. SD write times vary*. While I continue to be dubious of people online who claim certain SD brands are junk, it is true that not all SD cards are equal. Similarly, SD cards can degrade in their performance over their lifetime (more use = more block wear = longer write times, sometimes). If you enable printing of buffer overruns you'll also see debug messages reporting the SD write times, various buffer levels, and buffer overruns. Buffer overruns can now be seen on the logging menu. I don't plan to expose this to the AP Config page at this time. Other than a little overhead during SD writes, there is no downside to these new features.

Fix 5: Increase time between file syncs from 5s to 60s. In order to avoid file write conflicts, we increased the time when the log file's data and 'last accessed' time were written to the file. This means if power is lost or if the system resets up to 60 seconds worth of data would be lost. If the system is powered down normally, everything is recorded as expected.

Fix 6: Make the GNSS handler buffer user variable. By default, the handler buffer is 5120 bytes. Combined with the 5120 byte UART buffer, we get 10,240 bytes or (a byte every 43.4us) 444ms of buffer time before data is lost. If you want, you can make this larger via the debug menu (reset required). Just know that as you increase the buffer you are 'stealing' RAM that other processes may need. If you're only logging, you can very safely increase this to 16k. If you are using BT and WiFi simultaneously you should still be ok to increase to 16k, but future firmware versions may have problems (as we add more features that take more RAM or change the ESP32 core). But know that if your SD card is abysmal, there isn't a buffer big enough. See Fix 4 and turn on overrun statements to know if there are issues with your card.

We've added a test sketch (SD_UART_Tasks) to demonstrate that logging is possible without corruption. Now we're in the process of integrating this approach into the release candidate. I think we're close. We're now seeing perfect ~10 hour logs, but further testing is slowed by the fact it takes hours to really test a given combination of settings.

Current features up in the air until more testing is complete:

image

Above: ~10 hour log (367MB) with no errors.

tonycanike commented 1 year ago

Thanks @nseidle!

I ran a 2 hour test with the 26-Oct RC at 1Hz and logging defaults (7 msgs.) File looked clean according to the UBX integrity checker.

I attempted a test at 4 Hz but ran into an unrelated problem.

Eric-FR commented 1 year ago

Format of the µSD card you provided with official SD card software. Firmware update to 26-Oct RC from the SD card.

First run for 15 min. as prepared (should be NMEA only). No error message.

Change to NMEA+UBX default messages. 15 min. with RTK correction. No error message.

Same as previous incl. RTK correction but with overnight (and day) running. A bunch of errors. The first occurred around 500 kB i.e. corresponding to 1 min. recording. UBX Integrity Checker quickly stopped after 100 resync.

image

Same as previous without RTK correction. Overnight running. Still errors but less:

image

So, with same parameters one may or may not have errors. Sending RTK corrections seems to worth the situation.

pyrog commented 1 year ago

Two long tests with usual SD card (not reformatted). Firmware v2.6-Oct 28 2022 Settings

No errors detected with UBX_Integrity_Checker 😎

With JOSM:

SFE_Surveyor_221029_092539.ubx SFE_Surveyor_221030_104606.ubx
Message SFE_Surveyor_221029_092539.ubx SFE_Surveyor_221030_104606.ubx
File size 19108900 24828676
Longest valid UBX message 0 0
Longest valid NMEA message 117 121
GNTXT 4 7
GNRMC 2551 15419
GNGGA 2551 15419
GNGSA 10204 61676
GNGST 2551 15419
PUBX 21829 28081
GPGSV 30707 39684
GLGSV 29672 38222
GAGSV 28341 41657
GBGSV  22781 29515
GPRMC 19279 12662
GPGGA 19279 12662
GPGSA 77116 50648
GPGST 19278 12662
GPTXT 15 6
nseidle commented 1 year ago

@pyrog - Thank you! Glad to hear it is working. I suspect the single NMEA checksum error is at the start of the log. We currently do a hard start to logging, meaning, whatever is in the buffer, we log. So it's more often than not that a partial message is recorded. We might be able to work around this. I will think on it.

@Eric-FR - Thank you for testing! Long term logging during an RTK fix is not something I had thought a user would need. How are you obtaining the RTCM corrections? BT or WiFi? I assume it is WiFi. I consider long term RTK logging a rare user case, but I will continue to test for it. Please let me know if there are other issues you are seeing.

tonycanike commented 1 year ago

o> Long term logging during an RTK fix is not something I had thought a user would need.

This is not terribly uncommon and it can be very useful. PPK - post processed kinematic - in addition to RTK has some advantages. Surveyors sometimes use a variation of PPK called "Stop and Go" to distinguish it from fully kinematic post processing. (fully kinematic-like the track of a drone.) Stop and Go is PPK used to determine the location of discrete survey points and not, say, the continuous track of a moving vehicle.

The advantages include:

  1. If you are doing a day of RTK work and temporarily move out of range of your base-rover comm link (e.g. the 100mW Holybros aren't powerful enough), you can keep working even though you don't have an RTK fix for, say, an hour of your day. Back in the office you can post process the raw logs from the rover and the base to get PPK solutions for the points when the RTK comm link was down.
  2. Bad fixes with RTK -- 1 or 2 foot busts -- do happen even when you see the magic .014 and the doublebulls eye -- and folks need to be aware of this. The rover gives you no warning whatsoever, though good survey practices of knowing the limits of your equipment and knowing when to take multiple shots on one point while dumping the receiver in between can often detect bad RTK fixes. $20K Trimbles and the like have extensive internal processing to prevent this, reportedly by running multiple internal RTK engines simultaneously with different parameters. Sometimes the algorithm in the ZED just doesn't get it right in challenging conditions (e.g. heavy tree canopy). You can post process the data (PPK) to verify the RTK findings and check for bad fixes.
  3. You can post process using different parameters than the live RTK used. For example, changing the elevation mask or not using problematic SVs (satellites), using differnt constellations, etc., might improve the PPK solution vs. the RTK.

Watch from about 2:10 - 3:22 in this video https://www.youtube.com/watch?v=MLRFHn9tQZM&list=PLf5ROJWC2pbZcF3RUufD645znvgvPFgcv&index=9

The Carlson SurveyGNSS manual describes this ("Processing of Stop and Go Data") on page 40. The inputs are the raw GNSS observations of the Base and the Rover (eg UBX, rinex, etc), and the raw data log (eg .RW5 file) from the data collector.

The raw data log (eg. RW5) file from the survey data collector contains the survey point observations (point name, point description, time, GPS coordinates, etc.) that the surveyor collected. The RW5 is like an automatic digital version of what a surveyor would write down on paper. You know what's in the two rinex files. The post processing uses the point names and times from the RW5 file to pick times to determine PPK location solutions from the two RINEX files. Then you can compare them.

https://www.carlsonsw.com/wordpress/wp-content/uploads/2014/09/Carlson-SurveyGNSS-UserGuide.pdf

Here's a video of someone who doesn't have a professional survey data collector to get a RW5-like file but who has developed a technique to post-process his raw observations as a fully kinematic dataset, and then find the clusters of solutions when he was occupying a mark. He knows he has "stop-n-go" raw observations and he goes and finds where he was "stopped" on a mark to refine his solutions.

https://www.youtube.com/watch?v=wZu1JCZq1Dk&list=LL&index=21

Eric-FR commented 1 year ago

Although my tests were quite long with more than 24 hours, a 12 hours run with car across France or Madagascar is quite common.

But, even for smaller trips, I find corrupted sentences. Last week-end:

Corrections are sent with BT from smartphone, with BT_GNSS in the above example.

BTW, I'm also interested by the Stop&Go idea. Will open an issue.

nseidle commented 1 year ago

As of Dec 22, we have added a UBX/RTCM/NMEA parser to validate incoming messages. This helps reject any message that becomes mal-formed in flight from the ZED-F9x to ESP32. This parser has been particularly good at rejecting bad messages when entering/exiting the config menu when there is a large amount of file access and modification (as settings files get updated at the same time we need to be logging). Preliminary findings across ~60 hours of test logs shows no corruptions in the logs. Additionally, we're able to convert to RINEX and get CSRS reports just fine.

There are two places where you can view rejected message counts:

1) The top of the debug menu 2) Bottom of a log

We've added the following custom NMEA text message to the end of every log:

$GNTXT,01,01,08,1,0,0,*44

Where:

So in the above example, there was one message excluded from the log.

Please give v2.7-Dec 22 or later release candidate a try and let me know.

nseidle commented 1 year ago

We've got ~500+ hours of uncorrupted logs. I'm pretty sure it's fixed in v3.0 firmware. Please open a new issue if anyone has any more problems.