Closed tonycanike closed 1 year ago
Results using the 10/1/22 firmware. Looks like issues were early in the file.
C:\Users\canike\Carlson Projects\Jobs\P1014 Facet PPP\2022-10-03 Roost Table>python UBX_Integrity_Checker.py SFE_Facet_221003_131234.ubx UBX Integrity Checker
Could this file contain any NMEA messages? (Y/n): Do you want to repair the file? (y/N):
Processing SFE_Facet_221003_131234.ubx
NMEA $ found at byte 0. Attempting to process the message Panic!! UBX checksum error! Sync lost at byte 7719. Attemting to re-sync. Sync has been lost. Currently processing byte 7719. Rewinding to byte 7677 NMEA $ found at byte 7691. Attempting to process the message Sync successfully re-established at byte 7742. The NMEA message started at byte 7691
Panic!! NMEA checksum error! Sync lost at byte 7893. Attemting to re-sync Sync has been lost. Currently processing byte 7893. Rewinding to byte 7860 NMEA $ found at byte 7864. Attempting to process the message Sync successfully re-established at byte 7895. The NMEA message started at byte 7864
Panic!! NMEA checksum error! Sync lost at byte 12145. Attemting to re-sync Sync has been lost. Currently processing byte 12145. Rewinding to byte 12061 UBX Sync Char 1 (0xB5) found at byte 12109. Checking for Sync Char 2 Sync successfully re-established at byte 12156. The UBX message started at byte 12109
Panic!! UBX checksum error! Sync lost at byte 13448. Attemting to re-sync. Sync has been lost. Currently processing byte 13448. Rewinding to byte 12214 NMEA $ found at byte 12217. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 12346. Attemting to re-sync NMEA $ found at byte 12417. Attempting to process the message Panic!! NMEA checksum error! Sync lost at byte 12526. Attemting to re-sync NMEA $ found at byte 12697. Attempting to process the message Panic!! NMEA checksum error! Sync lost at byte 12761. Attemting to re-sync UBX Sync Char 1 (0xB5) found at byte 12787. Checking for Sync Char 2 Panic!! Was expecting Sync Char 2 (0x62) but did not receive one! Sync lost at byte 12788. Attemting to re-sync NMEA $ found at byte 12947. Attempting to process the message Sync successfully re-established at byte 12990. The NMEA message started at byte 12947
NMEA $ found at byte 13075. Attempting to process the message Panic!! NMEA checksum error! Sync lost at byte 215552. Attemting to re-sync Sync has been lost. Currently processing byte 215552. Rewinding to byte 215492 NMEA $ found at byte 215636. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 215765. Attemting to re-sync NMEA $ found at byte 215868. Attempting to process the message Sync successfully re-established at byte 215898. The NMEA message started at byte 215868
Processed 2407768 bytes File size was 2407768 Longest valid UBX message was 1240 bytes Longest valid NMEA message was 83 characters Message types and totals were: Message type: GNTXT Total: 10 Message type: 0x02 0x15 Total: 1556 Message type: GNRMC Total: 1556 Message type: GNGGA Total: 1555 Message type: GNGSA Total: 4687 Message type: GPGSV Total: 2493 Message type: GLGSV Total: 1726 Message type: GAGSV Total: 31 Message type: GBGSV Total: 22 Message type: GQGSV Total: 392 Message type: GNGST Total: 1556 Message type: 0x02 0x13 Total: 9506 Number of successful resyncs: 5
Bye!
C:\Users\canike\Carlson Projects\Jobs\P1014 Facet PPP\2022-10-03 Roost Table>
Today I got what appears to be a 4-hour uncorrupted log file.
I did this by setting the measurement rate to 1 Hz, only collecting GPS and GLONASS data, and not using the WiFi AP config.
I'll paste in the config file and UBX Integrity Checker output below.
UBX Integrity Checker
Could this file contain any NMEA messages? (Y/n): Do you want to repair the file? (y/N):
Processing C:\Users\canike\Carlson Projects\Jobs\P1017 Eckle Static\20221003 Eckle Logging Clearing\SFE_Facet_221003_145712.ubx
NMEA $ found at byte 0. Attempting to process the message
Processed 32382018 bytes File size was 32382018 Longest valid UBX message was 1272 bytes Longest valid NMEA message was 83 characters Message types and totals were: Message type: GNTXT Total: 3 Message type: 0x02 0x15 Total: 16198 Message type: GNRMC Total: 16198 Message type: GNGGA Total: 16198 Message type: GNGSA Total: 48594 Message type: GPGSV Total: 24377 Message type: GLGSV Total: 22532 Message type: GQGSV Total: 4050 Message type: GNGST Total: 16198 Message type: 0x02 0x13 Total: 153078
Bye!
sizeOfSettings=4520 rtkIdentifier=38 rtkFirmwareVersion=v2.6-Oct 1 2022 zedFirmwareVersion=HPG 1.32 printDebugMessages=0 enableSD=1 enableDisplay=1 maxLogTime_minutes=1440 maxLogLength_minutes=1440 observationSeconds=60 observationPositionAccuracy=5.00 fixedBase=0 fixedBaseCoordinateType=0 fixedEcefX=-1280206.568 fixedEcefY=-4716804.403 fixedEcefZ=4086665.484 fixedLat=40.090294790 fixedLong=-105.185057610 fixedAltitude=1560.0890 dataPortBaud=460800 radioPortBaud=57600 surveyInStartingAccuracy=1.0 measurementRate=1000 navigationRate=1 enableI2Cdebug=0 enableHeapReport=0 enableTaskReports=0 dataPortChannel=0 spiFrequency=16 sppRxQueueSize=2048 sppTxQueueSize=512 dynamicModel=0 lastState=0 enableSensorFusion=0 autoIMUmountAlignment=1 enableResetDisplay=0 enableExternalPulse=1 externalPulseTimeBetweenPulse_us=900000 externalPulseLength_us=100000 externalPulsePolarity=1 enableExternalHardwareEventLogging=0 profileName=Profile1 enableNtripServer=0 ntripServer_StartAtSurveyIn=0 ntripServer_CasterHost=rtk2go.com ntripServer_CasterPort=2101 ntripServer_CasterUser= ntripServer_CasterUserPW= ntripServer_MountPoint=bldr_dwntwn2 ntripServer_MountPointPW=WR5wRo4H ntripServer_wifiSSID=TRex ntripServer_wifiPW=parachutes enableNtripClient=0 ntripClient_CasterHost=rtk2go.com ntripClient_CasterPort=2101 ntripClient_CasterUser= ntripClient_CasterUserPW= ntripClient_MountPoint=bldr_SparkFun1 ntripClient_MountPointPW=undefined ntripClient_wifiSSID=TRex ntripClient_wifiPW=parachutes ntripClient_TransmitGGA=1 serialTimeoutGNSS=1 pointPerfectDeviceProfileToken= enablePointPerfectCorrections=1 home_wifiSSID= home_wifiPW= autoKeyRenewal=1 pointPerfectClientID= pointPerfectBrokerHost= pointPerfectLBandTopic= pointPerfectCurrentKey= pointPerfectCurrentKeyDuration=0 pointPerfectCurrentKeyStart=0 pointPerfectNextKey= pointPerfectNextKeyDuration=0 pointPerfectNextKeyStart=0 lastKeyAttempt=0 updateZEDSettings=0 LBandFreq=1556290000 enableLogging=1 timeZoneHours=0 timeZoneMinutes=0 timeZoneSeconds=0 enablePrintWifiIpAddress=1 enablePrintState=0 enablePrintWifiState=0 enablePrintNtripClientState=0 enablePrintNtripServerState=0 enablePrintPosition=0 enableMarksFile=0 enablePrintBatteryMessages=1 enablePrintRoverAccuracy=1 enablePrintBadMessages=0 enablePrintLogFileMessages=0 enablePrintLogFileStatus=1 enablePrintRingBufferOffsets=0 enablePrintNtripServerRtcm=0 enablePrintNtripClientRtcm=0 enablePrintStates=1 enablePrintDuplicateStates=0 radioType=0 espnowPeerCount=0 enableRtcmMessageChecking=0 bluetoothRadioType=0 enableNmeaClient=0 enableNmeaServer=0 espnowBroadcast=1 antennaHeight=0 antennaReferencePoint=0.00 echoUserInput=1 constellation.GPS=1 constellation.SBAS=1 constellation.Galileo=0 constellation.BeiDou=0 constellation.QZSS=1 constellation.GLONASS=1 message.UBX_NMEA_DTM.msgRate=0 message.UBX_NMEA_GBS.msgRate=0 message.UBX_NMEA_GGA.msgRate=1 message.UBX_NMEA_GLL.msgRate=0 message.UBX_NMEA_GNS.msgRate=0 message.UBX_NMEA_GRS.msgRate=0 message.UBX_NMEA_GSA.msgRate=1 message.UBX_NMEA_GST.msgRate=1 message.UBX_NMEA_GSV.msgRate=4 message.UBX_NMEA_RMC.msgRate=1 message.UBX_NMEA_VLW.msgRate=0 message.UBX_NMEA_VTG.msgRate=0 message.UBX_NMEA_ZDA.msgRate=0 message.UBX_NAV_ATT.msgRate=0 message.UBX_NAV_CLOCK.msgRate=0 message.UBX_NAV_DOP.msgRate=0 message.UBX_NAV_EOE.msgRate=0 message.UBX_NAV_GEOFENCE.msgRate=0 message.UBX_NAV_HPPOSECEF.msgRate=0 message.UBX_NAV_HPPOSLLH.msgRate=0 message.UBX_NAV_ODO.msgRate=0 message.UBX_NAV_ORB.msgRate=0 message.UBX_NAV_POSECEF.msgRate=0 message.UBX_NAV_POSLLH.msgRate=0 message.UBX_NAV_PVT.msgRate=0 message.UBX_NAV_RELPOSNED.msgRate=0 message.UBX_NAV_SAT.msgRate=0 message.UBX_NAV_SIG.msgRate=0 message.UBX_NAV_STATUS.msgRate=0 message.UBX_NAV_SVIN.msgRate=0 message.UBX_NAV_TIMEBDS.msgRate=0 message.UBX_NAV_TIMEGAL.msgRate=0 message.UBX_NAV_TIMEGLO.msgRate=0 message.UBX_NAV_TIMEGPS.msgRate=0 message.UBX_NAV_TIMELS.msgRate=0 message.UBX_NAV_TIMEUTC.msgRate=0 message.UBX_NAV_VELECEF.msgRate=0 message.UBX_NAV_VELNED.msgRate=0 message.UBX_RXM_MEASX.msgRate=0 message.UBX_RXM_RAWX.msgRate=1 message.UBX_RXM_RLM.msgRate=0 message.UBX_RXM_RTCM.msgRate=0 message.UBX_RXM_SFRBX.msgRate=1 message.UBX_MON_COMMS.msgRate=0 message.UBX_MON_HW2.msgRate=0 message.UBX_MON_HW3.msgRate=0 message.UBX_MON_HW.msgRate=0 message.UBX_MON_IO.msgRate=0 message.UBX_MON_MSGPP.msgRate=0 message.UBX_MON_RF.msgRate=0 message.UBX_MON_RXBUF.msgRate=0 message.UBX_MON_RXR.msgRate=0 message.UBX_MON_TXBUF.msgRate=0 message.UBX_TIM_TM2.msgRate=0 message.UBX_TIM_TP.msgRate=0 message.UBX_TIM_VRFY.msgRate=0 message.UBX_RTCM_1005.msgRate=0 message.UBX_RTCM_1074.msgRate=0 message.UBX_RTCM_1077.msgRate=0 message.UBX_RTCM_1084.msgRate=0 message.UBX_RTCM_1087.msgRate=0 message.UBX_RTCM_1094.msgRate=0 message.UBX_RTCM_1097.msgRate=0 message.UBX_RTCM_1124.msgRate=0 message.UBX_RTCM_1127.msgRate=0 message.UBX_RTCM_1230.msgRate=0 message.UBX_RTCM_4072_0.msgRate=0 message.UBX_RTCM_4072_1.msgRate=0 message.UBX_ESF_MEAS.msgRate=0 message.UBX_ESF_RAW.msgRate=0 message.UBX_ESF_STATUS.msgRate=0 message.UBX_ESF_ALG.msgRate=0 message.UBX_ESF_INS.msgRate=0
I have the same issue with the RTK Express. And checking back my files, it seems not being so new. v1.11-Mar 14 2022 No problem v1.11-Apr 5 2022 Problem v2.2 Problem v2.5 Problem (although I may have changed acquisition parameters at the same time...).
Attached file faced 78 resync with UBX Integrity Checker:
@tonycanike - As always, thanks for reporting
@Eric-FR - This is very good feedback as it helps us pinpoint, via the changelog, what we broke. The only relevant change I see is a change to serialTimeoutGNSS from 0 to 1ms in v1.12.
There are many variables that we need to test so I welcome help from any reader. This is not the first time this issue has been reported. @pyrog - I know you've been reporting this type of error over the months and releases. We (at SparkFun) have compiled a tremendous amount of testing data and have insights but I worry my preconceived notions will only hinder finding what's causing this issue.
How to run a test:
Tests that need to be done (I welcome others):
I believe if a corruption is going to happen, it happens within ~15 minutes so I recommend running short tests before running an overnight test.
My office test setup is an RTK Express with an SMA extension running to a fixed antenna on the roof. If you have a Facet, my apologies but this will be tedious (at least you'll get outdoors!). If anyone needs an roof antenna or SMA extension, please email me - I'm happy to provide equipment.
I never buy memory cards online but always go to physical well known shops near me, to avoid fake cards.
I will start some testing this week-end.
Panic!! NMEA checksum error!
File | Size (bytes) | Resync |
---|---|---|
SFE_Surveyor_221016_075333.ubx | 2349025 | 3 |
SFE_Surveyor_221016_092833.ubx | 13913100 | 75 |
SFE_Surveyor_221016_112701.ubx | 3532900 | 22 |
SFE_Surveyor_221016_114843.ubx | 12873467 | 73 |
File | Size (bytes) | Resync |
---|---|---|
SFE_Surveyor_221016_164048.ubx | 3945046 | 23 |
I ran a couple tests with the 10/13/ RC
The first test was about 4.5 hours. Receiver running autonomously (no RTCM.) Just turned on the Facet and let it log. Got two resyncs very early on in the UBX file.
The second test was about 5.5 hours. RTK rover getting NTRIP from NY state DOT RTN servers via SurvPC (which was configured to treat the Facet like a UBLOX device, not simply a generic NMEA receiver.) Usually fixed solution but sometimes had to restart the NTRIP feed.
Got five resyncs and a number of messages well throughout the file. Perhaps receiving the NTRIP feed consumed extra resources.
Start of log file: $GNTXT,01,01,00,ESP_RST_POWERON1C $GNTXT,01,01,03,v2.6-Oct 1 202268 $GNTXT,01,01,04,HPG 1.32*34
Analysis output:
C:....>python UBX_Integrity_Checker.py SFE_Facet_221015_172738.ubx UBX Integrity Checker
Could this file contain any NMEA messages? (Y/n): Do you want to repair the file? (y/N):
Processing SFE_Facet_221015_172738.ubx
NMEA $ found at byte 0. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 43107. Attemting to re-sync Sync has been lost. Currently processing byte 43108. Rewinding to byte 42979 NMEA $ found at byte 43415. Attempting to process the message Sync successfully re-established at byte 43445. The NMEA message started at byte 43415
Panic!! UBX checksum error! Sync lost at byte 96257. Attemting to re-sync. Sync has been lost. Currently processing byte 96257. Rewinding to byte 96215 NMEA $ found at byte 96310. Attempting to process the message Sync successfully re-established at byte 96340. The NMEA message started at byte 96310
Processed 52667387 bytes File size was 52667387 Longest valid UBX message was 1944 bytes Longest valid NMEA message was 83 characters Message types and totals were: Message type: GNTXT Total: 23 Message type: 0x02 0x15 Total: 18584 Message type: GNRMC Total: 18584 Message type: GNGGA Total: 18584 Message type: GNGSA Total: 92850 Message type: GNGST Total: 18583 Message type: 0x02 0x13 Total: 257000 Message type: GPGSV Total: 27021 Message type: GLGSV Total: 22674 Message type: GQGSV Total: 4645 Message type: GAGSV Total: 25247 Message type: GBGSV Total: 17377 Number of successful resyncs: 2
Bye!
Start of log file: $GNTXT,01,01,00,ESP_RST_POWERON1C $GNTXT,01,01,03,v2.6-Oct 13 20227B $GNTXT,01,01,04,HPG 1.3234 $GNTXT,01,01,07,4DAE22
Analysis output:
C:....>python UBX_Integrity_Checker.py SFE_Facet_221017_125247.ubx UBX Integrity Checker
Could this file contain any NMEA messages? (Y/n): Do you want to repair the file? (y/N):
Processing SFE_Facet_221017_125247.ubx
NMEA $ found at byte 0. Attempting to process the message NMEA $ found at byte 642936. Attempting to process the message NMEA $ found at byte 652389. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 652518. Attemting to re-sync Sync has been lost. Currently processing byte 652519. Rewinding to byte 652390 NMEA $ found at byte 652485. Attempting to process the message Panic!! NMEA checksum error! Sync lost at byte 652557. Attemting to re-sync UBX Sync Char 1 (0xB5) found at byte 652652. Checking for Sync Char 2 Panic!! Was expecting Sync Char 2 (0x62) but did not receive one! Sync lost at byte 652653. Attemting to re-sync NMEA $ found at byte 652832. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 652961. Attemting to re-sync NMEA $ found at byte 653056. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 653185. Attemting to re-sync UBX Sync Char 1 (0xB5) found at byte 653389. Checking for Sync Char 2 Sync successfully re-established at byte 653398. The UBX message started at byte 653389
UBX Sync Char 1 (0xB5) found at byte 664432. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 667515. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 680250. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 799941. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 813004. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 847028. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 854353. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 932348. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 937840. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 949298. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 1024086. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 1196374. Checking for Sync Char 2 NMEA $ found at byte 1201139. Attempting to process the message Panic!! NMEA checksum error! Sync lost at byte 1201201. Attemting to re-sync Sync has been lost. Currently processing byte 1201201. Rewinding to byte 1201140 NMEA $ found at byte 1201211. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 1201340. Attemting to re-sync UBX Sync Char 1 (0xB5) found at byte 1201523. Checking for Sync Char 2 Sync successfully re-established at byte 1201548. The UBX message started at byte 1201523
UBX Sync Char 1 (0xB5) found at byte 1217017. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 1234149. Checking for Sync Char 2 Panic!! UBX checksum error! Sync lost at byte 1275922. Attemting to re-sync. Sync has been lost. Currently processing byte 1275922. Rewinding to byte 1275432 NMEA $ found at byte 1275623. Attempting to process the message Panic!! Excessive NMEA message length! Sync lost at byte 1275752. Attemting to re-sync UBX Sync Char 1 (0xB5) found at byte 1275772. Checking for Sync Char 2 Sync successfully re-established at byte 1275787. The UBX message started at byte 1275772
UBX Sync Char 1 (0xB5) found at byte 1294479. Checking for Sync Char 2 NMEA $ found at byte 1336852. Attempting to process the message Panic!! NMEA checksum error! Sync lost at byte 1336905. Attemting to re-sync Sync has been lost. Currently processing byte 1336905. Rewinding to byte 1336853 UBX Sync Char 1 (0xB5) found at byte 1336853. Checking for Sync Char 2 Sync successfully re-established at byte 1336868. The UBX message started at byte 1336853
UBX Sync Char 1 (0xB5) found at byte 1358448. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 1482487. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 1576053. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 2155790. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 2169259. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 2174035. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 2856681. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 2927084. Checking for Sync Char 2 Panic!! UBX checksum error! Sync lost at byte 2998399. Attemting to re-sync. Sync has been lost. Currently processing byte 2998399. Rewinding to byte 2998373 UBX Sync Char 1 (0xB5) found at byte 2998395. Checking for Sync Char 2 Sync successfully re-established at byte 2998426. The UBX message started at byte 2998395
UBX Sync Char 1 (0xB5) found at byte 3000616. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 3276548. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 3277047. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 3281735. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 3283207. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 3283691. Checking for Sync Char 2 UBX Sync Char 1 (0xB5) found at byte 3285672. Checking for Sync Char 2
Processed 15854564 bytes File size was 15854564 Longest valid UBX message was 1400 bytes Longest valid NMEA message was 79 characters Message types and totals were: Message type: GNTXT Total: 5 Message type: 0x02 0x13 Total: 328092 Message type: 0x02 0x15 Total: 261 Message type: GNRMC Total: 261 Message type: GNGGA Total: 261 Message type: GNGSA Total: 1305 Message type: GPGSV Total: 431 Message type: GLGSV Total: 305 Message type: GAGSV Total: 319 Message type: GBGSV Total: 149 Message type: GQGSV Total: 66 Message type: GNGST Total: 262 Message type: 0x06 0x13 Total: 13 Message type: 0x05 0x01 Total: 679 Message type: 0x0A 0x06 Total: 2 Message type: 0x05 0x00 Total: 30 Message type: 0x06 0x8B Total: 64 Message type: 0x06 0x00 Total: 16 Message type: 0x0A 0x04 Total: 2 Message type: 0x06 0x71 Total: 2 Message type: 0x01 0x04 Total: 2241 Message type: 0x01 0x07 Total: 2235 Message type: 0x01 0x14 Total: 2210 Message type: 0x01 0x35 Total: 1119 Message type: 0x02 0x32 Total: 562 Number of successful resyncs: 5
Bye!
Summary: v2.6 has errors with 8 messages enabled
Yesterday, I made a mountain bike trip of 4 hours with the RTK Express and last firmware update :
$GNTXT,01,01,03,v2.6-Oct 13 2022*7B
I got a lot of errors:
Processing SFE_Express_221016_102210.ubx
NMEA $ found at byte 0. Attempting to process the message
Panic!! UBX checksum error!
Sync lost at byte 185544. Attemting to re-sync.
Sync has been lost. Currently processing byte 185544. Rewinding to byte 184910
UBX Sync Char 1 (0xB5) found at byte 185103. Checking for Sync Char 2
Panic!! Was expecting Sync Char 2 (0x62) but did not receive one!
Sync lost at byte 185104. Attemting to re-sync
NMEA $ found at byte 185220. Attempting to process the message
Panic!! NMEA checksum error!
Sync lost at byte 185319. Attemting to re-sync
UBX Sync Char 1 (0xB5) found at byte 185331. Checking for Sync Char 2
Panic!! Was expecting Sync Char 2 (0x62) but did not receive one!
Sync lost at byte 185332. Attemting to re-sync
UBX Sync Char 1 (0xB5) found at byte 185504. Checking for Sync Char 2
Sync successfully re-established at byte 185559. The UBX message started at byte 185504
[...]
Panic! Maximum rewind attempts reached! Aborting...
Processed 3066688 bytes
File size was 69441916
FILE SIZE MISMATCH!!
Longest valid UBX message was 1816 bytes
Longest valid NMEA message was 86 characters
Message types and totals were:
Message type: GNTXT Total: 4
Message type: 0x02 0x15 Total: 622
Message type: 0x01 0x07 Total: 652
Message type: 0x01 0x02 Total: 654
Message type: GNRMC Total: 654
Message type: GNGGA Total: 653
Message type: GNGSA Total: 3269
Message type: GPGSV Total: 4367
Message type: GLGSV Total: 3207
Message type: GAGSV Total: 3197
Message type: GBGSV Total: 3412
Message type: GQGSV Total: 656
Message type: GNGST Total: 654
Message type: 0x01 0x35 Total: 640
Message type: 0x02 0x13 Total: 8216
Number of successful resyncs: 101
Bye!
Settings and full report attached:
Thanks all! There is no need to copy paste the output from the UBX checker; all we need to know is what test you attempted (ex: v1.11 firmware with 7 messages enabled shows errors) and a summary of the integrity checker (success or number or resyncs) will suffice.
From my own notes:
Currently I don't see a firmware that doesn't have errors. v1.11 produces errors similar to v2.6. However, reducing to 1Hz allows error free logging (a few others have seen this as well). But I have a few more ideas to try to get fewer (hopefully zero) errors.
Note: This issue is related to log corruption, regardless of firmware version and message format.
Some examples of corrupted NMEA sentences (regex search ^(?!\$G)
):
4GPRMC,115344.50,A,4558.4524745,N,00621.0063101,E,1.992,257.45,161022,,,R,V*0D
29,23,255,46,32,48,087,50,1*7D
GBGSV,2,2,07,07,02,075,,10,01,087,,25,05,031,,0*48
,12,0C,121022.50,A,4558.2663623,N,00620.5156245,E,1.555,39.83,161022,,,R,V*39
.65,MC,121237.75,A,4558.2147219,N,00620.5075217,E,0.084,,161022,,,R,V*14
4,12MC,122127.75,A,4558.0947462,N,00620.4406795,E,1.582,200.87,161022,,,R,V*0A
49,PGSV,2,2,05,26,00,021,,0*57
6
4,12MC,125658.75,A,4557.9857028,N,00619.9455904,E,2.110,337.59,161022,,,R,V*03
Note: This issue is related to log corruption, regardless of firmware version and message format.
Agreed, I updated the issue title.
@pyrog - Thanks! You're asking a nerd about his code, I'd love to share! :) I appreciate your thoughts. Have a look at the older version of the code (as it is a lot simpler to follow):
//Attempt to write to file system. This avoids collisions with file writing from other functions like recordSystemSettingsToFile()
if (xSemaphoreTake(sdCardSemaphore, fatSemaphore_shortWait_ms) == pdPASS)
{
ubxFile.write(rBuffer, s);
xSemaphoreGive(sdCardSemaphore);
} //End sdCardSemaphore
else
{
log_d("sdCardSemaphore failed to yield, %s line %d\r\n", __FILE__, __LINE__);
}
Any access to the SD card is through a semaphore meaning only one thing can talk to the SD card at a time, thus I believe all writes are atomic. The ESP32 is dual core and runs FreeRTOS (real time operating system). In short, this means various tasks can run simultaneously and we don't know ahead if time (really) what task will be accessing which piece of hardware. The semaphore prevents other tasks from having the SPI/SD card open at the same time we trying to record log data.
The RTOS comes to a screeching halt when multiple things try to access the same hardware so we are confident the semaphore is working as expected.
The current version of the code that writes to the SD card is more complex because it implements a circular buffer to catch all incoming data from the ZED-F9P incase the semaphore cannot be unlocked in the given amount of time.
@nseidle (just after posting my previous message) I saw only one mutex, for gaining access to the SD card 😄 When reading several articles about data race, I understood that the issue could also be located at the input buffer. The buffer containing the F9P message could be overwrite (by another task or an interruption) when the buffer is currently parsed.
So even with a Semaphore on the SD Card, this "corrupted" buffer could be "safely" written in the log file. I don't dig into your code, but I don't see another semaphore to protect the input buffer. Even with a real time OS and a single core processor, data race could happened.
But maybe the issue in the new code is that the circular buffer is too small (or the code for logging is too slow) ?
Does accessing i.e. online.logging
is atomic ?
What happened if wifiNmeaData()
or bluetoothWriteBytes()
are too long ?
ButtonCheckTask as the same priority than F9PSerialReadTask or F9PSerialWriteTask (1. 3 being the highest, and 0 being the lowest) Does F9PSerialReadTask could have a higher priority than ButtonCheckTask?
RTK Express $GNTXT,01,01,03,v2.6-Oct 13 20227B $GNTXT,01,01,04,HPG 1.3234 $GNTXT,01,01,07,181E*2B
New µSD card. Factory reset.
5 default NMEA messages (4 Hz)
30 mn recording with RTK correction through BT: more than 100 rewind. 15 mn recording without RTK: integrity OK. Again 15 mn recording with RTK/BT: more than 100 rewind. 15 mn recording without RTK: integrity OK.
New test this morning suggested by pyrog. Same as previous. BT connexion with SW_Maps to get the position on the smartphone but without sending RTK corrections to the Express. 15 mn recording: integrity OK.
Internal NTRIP client on. Wifi AP connected but smartphone disconnected from internet: the device can't receive RTCM.
File | Size (bytes) | Resync |
---|---|---|
SFE_Surveyor_221022_161511.ubx | 2504304 | 1 |
The SWAIR project NMEA Analyser report:
2 invalid lines
1 bad header and termination
1 invalid checksums
Bad header and termination: 44,44,10,29,053,46,3*76
Invalid Checksum: $GLGSV,2,2,06,79,38,041,49,88,09,028,29*4C
Internal NTRIP client on. Wifi AP connected and smartphone connected to internet: the device receive RTCM.
File | Size (bytes) | Resync |
---|---|---|
SFE_Surveyor_221024_091005.ubx | 3413937 | 18 |
The SWAIR project NMEA Analyser report:
20 invalid lines
2 bad header and termination
18 invalid checksums
Bad header and termination: .65,584091140.25,A,4623.5987283,N,00547.5779868,E,0.007,,241022,,,F,V*02
Invalid Checksum: $GNGSA,A,1,,,,,,,,,,,,,99.9,99.99,99.99,4*36
Invalid Checksum: $GPGGA,091140.25,4623.5987283,N,00547.5779868,E,5,12,0.65,584915,M,47.230,M,2.3,0002*7C
Invalid Checksum: $GPGSV,1,1,02,0739,,177,,11,06,314,,0*60
Invalid Checksum: $GLGSV,2,2,08,78,12,098,49,83,39,0,41,84,63,295,29,85,09,335,24,1*76
Invalid Checksum: $GPGSV,2,1,07,01,24,145,42,03,48,079,48,04,81,109,47,06,39,3SV,34,6*6C
Invalid Checksum: $GLGSV2,1,08,67,11,056,40,68,57,031,40,69,57,271,38,77,17,043,23,1*7D
Bad header and termination: .54,MC,091755.75,A,4623.5988755,N,00547.5784830,E,0.017,,241022,,,F,V*0A
Invalid Checksum: $GPGGA,091755.75,4623.5988755,N,00547.5784830,E,5,12,0.54,86.077,M,47.230,M,0.8,0002*71
Invalid Checksum: $GLGSV,2,1,08,67,09,057,39,68,55,033,31,69,58,275,38,77,17,041,27,3*73
Invalid Checksum: $GLGSV,1,1,02,70,02,248,,76,02,35,0*7D
Invalid Checksum: $GPGSA,A,3,78,77769,84,83,85,68,,,,,,1.06,0.54,0.91,2*16
Invalid Checksum: $GPGSA,A,3,28,06,14,24,25,09,2,,,,,1.06,0.54,0.91,4*10
Invalid Checksum: $GPGSA,A,3,78,777,,,,,,83,85,68,,,,,,1.06,0.54,0.91,2*16
Invalid Checksum: $GPGSA,A,3,28,06,14,24,25,09,27,,,,,.06,0.54,0.91,4*10
Invalid Checksum: $GPGGA,092335.75,4623.5993197,N,00547.5772854,3,85,680.54,585.495,M,47.230,M,1.8,0002*73
Invalid Checksum: $GPGSA,A,3,78,77,69,,,,1.,85,68,,,,,,1.14,0.60,0.96,2*15
Invalid Checksum: $GPGSA,A,3,28,06,14,24,25,09,27,,,,,,1.,0.60,0.96,4*13
Invalid Checksum: $GAGSV,2,1,06,47,21,054,43,10,27,315,26,12,51,303,44,19,57,229,36,2*77
Invalid Checksum: $GAGSV,1,1,04,07,21,054,4,12,51,303,35,26,17,152,47,33,63,181,37,7*7B
Note: UBX_Integrity_Checker don't seem to detect Invalid NMEA Checksum.
Duration 5 mn.
SFE_Surveyor_221024_120758.nmea The SWAIR project NMEA Analyser report:
1 invalid lines
1 bad header and termination
0 invalid checksums
0 corrupted messages
16988 detected messages
601 GGA
2404 GSA
12181 GSV
601 RMC
600 GLL
0 ZDA
601 VTG
0 Not supported
16989 TOTAL LINES in log
The corrupted line is the last line (the device was powered off).
UBX_Integrity_Checker can't check the file.
Panic!! NMEA CR not found!
Sync lost at byte 1023714. Attemting to re-sync
NMEA $ found at byte 1023715. Attempting to process the message
Processed 1023764 bytes
File size was 1023764
Longest valid UBX message was 0 bytes
Longest valid NMEA message was 0 characters
SFE_Surveyor_221024_120758.ubx The SWAIR project NMEA Analyser report:
0 invalid lines
0 bad header and termination
0 invalid checksums
0 corrupted messages
14889 detected messages
1228 GGA
4912 GSA
6289 GSV
1228 RMC
0 GLL
0 ZDA
0 VTG
1232 Not supported
14890 TOTAL LINES in log
UBX_Integrity_Checker
Processed 936789 bytes
File size was 936789
Longest valid UBX message was 0 bytes
Longest valid NMEA message was 85 characters
Message types and totals were:
Message type: GNTXT Total: 4
Message type: GNRMC Total: 13
Message type: GNGGA Total: 13
Message type: GNGSA Total: 52
Message type: GNGST Total: 13
Message type: GPGSV Total: 1988
Message type: GLGSV Total: 1535
Message type: GAGSV Total: 1538
Message type: GBGSV Total: 1228
Message type: GPRMC Total: 1215
Message type: GPGGA Total: 1215
Message type: GPGSA Total: 4860
Message type: GPGST Total: 1215
When the Surveyor compute and log GNGST messages, following messages aren't logged to the SD Card: GNGLL GPGSV GLGSV GAGSV GBGSV GNVTG
Note: This is not systematic
Just an update: We are working a lot on this issue. There's a lot learned and discovered but we don't have a flawless solution yet.
@pyrog - Thank you for the comparisons!
Note: UBX_Integrity_Checker don't seem to detect Invalid NMEA Checksum.
I believe it does.
following messages aren't logged to the SD Card
Good! You are doing a lot of what we are: comparing the ZED's USB output to the SD card. They should be identical, but only if you have configured the USB interface on the ZED properly. This is not trivial. You can use u-center but you need to modify a lot of options. Below is an example: the GSV message is decimated to once per second on the UART interface, but is 4Hz on the USB interface.
This is complicated enough we wrote a helper function in the debug menu. 17 will duplicated the UART settings to the USB interface:
Only after this has been set should you compare USB output from the ZED to the SD logs. There's still discrepancies but you will see far fewer:
Note: UBX_Integrity_Checker don't seem to detect Invalid NMEA Checksum.
Oops! it doesn't detect "bad header and termination" (see test with SFE_Surveyor_221024_091005.ubx) This is probably normal because this tool is used to parsed primarily UBX, not NMEA.
They should be identical, but only if you have configured the USB interface on the ZED properly.
You're right. I thought that the device configured automatically both F9P outputs. At least for me, one interest of Sparkfun RTK device is to avoid to use u-Center for setting the F9P.
This is not trivial.
👍
You can use u-center but you need to modify a lot of options. Below is an example: the GSV message is decimated to once per second on the UART interface, but is 4Hz on the USB interface.
Manual settings with U-Center are useful for fine tuning the f9p, but with your product I thing we shouldn't use it.
Anyway, the Sparkfun device should log to the SD card exactly what it receive from the f9p.
This is complicated enough we wrote a helper function in the debug menu. 17 will duplicated the UART settings to the USB interface: Only after this has been set should you compare USB output from the ZED to the SD logs
Okay, I will use this option in next tests 😃
tldr; Please load the new Oct26 RC, format your SD card using SD Formatter, and try logging at 4Hz with 7 messages on (or whatever test you'd like to run). It should work much better. Please report back any log corruptions or issues.
For those who are interested, turn on buffer overrun printing: main menu-> 's' -> 'd' -> 33
Longer story:
The ESP32 uses a 127 byte UART FIFO. As this buffer crosses 120 bytes, an interrupt occurs and the FIFO contents are transferred to the 'UART receive' buffer. This 'UART receive' buffer was previously defined to be 4096 bytes. We used a RTOS task called F9PSerialReadTask() that would read in data from the UART and then write it to the SD card. The root cause of the log corruption was because the SD write times vary based on all sorts of things(*) out of our control, but in general, vary from 15ms to ~250ms (or more). The baud rate of UART2 was 460,800bps or 46,080 bytes per second or a byte every 21.7 microseconds. If the SD blocks for 250ms that's 11,520 bytes coming in. The UART buffer of 4096 bytes would quickly overflow and data was lost. Note, most SD writes are 5-10ms (~460bytes) so we didn't see corruption often, only when the SD card needed a minute to breathe. Additionally, the FIFO buffer is limited to 127 bytes as was by default being serviced only after 120 bytes was received. This worked fine until other interrupts needed servicing and the FIFO would overrun (see fix 2).
Users who are logging at a 1Hz fix rate (1/4 of 11520 = 2880 bytes per second) or with only NMEA sentences enabled (100s of bytes per second) would rarely, if ever, see a log corruption because of the lower data rates begin used. Here is a list of the changes we've made:
Fix 1: Reduce baud rate to 230,400bps. This slows incoming data from the ZED into the ESP32 so now we only have to deal with a max of 5760 bytes coming in during a 250ms block. Note: The downside to this is that higher fix rates (ie 10Hz) will need to operate at a higher baud rate, and thus are more likely to run into logging errors. I don't know of any users operating at >4Hz but we can cross that bridge when it's requested.
Fix 2: Reduce the FIFO threshold. The default is 120 out of 127 bytes. This means the UART ISR must be attended to within ~300us before data is lost. Decreasing this threshold to 50 bytes allows other, higher priority interrupts to occur, then the UART ISR before the FIFO overruns.
Fix 3: Split the UART harvesting into its own task. The SD write task is now separate from the UART read task. So if SD decides to write for a long period of time, the UART buffer can still be moved into GNSS handling buffer, uninterrupted. The downside to this is an increase in RAM (every new task requires at least 2000 bytes on the stack).
Why not have one big UART buffer? Because we are dolling out bytes from the GNSS buffer to TCP, Bluetooth, and SD. We need a circular buffer with a tail for each of these services. The UART buffer doesn't allow us to have this granularity so we operate the GNSS buffer to pull data from the UART buffer.
()Fix 4: Report overruns and write times to detect problem SD cards. SD write times vary*. While I continue to be dubious of people online who claim certain SD brands are junk, it is true that not all SD cards are equal. Similarly, SD cards can degrade in their performance over their lifetime (more use = more block wear = longer write times, sometimes). If you enable printing of buffer overruns you'll also see debug messages reporting the SD write times, various buffer levels, and buffer overruns. Buffer overruns can now be seen on the logging menu. I don't plan to expose this to the AP Config page at this time. Other than a little overhead during SD writes, there is no downside to these new features.
Fix 5: Increase time between file syncs from 5s to 60s. In order to avoid file write conflicts, we increased the time when the log file's data and 'last accessed' time were written to the file. This means if power is lost or if the system resets up to 60 seconds worth of data would be lost. If the system is powered down normally, everything is recorded as expected.
Fix 6: Make the GNSS handler buffer user variable. By default, the handler buffer is 5120 bytes. Combined with the 5120 byte UART buffer, we get 10,240 bytes or (a byte every 43.4us) 444ms of buffer time before data is lost. If you want, you can make this larger via the debug menu (reset required). Just know that as you increase the buffer you are 'stealing' RAM that other processes may need. If you're only logging, you can very safely increase this to 16k. If you are using BT and WiFi simultaneously you should still be ok to increase to 16k, but future firmware versions may have problems (as we add more features that take more RAM or change the ESP32 core). But know that if your SD card is abysmal, there isn't a buffer big enough. See Fix 4 and turn on overrun statements to know if there are issues with your card.
We've added a test sketch (SD_UART_Tasks) to demonstrate that logging is possible without corruption. Now we're in the process of integrating this approach into the release candidate. I think we're close. We're now seeing perfect ~10 hour logs, but further testing is slowed by the fact it takes hours to really test a given combination of settings.
Current features up in the air until more testing is complete:
Above: ~10 hour log (367MB) with no errors.
Thanks @nseidle!
I ran a 2 hour test with the 26-Oct RC at 1Hz and logging defaults (7 msgs.) File looked clean according to the UBX integrity checker.
I attempted a test at 4 Hz but ran into an unrelated problem.
Format of the µSD card you provided with official SD card software. Firmware update to 26-Oct RC from the SD card.
First run for 15 min. as prepared (should be NMEA only). No error message.
Change to NMEA+UBX default messages. 15 min. with RTK correction. No error message.
Same as previous incl. RTK correction but with overnight (and day) running. A bunch of errors. The first occurred around 500 kB i.e. corresponding to 1 min. recording. UBX Integrity Checker quickly stopped after 100 resync.
Same as previous without RTK correction. Overnight running. Still errors but less:
So, with same parameters one may or may not have errors. Sending RTK corrections seems to worth the situation.
Two long tests with usual SD card (not reformatted). Firmware v2.6-Oct 28 2022 Settings
No errors detected with UBX_Integrity_Checker 😎
With JOSM:
Message | SFE_Surveyor_221029_092539.ubx | SFE_Surveyor_221030_104606.ubx |
---|---|---|
File size | 19108900 | 24828676 |
Longest valid UBX message | 0 | 0 |
Longest valid NMEA message | 117 | 121 |
GNTXT | 4 | 7 |
GNRMC | 2551 | 15419 |
GNGGA | 2551 | 15419 |
GNGSA | 10204 | 61676 |
GNGST | 2551 | 15419 |
PUBX | 21829 | 28081 |
GPGSV | 30707 | 39684 |
GLGSV | 29672 | 38222 |
GAGSV | 28341 | 41657 |
GBGSV | 22781 | 29515 |
GPRMC | 19279 | 12662 |
GPGGA | 19279 | 12662 |
GPGSA | 77116 | 50648 |
GPGST | 19278 | 12662 |
GPTXT | 15 | 6 |
@pyrog - Thank you! Glad to hear it is working. I suspect the single NMEA checksum error is at the start of the log. We currently do a hard start to logging, meaning, whatever is in the buffer, we log. So it's more often than not that a partial message is recorded. We might be able to work around this. I will think on it.
@Eric-FR - Thank you for testing! Long term logging during an RTK fix is not something I had thought a user would need. How are you obtaining the RTCM corrections? BT or WiFi? I assume it is WiFi. I consider long term RTK logging a rare user case, but I will continue to test for it. Please let me know if there are other issues you are seeing.
o> Long term logging during an RTK fix is not something I had thought a user would need.
This is not terribly uncommon and it can be very useful. PPK - post processed kinematic - in addition to RTK has some advantages. Surveyors sometimes use a variation of PPK called "Stop and Go" to distinguish it from fully kinematic post processing. (fully kinematic-like the track of a drone.) Stop and Go is PPK used to determine the location of discrete survey points and not, say, the continuous track of a moving vehicle.
The advantages include:
Watch from about 2:10 - 3:22 in this video https://www.youtube.com/watch?v=MLRFHn9tQZM&list=PLf5ROJWC2pbZcF3RUufD645znvgvPFgcv&index=9
The Carlson SurveyGNSS manual describes this ("Processing of Stop and Go Data") on page 40. The inputs are the raw GNSS observations of the Base and the Rover (eg UBX, rinex, etc), and the raw data log (eg .RW5 file) from the data collector.
The raw data log (eg. RW5) file from the survey data collector contains the survey point observations (point name, point description, time, GPS coordinates, etc.) that the surveyor collected. The RW5 is like an automatic digital version of what a surveyor would write down on paper. You know what's in the two rinex files. The post processing uses the point names and times from the RW5 file to pick times to determine PPK location solutions from the two RINEX files. Then you can compare them.
https://www.carlsonsw.com/wordpress/wp-content/uploads/2014/09/Carlson-SurveyGNSS-UserGuide.pdf
Here's a video of someone who doesn't have a professional survey data collector to get a RW5-like file but who has developed a technique to post-process his raw observations as a fully kinematic dataset, and then find the clusters of solutions when he was occupying a mark. He knows he has "stop-n-go" raw observations and he goes and finds where he was "stopped" on a mark to refine his solutions.
https://www.youtube.com/watch?v=wZu1JCZq1Dk&list=LL&index=21
Although my tests were quite long with more than 24 hours, a 12 hours run with car across France or Madagascar is quite common.
But, even for smaller trips, I find corrupted sentences. Last week-end:
Corrections are sent with BT from smartphone, with BT_GNSS in the above example.
BTW, I'm also interested by the Stop&Go idea. Will open an issue.
As of Dec 22, we have added a UBX/RTCM/NMEA parser to validate incoming messages. This helps reject any message that becomes mal-formed in flight from the ZED-F9x to ESP32. This parser has been particularly good at rejecting bad messages when entering/exiting the config menu when there is a large amount of file access and modification (as settings files get updated at the same time we need to be logging). Preliminary findings across ~60 hours of test logs shows no corruptions in the logs. Additionally, we're able to convert to RINEX and get CSRS reports just fine.
There are two places where you can view rejected message counts:
1) The top of the debug menu 2) Bottom of a log
We've added the following custom NMEA text message to the end of every log:
$GNTXT,01,01,08,1,0,0,*44
Where:
So in the above example, there was one message excluded from the log.
Please give v2.7-Dec 22 or later release candidate a try and let me know.
We've got ~500+ hours of uncorrupted logs. I'm pretty sure it's fixed in v3.0 firmware. Please open a new issue if anyone has any more problems.
Subject of the issue
I'm seeing some log file corruption again. Four hour files gathered with V2.5 firmware, all constellations, 1 Hz.
Most of the corruption was early in the log file, probably during the time I was using WiFi AP to configure the Facet, but UBX_Integrity_Checker.py reported a few resyncs much later in the log file.
I did a quick check with the 10/1/22 RC firmware, see comment for details.
Your workbench
V2.5 Facets
Steps to reproduce
Tell us how to reproduce this issue. Please post any log files from serial output that may have been generated.
Expected behavior
Tell us what should happen
Actual behavior
Tell us what happens instead