mikalhart / TinyGPSPlus

A new, customizable Arduino NMEA parsing library
http://arduiniana.org
1.1k stars 494 forks source link

TinyGPSPlus accepts sentences with malformed data/field as long as the checksum is valid #87

Open fanfanlatulipe26 opened 3 years ago

fanfanlatulipe26 commented 3 years ago

Hi,

The context: a GPS is connected to an ESP8266, with a softwareserial link, at 57600 bds. Some bits are wrongly transmitted and decoded. It happens only a few times in a long test sequence but the results are a "bit" annoying for my application.

The following sentences are accepted and decoded without error, even if the 1st one is obviously bad:  there is a " in the middle of the latitude and a \<  at the end in front of the date field. Both sentences have the same good checksum.
The second one is correct.

 $GPRMC,042504.900,A,401".1002,N,00440.4650,E,0.00,334.28\<110321,,,A*6D $GPRMC,042504.900,A,4012.1002,N,00440.4650,E,0.00,334.28,110321,,,A*6D

(Some bits are lost by the serial link with the GPS the " code x22 should be 2 code x32, and the \< code x3c should be a , code x2c : only 1 bit is wrong each time)
Here are the results of the "BasicExample" of the library  when adapted to these sentences:

failedChecksum: 0
Location: 4.016667,4.674417  Date/Time: 0/0/2000 04:25:04.90
failedChecksum: 0
Location: 40.201670,4.674417 Date/Time: 3/11/2021 04:25:04.90

This sentence is also accepted  and gives valid null position:
GPGGA,052449.100lt512.0988,N,00540.2444,E,1,5,1.67,236.3,M,48.5,M,,*57\

failedChecksum: 0
Location: 0.000000,0.000000  Date/Time: INVALID 05:24:49.10

Maybe the NMEA parser could also reject a sentence as invalid if it finds wrong character in numeric fields, or missing field delimiter?

Best regards

Forgot to say: library version 1.0.2

TD-er commented 3 years ago

See also my PR on sanitize checks: https://github.com/mikalhart/TinyGPSPlus/pull/77

mikalhart commented 3 years ago

Hi @fanfanlatulipe26. I appreciate the detail you have added to this issue. Your analysis is good and thorough. However, I am not convinced that the best solution is to add the extra baggage to the parser to detect and mask a problem that we agree is extremely rare. Even if we implemented all the rigorous error checks you propose, that still wouldn't completely solve your problem. You might suffer a bit error that changed, for example, a '0' to a '1', and elsewhere a '9' to an '8'. A sentence like this would still have the same checksum, yet report wildly inaccurate numbers that could not be detected or corrected.

I would suggest reducing the baud rate on your device, if that's possible. Better yet, can you use one of the hardware UARTs on the ESP8266? ESP32 gives you three UARTs I think. Is that possible?

TD-er commented 3 years ago

@mikalhart I partly agree with your statement about not adding extra baggage to the code. But on the other hand, my PR, which is essentially a similar issue, is also showing erroneous messages even when using HW serial. (HW serial of the ESP, but also an I2C to UART bridge) For example a . missing in a sentence and still showing valid checksum. (thus showing a latitude way over 90 degree for example) I guess some GPS modules out there have buggy firmware which just outputs the string and computes a checksum while transmitting it. During transmit the internal prepared message may have been updated during transmission or for whatever reason a character may be skipped. But the checksum matches on a faulty sentence.

I never saw characters like a " or '<' in the messages I received though. (those with valid checksum)

fanfanlatulipe26 commented 3 years ago

I agree that it may not be the best place to add extra processing in the parser but it could some times allow to discard some sentences. 
Maybe the problem is with the ESP8266 when WIFI tasks are running in the backgroud and playing with interrupts ??
I will make more test with lower speed, but this morning I also got some errors at 38400bds:
$GPGGA,123013.600,4112.0943,N,00540.6508,E,1,6,1.28,203.7,M,48.5,M,,*58
good checksum ,    latitude wrong, longitude "a bit wrong".  Not easy to catch  !!
In my application I compute  the distance between 2 points and log the points if the distance is higher than xx. I can reject the points if this distance is obviously strange.

The full project is a small transmtiter for RC plane/drone that must transmit with wifi beacon frame its position and other stuff. It is requested in France by the administration since january 1st in some circumtancies. For the fun I add log capabilities, web interface ...
Need to find a trade off between speed of transmission with the GPS / update rate of the NMEA, accuracy, averall load of the ESP8266. Nice hobby.

svdrummer commented 3 years ago

Having spent a lot of time with OEM GPS modules. Smoothing capacitors, or should I say, lack of cap acitors near the GPS module can cause the data issues. I can duplicate this on demand. The TINYGPS library is not the place to filter and slow down. In my opinion, fix the hardware before the software. You could check each CHAR before sending to library, ie pre process.

fanfanlatulipe26 commented 3 years ago

Thanks for the tips. Which kind of capacitors would you add and where ?
I also thought to filter some caracters before calling  gps.encode: the standard states that the most-signficant-bit is always zero. Sentence length is also limited to 82 and I saw a case where 2 sentences got concatenated with a good checksum.
Any how the weak checksum control will never fully protect against the errors and you are right: the best is to get a good hardware link

svdrummer commented 3 years ago

On commercial products, I have .01 blue cap and 420 ish, electro. Both as close to the GPS module as possible. We also disconnected the gps input, as the ESP sends random stuff on startup which the GPS thinks is data and sits in a buffer. Doesnt cause an issue, but some models send a text complaint once a min

TD-er commented 3 years ago

The TINYGPS library is not the place to filter and slow down. In my opinion, fix the hardware before the software. You could check each CHAR before sending to library, ie pre process.

The nice thing about this library is that you can feed it a character at a time and not having to check anything since that requires knowledge of the sentence syntax. So either check it all in the GPS library or accept buggy output. The NMEA knowledge is part of this library. I do agree that you should make the conditions for the GPS receiver as stable as possible, but still it is possible data gets corrupted. Even a simple check for characters that should never occur, like " or any ASCII code > 127, should be part of this library as it is knowledge of the NMEA sentence syntax and thus part of this library. By refusing to process those invalid characters the checksum will be invalid anyway, so it is hardly any extra work to process.

fanfanlatulipe26 commented 3 years ago

Even a perfect hardware link will never prevent input buffer overrun that will always occurs when we are doing useful processing of data and than the sentences could be anything, even with good checksum. I am still convinced that a minimum checks on the input fields syntax will help.

The following set of sentences is accepted and gives Updated/valid data. In fact only the first 2 are correct. "$GPGGA,075846.400,5123.1040,N,00130.2621,E,1,6,1.29,192.8,M,48.5,M,,52\r\n" "$GPRMC,075846.500,A,5123.1040,N,00130.2621,E,0.00,32.12,160321,,,A53\r\n" "$GPGGA,075846.400,X5123.1040,N,00130.2621,E,1,6,1.29,192.8,M,48.5,M,,0A\r\n" // num field with X in front "$GPGGA,075846.400,5123X.1040,N,00130.2621,E,1,6,1.29,192.8,M,48.5,M,,0A\r\n" //num field with X at the end "$GPGGA,07584.400,5123.1040,N,00130.2621,E,1,6,1.29,192.8,M,48.5,M,,64\r\n" // field hour too short "$GPRMC,075846.500,A,5123.1040,N,00130.2621,E,0.00,32.12,16032,,,A62\r\n" // field date too short "$GPRMC,075846.500,A,5123.1040,N,00075846.400,5123.1040,N,00130.2621,E,1,6,1.29,192.8,M,48.5,M,,*7D\r\n"; // at least sentence too long

In the real life in case of buffer overrun we are in trouble only if the new resulting sentences have a same good checksum, but who know ...