Suggestions from testing code with different GPS units

rejkid commented 5 years ago

1) There are some GPS units that return just '\r' at the end of sentence - but it is still valid sentence. I added to minmea_check one more check for the end of line char as '\r'. 2) In minmea_sentence_id I have swapped 'minmea_scan' with 'minmea_check' so the minmea_scan detects the wrong caller id before wrong checksum is reported.

kosma commented 5 years ago

Unfortunately, this won't fly as-is.

Checksum is a part of framing (lower level than caller ID), therefore needs to be checked first. If the checksum is incorrect, we can't make any assumptions regarding the contents of the frame (i.e. the caller ID itself might be corrupted).
The changed code doesn't build - see CI errors: https://travis-ci.org/kosma/minmea/jobs/438449081 My proposed fix is to use just strcmp(), without any extra casts and strncmp. By the way, using strncmp means a regression is introduced - this would happily accept $GPGGA\rSOME_JUNK as a valid sentence (even though it's not).
An added functionality requires adding corresponding tests in tests.c so we can be sure the fix was correct and doesn't introduce regressions.

rejkid commented 5 years ago

Hi Kosma, 1) OK. Fair enough. I accept point 1. 2) The sentence “$GPGGA\rSOME_JUNK” must contain checksum, so it would look like: “$GPGGA,*76\rSOME_JUNK”. If the checksum is OK then the sentence can’t be corrupted. It the text just after delimiter that constitutes the garbage – isn’t it? So by using strcmp we will reject perfectly valid sentence. Kind Regards, Janusz

kosma commented 5 years ago

Note, however, the following input:

$GPGGA,*76\r$GPRMC,*42

This obviously should return an error because we'd silently ignore the second sentence. Generally, any junk past the end of the sentence means the caller has some serious bugs. If the input data is malformed, it's the caller's responsibility to clean it up (and also accept the results may be suboptimal or flat out wrong). minmea is a parser for generally correct, real-life NMEA (we do our best to support what the receivers produce), not for cleaning up the results of someone's grave coding errors.

Also:

If the checksum is OK then the sentence can’t be corrupted.

Not true, and easy to encounter. If UART is dropping characters (a common thing in badly written drivers, especially in embedded systems), it's enough to drop two of the same character (comma, zero, or any other) and the checksum will still match but the frame will be incorrect. It's a common problem with xor-based checksums.

rejkid commented 5 years ago

Hi Kosma, 1) I accept the second point about “If the checksum is OK…” – Yep. OK. 2) I am glad we have this conversation because I want to implement my parser correctly. The way I implemented my parser is in two layers: a. First layer is receiving bytes from RS232C and assembling a full NMEA sentence. In my case I check for $ char and end char which is either “\n”, “\r” or “\r\n”. Very often I receive 1 char which is just a “$” char and number of bytes that together with the first “$” char constitute a sentence + another partial sentence starting with “$” and so on and so forth. In other words my first layer would split the string “$GPGGA,76\r$GPRMC,42” into two sentences “$GPGGA,76\r” and "$GPRMC,42”. The first sentence would be accepted as it has all required by NMEA components. Second sentence would be rejected (as it does not have termination char) unless in the next RS232C reading I receive “\n”/”r”/”\r\n” – in which case I would assemble one of “$GPRMC,42\r”/ “$GPRMC,42\n”/ “$GPRMC,*42\r\n”. b. After I assemble the sentence I use nmea.c module (I have renamed it as nmeaHelper.c – is that OK? According to your licence I have assumed it is) to check all components of the sentence – like correct checksum caller id etc. Is this the wrong approach of filtering NMEA sentences? Kind Regards, Janusz

kosma commented 5 years ago

When it comes to parsing UART data, my recommendation is to ignore NMEA completely on the lower layer and write a parser that just receives lines of text (and terminates on characters \r and \n). That's the way the example code does:

char line[MINMEA_MAX_LENGTH];
while (fgets(line, sizeof(line), stdin) != NULL) {
    switch (minmea_sentence_id(line, false)) {
        case MINMEA_SENTENCE_RMC: {
            ...
        }
    }
}

As you can see, fgets() has no concept of NMEA - it just reads lines of text. By analogy - minmea has no concept of splitting input stream into lines of text. These are two separate concepts, and keeping them separate makes for cleaner code.

(Actually, now that I think of it, minmea shouldn't have to handle newlines - these should be removed by the caller. I think I did implement newline handling in minmea because most line-based file interfaces in C leave newlines in the buffer, and I wanted to keep the code as simple as possible on the caller side. Removing the newlines would require an extra function call for no apparent gain. Yes, it's leaking abstractions a little bit.)

When it comes to the second sentence being rejected - it shouldn't happen; the entire test suite misses newlines at the end of sentences and they parse correctly. Maybe you're missing the zero terminator on the strings (or possibly have a buffer-overflow-by-one in your zero-termination code), and that's causing a parser error?

rejkid commented 5 years ago

Hi Kosma, Yep I agree, at the lower layer I have just implemented receiving function that does what fgets() method does (unfortunately I don’t have a luxury of calling fgets() for RS232C port – I had to implement it myself – it’s basically receiving chars until it gets ‘\n’ and then assembles a frame (sentence)). Then I use minmea for checking the frame for errors. I my case I was unnecessarily stopping the receiving process as soon as I encountered ‘\r’, but the spec says it clearly that the end of sentence is “\r\n”, so I stop receiving sentence when I receive ‘\n’ it works just fine. So just ignore my contribution. Thanks, Kind Regards, Janusz

kosma commented 5 years ago

I had a look at the spec and you're right - it mandates \r\n. So - I'm leaving the code as-is until we find a use case where the receiver (or the OS-supplied serial port library - I think old Macs maybe?) does actually use \r as terminator.

kosma / minmea

Suggestions from testing code with different GPS units #37