Openvario / sensord

Daemon to poll air data from Openvario sensorboard
6 stars 11 forks source link

Zero terminated NMEA sentences are not compatible with XCSoar (and probably others) #13

Open moser opened 4 years ago

moser commented 4 years ago

This commit introduced an additional zero byte at the end of the NMEA sentences sensord sends: https://github.com/Openvario/sensord/commit/8cc4a66ad1d8d7b71e78342ace9d61e8863bcd85

@bomilkar can you elaborate what the reasoning behind sending the trailing \0 is?

As far as I can see this breaks compatibility with most NMEA receivers (including XCSoar). That means that if you don't want to use variod in between sensord and XCSoar, it will not work anymore. The NMEA protocol (which does not have a good public standardization document) seems not to allow to send zero bytes: https://en.wikipedia.org/wiki/NMEA_0183#Message_structure

bomilkar commented 4 years ago

@moser it wasn't my intention at all to break compatibility. However, this piece of code in the commit is about sending a string which is well defined in C: A string in includes a terminating '\0' character. All string manipulation functions rely on that string termination. strlen() counts the number of characters between the beginning of the string and the terminating null character (without including the terminating null character itself). Hence I decided to send strlen()+1 characters.

BTW: the link to the NMEA message structure is nice to have. But I don't know if that would be an official standard which everybody should adhere to. For instance it says " ends the message." Honestly, I've seen everything: , , . , or nothing at all. As far as I know, XCSoar seems to expect a '\n'.

I'm currently working on variod and my intention is to make receiving NMEA sentences independent of how the sentence ends. It might be '\n', '\r', '\0' or any combination of the 3 at any length. The reason is that recv() may very well return multiple NMEA sentences per call and I want to make sure the parser can safely work on one full sentence at a time without any weak underlying assumptions.

I'm surprised to hear that somebody other than variod is using the output of sensord besides XCSoar, of course. As far as I can tell XCSoar doesn't care if there is a '\0' after the '\n' at the end of the sentence. If XCSoar chokes, then because sensord breaks and reestablishes the connection. But that is due to an other bug which had been fixed a week ago. So what makes you believe that the 0 termination is not compatible with XCSoar? Can you point me at the code?

moser commented 4 years ago

I noticed the problem because on my freshly built testing image (using XCSoar 7 preview 15 & sensord testing), the data from sensord could not be parsed by XCSoar. It shows up in the device monitor with a . before each sentence (that's probably XCSoar's way of showing a non-printable character there).

@moser it wasn't my intention at all to break compatibility. However, this piece of code in the commit is about sending a string which is well defined in C: A string in includes a terminating '\0' character. All string manipulation functions rely on that string termination. strlen() counts the number of characters between the beginning of the string and the terminating null character (without including the terminating null character itself). Hence I decided to send strlen()+1 characters.

You are right about the definition of a string in C(PP), but C is pretty much the only language that defines strings like this. Other languages have (imo) more sane definitions that do not rely on in-band signaling the length of a string. NMEA is just a line protocol. Thus, putting any language-specific things into the byte stream (like the \0) creates problems for software written in other languages.

BTW: the link to the NMEA message structure is nice to have. But I don't know if that would be an official standard which everybody should adhere to. For instance it says " ends the message." Honestly, I've seen everything: , , . , or nothing at all. As far as I know, XCSoar seems to expect a '\n'.

Ending the message in just \n is also not conforming to the standard as well, which seems to state it should be \r\n, but parsers seem to handle just \n as well.

I'm currently working on variod and my intention is to make receiving NMEA sentences independent of how the sentence ends. It might be '\n', '\r', '\0' or any combination of the 3 at any length. The reason is that recv() may very well return multiple NMEA sentences per call and I want to make sure the parser can safely work on one full sentence at a time without any weak underlying assumptions.

I'm surprised to hear that somebody other than variod is using the output of sensord besides XCSoar, of course. As far as I can tell XCSoar doesn't care if there is a '\0' after the '\n' at the end of the sentence. If XCSoar chokes, then because sensord breaks and reestablishes the connection. But that is due to an other bug which had been fixed a week ago. So what makes you believe that the 0 termination is not compatible with XCSoar? Can you point me at the code?

XCSoar's parser does not care about line endings, \0 or anything else terminating a NMEA sentence. In the received byte stream it just looks for the next $ which marks the beginning of the next sentence. After that it looks for the next * to find the end and reads the two bytes after * to get the checksum. That seems to be most robust way of parsing a stream of NMEA sentences because it only makes assumptions about the start and end of sentences... I guess you could do something similar on variod. In any case you need to make sure that you put some kind of buffer between the parser and the raw recv() because you never can be sure that you will only recv a single & complete sentence. Here is the implementation in XCSoar: https://github.com/XCSoar/XCSoar/blob/v7.0_preview15/src/Device/Util/NMEAReader.cpp#L51-L89 Also note that this is not using any C string functions. It just uses std::find which operates on any array like structures. That makes sense because from the parsers point of view that received data (although stored in char *) is not a string but rather a stream of bytes. It could contain multiple \0, so string functions would just consume part of the data.

moser commented 4 years ago

Reading my comment again, I understand that I contradict myself between the description of the XCSoar parser and what I observed :-) I will check again tonight. My point that it does not make sense to send \0 still stands ;-)

bomilkar commented 4 years ago

I'm thinking of sending a string. It just happens to contain a NMEA sentence. That's how I view it.

I guess you noticed already: all the above doesn't explain why XCSoar behaves like it does. Anything between 2 bytes after the asterisk and the next $ is ignored. So how do you explain that a 0 in that part would cause any issues?

May I suggest you look at what sensord really sends. The best way to do this is use netcat: "nc -l -p 4352". As I said: there a enough bugs in sensord to cause XCSoar to choke. I'd be very interested in your findings.

Using $ and * to identify sentences is actually a good and simple idea. I will use this for variod, too. But that assumes that the sentence must have a checksum, which isn't a hard requirement according to the wikipedia reference you supplied. @linuxianer99 Can we make the checksum mandatory???.

moser commented 4 years ago

I'm thinking of sending a string. It just happens to contain a NMEA sentence. That's how I view it.

Well, yes, but it should not be the language-specific represenation of a string but just a couple of bytes :-)

May I suggest you look at what sensord really sends. The best way to do this is use netcat: "nc -l -p 4352". As I said: there a enough bugs in sensord to cause XCSoar to choke. I'd be very interested in your findings.

I checked that. I guess you don't see the \0 in nc output? When I receive the bytes from sensord in python, there are \0 bytes in there, e.g. the \x00 before the second $POV.

root@openvario-7-CH070:~# python3
Python 3.7.7 (default, Apr 17 2020, 15:34:47)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket as _socket
>>> ssock = _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM)
>>> ssock.bind(('127.0.0.1', 4353))
>>> ssock.listen(3)
>>> conn, _ = ssock.accept()
>>> conn.recv(100)
b'$POV,P,+974.21,Q,+0.00*41\n\x00$POV,E,+1.22*38\n\x00$POV,V,+19.32*13\n\x00$POV,P,+973.79,Q,+0.00*4B\n\x00$POV,E,+2.3'

Using $ and * to identify sentences is actually a good and simple idea. I will use this for variod, too. But that assumes that the sentence must have a checksum, which isn't a hard requirement according to the wikipedia reference you supplied. @linuxianer99 Can we make the checksum mandatory???.

The checksum is in the protocol because NMEA was originally meant for RS422 which is a raw serial connection that has the potential for bit flips. When sending NMEA data over TCP (a protocol that provides data integrity guarantees), including checksums sounds wasteful. On the other hand, on the XCSoar side the checksum is already mandatory (see the parser code I posted earlier).

bomilkar commented 4 years ago

Try more than 3 sentences. In this https://github.com/Openvario/variod/issues/15 was a "sensordata.log" provided. Use this, please. Again: what leads you to believe the 0 is the culprit??

linuxianer99 commented 4 years ago

Hey All, i think we should be compatible with the NMEA standard. So Wikipedia states:

Bei den NMEA-Daten handelt es sich um ASCII-basierte Datensätze, die jeweils 80 druckbare Zeichen umfassen können. Jeder Datensatz wird durch eine Kombination aus Wagenrücklauf und Zeilenvorschub abgeschlossen ().

Der Anfang eines Datensatzes wird durch ein „$“ oder „!“ markiert. Nach diesem Startzeichen folgt die Geräte-ID (normalerweise zwei Zeichen) und die Datensatz-ID (meist drei Zeichen) als eine Zeichenkette. Darauf folgen, jeweils durch Kommata getrennt, die Datenfelder gemäß der Datensatzdefinition. Optional kann zusätzlich eine durch ein „“ abgetrennte hexadezimale Prüfzahl angehängt werden. Diese wird durch die XOR-Verknüpfung der ASCII-Werte aller Zeichen zwischen dem $ und dem errechnet.

Also the length is limited to 80 chars.

Think we should code like this.

bomilkar commented 4 years ago

I agree, except that checksum should be mandatory. So far, I have not seen anything sending w/ checksum. We can send terminated sentences (with or without a 0), but on the receiving end I don't want to check for the presence or absence of . I'd rather ignore everything between 2 chars after the * and the next $ sign.