tfeldmann / gpsdclient

A simple gpsd client and python library.
MIT License

gpsdclient chokes on some gpsd output #1

Closed pbrier closed 3 years ago

pbrier commented 3 years ago

OBSERVED: json.loads() can fail on some (valid) output from GPSD.

REPRODUCE: Add an NTRIP source to gpsd and it will emit all kinds of (valid) data that gpsdclient does not handle correctly.

CAUSE: GPSD can emit trailing commas in the JSON data. This causes the standard Python json parser to choke. See also: https://stackoverflow.com/questions/56592689/python-remove-comma-of-last-object-in-a-string-for-valid-json

PROPOSED FIX: Use json5 or yaml functions to parse the data.
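
A minimal sketch of the failure described above. The payload is a shortened, made-up stand-in for real gpsd output, and `parses_as_strict_json` is a hypothetical helper name used only for illustration:

```python
import json

# Hypothetical, shortened payload with a trailing comma before a closing "}".
line = '{"class":"RTCM3","satellites":[{"ident":3,"L1":{"CNR":47.00},}]}'


def parses_as_strict_json(text: str) -> bool:
    """Return True if the standard library json module accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses_as_strict_json(line))                     # trailing comma present
print(parses_as_strict_json(line.replace(",}", "}")))  # comma removed by hand
```

The only difference between the two inputs is the comma before the closing brace, which is what the standard parser rejects.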

tfeldmann commented 3 years ago

Thank you for writing this issue, I wasn't aware of that. I'll look into json5, since parsing the data as YAML feels like a hack.

tfeldmann commented 3 years ago

Do you have some example NTRIP output so I can write a test case for this? You can get the raw output with: gpsdclient --json

pbrier commented 3 years ago

import json
import json5
import yaml

line = '''{"class":"RTCM3","device":"ntrip://t:t","type":1012,"length":154,"station_id":0,"tow":72300000,"sync":"true","smoothing":"false","interval":"0","satellites":[{"ident":3,"channel":5,"L1":{"ind":0,"prange":33345.44,"delta":6.4610,"lockt":127,"amb":32,"CNR":47.00},"L2":{"ind":0,"prange":  322.80,"delta":8.4180,"lockt":255,"CNR":40.00},},{"ident":20,"channel":2,"L1":{"ind":0,"prange":243814.84,"delta":-72.1360,"lockt":0,"amb":40,"CNR":24.00},"L2":{"ind":0,"prange":  324.80,"delta":-112.2085,"lockt":2,"CNR":30.00},},{"ident":4,"channel":6,"L1":{"ind":0,"prange":295983.58,"delta":13.0660,"lockt":127,"amb":34,"CNR":46.00},"L2":{"ind":0,"prange":  320.98,"delta":8.5665,"lockt":255,"CNR":40.00},},{"ident":22,"channel":4294967293,"L1":{"ind":0,"prange":243703.68,"delta":-24.8170,"lockt":3,"amb":39,"CNR":29.00},"L2":{"ind":0,"prange":    0.00,"delta":-262.1440,"lockt":255,"CNR":0.00},},{"ident":21,"channel":4,"L1":{"ind":0,"prange":206239.12,"delta":-4.5475,"lockt":127,"amb":38,"CNR":41.00},"L2":{"ind":0,"prange":  321.58,"delta":-4.7015,"lockt":255,"CNR":39.00},},{"ident":2,"channel":4294967292,"L1":{"ind":0,"prange":323481.82,"delta":10.3405,"lockt":127,"amb":36,"CNR":47.00},"L2":{"ind":0,"prange":  325.48,"delta":10.1215,"lockt":255,"CNR":40.00},},{"ident":14,"channel":4294967289,"L1":{"ind":0,"prange":52326.40,"delta":17.5900,"lockt":127,"amb":36,"CNR":45.00},"L2":{"ind":0,"prange":  326.24,"delta":12.5120,"lockt":255,"CNR":41.00},},{"ident":12,"channel":4294967295,"L1":{"ind":0,"prange":64926.92,"delta":12.8205,"lockt":127,"amb":35,"CNR":47.00},"L2":{"ind":0,"prange":  323.88,"delta":18.3360,"lockt":255,"CNR":42.00},},{"ident":13,"channel":4294967294,"L1":{"ind":0,"prange":449774.66,"delta":7.7130,"lockt":127,"amb":31,"CNR":36.00},"L2":{"ind":0,"prange":  323.34,"delta":6.3105,"lockt":255,"CNR":34.00},}]}'''
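print(json5.loads(line))     # parses: json5 tolerates trailing commas
print(yaml.safe_load(line))  # parses: YAML flow mappings tolerate trailing commas
print(json.loads(line))      # raises json.JSONDecodeError on the trailing commas

pbrier commented 3 years ago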

This is an example line, you can generate more if you enable an ntrip source. For example: add this device to gpsd:

ntrip://t:t@rtk2go.com:2101/Basisdepeel

pbrier commented 3 years ago

There seems to be a performance hit when you use json5 (Python's own json module seems to be efficient). I'll have to see whether that is a big problem. Some links:

https://pypi.org/project/json5/
https://stackoverflow.com/questions/27743711/can-i-speedup-yaml

tfeldmann commented 3 years ago

Seems so. Eliminating the trailing commas with a regex would be faster but feels fragile.

I guess there isn't that much incoming JSON, so slower parsing should be OK.
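
A rough sketch of the regex idea mentioned above, under stated assumptions: the pattern and the `loads_lenient` name are illustrative and are not taken from gpsdclient, and the sample line is made up. It also shows why the approach can feel fragile, because the pattern knows nothing about JSON strings:

```python
import json
import re

# Matches a comma followed only by optional whitespace and a closing brace/bracket.
TRAILING_COMMA = re.compile(r",\s*([}\]])")


def loads_lenient(text: str) -> dict:
    """Strip trailing commas, then parse with the standard json module.

    Caveat: the pattern is not string-aware, so a literal ",}" inside a
    JSON string value would be rewritten as well.
    """
    return json.loads(TRAILING_COMMA.sub(r"\1", text))


# Hypothetical, shortened line with trailing commas in an object and a list.
line = '{"class":"RTCM3","satellites":[{"ident":3,"L2":{"CNR":40.00},},]}'
report = loads_lenient(line)
print(report["satellites"][0]["ident"])
```

The substitution runs once per line before the fast standard parser, so the extra cost should stay small compared with switching parsers.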

pbrier commented 3 years ago

Others have tried the regex approach, with some caveats:

https://stackoverflow.com/questions/52636846/python-cant-parse-json-with-extra-trailing-comma

https://gist.github.com/liftoff/ee7b81659673eca23cd9fc0d8b8e68b7

Another thing to consider is availability of yaml and json5 on a platform. Any thoughts on that?

How much data counts as "not that much" is relative: it depends on the GPS rate (1 Hz, 10 Hz, 100 Hz or more?), and the RTCM data seems to generate a lot of traffic. We can filter that out before the JSON conversion by checking for the relevant classes with a string search.
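
The string-search prefilter could look roughly like this. It is a sketch only: `filter_reports` is a hypothetical helper, the sample lines are made up, and it assumes gpsd's compact formatting with no space after the colon:

```python
import json
from typing import Iterable, Iterator


def filter_reports(lines: Iterable[str], wanted=("TPV",)) -> Iterator[dict]:
    """Yield parsed reports whose class is in `wanted`; skip the rest unparsed."""
    # Assumption: the class field is written as "class":"NAME" without spaces.
    needles = tuple(f'"class":"{name}"' for name in wanted)
    for line in lines:
        if any(needle in line for needle in needles):
            yield json.loads(line)


# Made-up sample input: one RTCM3 line with trailing commas, one TPV line.
lines = [
    '{"class":"RTCM3","type":1012,"satellites":[{"ident":3,},]}',
    '{"class":"TPV","mode":3,"lat":51.0,"lon":5.0}',
]
for report in filter_reports(lines):
    print(report["class"], report["mode"])
```

Because the RTCM3 line never reaches json.loads(), its trailing commas cause no error and cost no parsing time. A stricter version would also check the class of the parsed result, since the substring could in principle appear in a nested value.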

pbrier commented 3 years ago

I was just thinking: why not ask GPSD to fix the issue? I can create an issue there and see what happens. For the time being, filtering on only the TPV class (that seems to be formatted OK) before the JSON conversion, or using json5/yaml would be a quick fix on my side.

tfeldmann commented 3 years ago

Yes, good idea. I'm in the middle of implementing and testing the regex at the moment 👍

pbrier commented 3 years ago

That seems to do the job! I'll have to check the performance. Decoding every JSON message when there is a lot of RTCM traffic seems like a bad idea anyway. In that case, filtering on TPV class messages only (on my side) would be better.

Anyway, I submitted an issue with gpsd: https://gitlab.com/gpsd/gpsd/-/issues/169

Thanks!

tfeldmann commented 3 years ago

Thanks for raising the issue there. The performance should be fine, I'm doing just a small regex.

Regarding the filtering: I just released v1.2.0 to PyPI. You can now filter by report classes like this:

for result in client.dict_stream(filter=["TPV", "SKY"]):
    print(result)

Have a nice day!

pbrier commented 3 years ago

Nice! BTW: it seems they already fixed this issue in GPSD > 3.20. My distro (Ubuntu 20.04) ships an old GPSD; I'll also try a newer one. But the lib is now robust with old versions too!