mapillary / mapillary_tools

Command line tools for processing and uploading Mapillary imagery
BSD 2-Clause "Simplified" License

Handle Garmin GPX Conversion #416

Open cbeddow opened 3 years ago

cbeddow commented 3 years ago

User reports: “ERROR: mapillary_tools process not possible when parameter = --geotag_source gpx”.

User finds the issue is that the specific Garmin device's GPX seems to not match the format that mapillary_tools expects:

To convert the GPX file:

Now the GPX input file will be processed correctly

^^ Following this, we should consider either a handler that runs this conversion when GPX processing fails, or a handler that runs before the GPX is processed and either always applies the conversion or checks whether the headers exist and should be removed, converting the file to UTF-8 if it is not already.
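As a rough illustration of the first option (retry after a failed parse), here is a minimal Python sketch. The function name `parse_gpx_with_bom_fallback` and the `parse` callable are hypothetical placeholders, not part of mapillary_tools or gpxpy. The sketch only handles the UTF-8 BOM and assumes the file is otherwise UTF-8.

```python
import codecs


def parse_gpx_with_bom_fallback(path, parse):
    """Parse the GPX file at `path`, retrying without a UTF-8 BOM on failure.

    `parse` is any callable that takes the GPX text and returns a parsed
    object (hypothetical here; in practice it would wrap the real parser).
    """
    with open(path, "rb") as f:  # read raw bytes so the BOM is visible
        raw = f.read()
    try:
        # First attempt: hand over the text exactly as stored in the file.
        return parse(raw.decode("utf-8"))
    except Exception:
        # Only retry if there is actually a BOM to strip; otherwise the
        # failure has another cause and the original error is re-raised.
        if not raw.startswith(codecs.BOM_UTF8):
            raise
        return parse(raw[len(codecs.BOM_UTF8):].decode("utf-8"))
```

Retrying only when a BOM is present keeps unrelated parse errors visible instead of masking them behind a second attempt.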

@ptpt could you take a look at this in an upcoming update?

cbeddow commented 3 years ago

Additional notes:

File NOT OK = Windows (CR LF)

File NOT OK = Unix (LF)

I don’t know – at this moment – if (CR LF) or (LF) is relevant.

But Unix (LF) seems the best option, as it is always accepted.

BOM might be the only issue!

The BOM is the first 3 (invisible) bytes of a file: Chr(239) & Chr(187) & Chr(191), i.e. 0xEF 0xBB 0xBF.

These 3 characters have to be removed!

This is the Error Message: “ERROR:root:not well-formed (invalid token): line 1, column 1”

IMO: line 1, column 1 (+2, +3) = BOM
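To check whether a given file really starts with these three bytes, something like the following could be used. This is only a sketch; `has_utf8_bom` is a made-up helper name, and it relies on the `codecs.BOM_UTF8` constant from the Python standard library.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the UTF-8 BOM bytes."""
    # Binary mode, so no text decoding can hide or alter the leading bytes.
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


# The BOM bytes as integers, matching Chr(239), Chr(187), Chr(191).
print(list(codecs.BOM_UTF8))  # [239, 187, 191]
```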

Remark:

pbb72 commented 3 years ago

Just to add a small nuance: BOM (Byte Order Mark) is not a requirement for Windows. In Windows, interpreting the correct file encoding is up to the software, not the OS. However, you were almost correct: almost all Microsoft software and compilers require and use the BOM for UTF-8 files. So while it's not a requirement for Windows software, it is the de facto default...

Also note that while the Unicode Standard does not recommend using a BOM for UTF-8, it does allow it, so it is perfectly valid.

AnkEric commented 2 years ago

Steps to reproduce the Error:

mapillary_tools process "M:\Mapillary\GPX_BASECAMP_BOM" --geotag_source gpx --geotag_source_path "GPX_BASECAMP_BOM.gpx"

If [gpx file] is copied directly from the GPS device (e.g. “2021-10-12 15.02.gpx”), then [gpx file] is in “UTF-8, no BOM” format. This is “perfectly valid” for “mapillary_tools process”.

If, however, [gpx file] is first uploaded to BaseCamp and then exported from BaseCamp as [gpx file] (e.g. “GPX_BASECAMP_BOM.gpx”), the exported [gpx file] is in “UTF-8, BOM” format. So the BOM was added to [gpx file] by BaseCamp. This is “perfectly valid” for UTF-8, but not “perfectly valid” for “mapillary_tools process”.

BOM in [gpx file] is not accepted as valid by “mapillary_tools process”: "ERROR:root:not well-formed (invalid token)".

“mapillary_tools” version: 0.8.0 (or older).

To be resolved in “mapillary_tools”:

either remove BOM if present in [gpx file], or “\gpxpy\parser.py" should accept BOM as being valid in a [gpx file].
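For the first option, one way to drop the BOM before parsing would be to read the file with Python's `utf-8-sig` codec, which decodes UTF-8 and skips a leading BOM if there is one. A minimal sketch, using the standard-library ElementTree parser as a stand-in for gpxpy; `read_gpx_text` and `parse_gpx_root` are illustrative names, not existing mapillary_tools functions:

```python
import xml.etree.ElementTree as ET


def read_gpx_text(path):
    """Read a GPX file as text, dropping a leading UTF-8 BOM if present."""
    # 'utf-8-sig' behaves like 'utf-8' for files without a BOM, so the
    # same call works for both the device export and the BaseCamp export.
    with open(path, "r", encoding="utf-8-sig") as f:
        return f.read()


def parse_gpx_root(path):
    """Parse the BOM-free text and return the root XML element."""
    return ET.fromstring(read_gpx_text(path))
```

Because the codec accepts files both with and without a BOM, no separate detection step is needed. This assumes the GPX file is UTF-8 in the first place.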

Error Message:

ERROR:root:not well-formed (invalid token): line 1, column 1

Traceback (most recent call last):
  File "C:\Python39\lib\site-packages\gpxpy\parser.py", line 196, in parse
    self.xml_parser = XMLParser(self.xml)
  File "C:\Python39\lib\site-packages\gpxpy\parser.py", line 43, in __init__
    self.dom = mod_minidom.parseString(xml)
  File "C:\Python39\lib\xml\dom\minidom.py", line 1998, in parseString
    return expatbuilder.parseString(string)
  File "C:\Python39\lib\xml\dom\expatbuilder.py", line 925, in parseString
    return builder.parseString(string)
  File "C:\Python39\lib\xml\dom\expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 1, column 1
Traceback (most recent call last):
  File "C:\Python39\lib\site-packages\gpxpy\parser.py", line 196, in parse
    self.xml_parser = XMLParser(self.xml)
  File "C:\Python39\lib\site-packages\gpxpy\parser.py", line 43, in __init__
    self.dom = mod_minidom.parseString(xml)
  File "C:\Python39\lib\xml\dom\minidom.py", line 1998, in parseString
    return expatbuilder.parseString(string)
  File "C:\Python39\lib\xml\dom\expatbuilder.py", line 925, in parseString
    return builder.parseString(string)
  File "C:\Python39\lib\xml\dom\expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 1, column 1

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Python39\Scripts\mapillary_tools.exe\__main__.py", line 7, in <module>
  File "C:\Python39\lib\site-packages\mapillary_tools\__main__.py", line 123, in main
    args.func(vars(args))
  File "C:\Python39\lib\site-packages\mapillary_tools\commands\process.py", line 231, in run
    process_geotag_properties(
  File "C:\Python39\lib\site-packages\mapillary_tools\process_geotag_properties.py", line 37, in process_geotag_properties
    return processing.geotag_from_gpx_file(
  File "C:\Python39\lib\site-packages\mapillary_tools\processing.py", line 270, in geotag_from_gpx_file
    gps_trace = get_lat_lon_time_from_gpx(geotag_source_path)
  File "C:\Python39\lib\site-packages\mapillary_tools\gps_parser.py", line 16, in get_lat_lon_time_from_gpx
    gpx = gpxpy.parse(f)
  File "C:\Python39\lib\site-packages\gpxpy\__init__.py", line 32, in parse
    return parser.parse()
  File "C:\Python39\lib\site-packages\gpxpy\parser.py", line 219, in parse
    raise mod_gpx.GPXXMLSyntaxException('Error parsing XML: %s' % str(e), e)
gpxpy.gpx.GPXXMLSyntaxException: Error parsing XML: not well-formed (invalid token): line 1, column 1

ptpt commented 2 years ago

@AnkEric could you check if it's fixed in https://github.com/mapillary/mapillary_tools/pull/446 (which upgrade the gpx library to latest) while you are extensively testing mapillary_tools ;)

AnkEric commented 2 years ago

> @AnkEric could you check if it's fixed in #446 (which upgrade the gpx library to latest) while you are extensively testing mapillary_tools ;)

I could, if I knew how to get this: "which upgrade the gpx library to latest".

It's not in my recent update:

python3 -m pip install --force-reinstall --upgrade git+https://github.com/mapillary/mapillary_tools

(Or: it's not fixed...)

ptpt commented 2 years ago

Could you check:

python3 -m pip show gpxpy
Name: gpxpy
Version: 1.4.2
Summary: GPX file parser and GPS track manipulation library
Home-page: https://github.com/tkrajina/gpxpy
Author: Tomo Krajina
...

It should be 1.4.2. If not, run:

python3 -m pip install -U gpxpy

AnkEric commented 2 years ago

e:_APPS\Mapillary\mapillary_tools>Python -m pip show gpxpy
Name: gpxpy
Version: 1.4.2
Summary: GPX file parser and GPS track manipulation library
Home-page: https://github.com/tkrajina/gpxpy

So: it's not fixed.

But Error Msg is different and shorter (see below).

If I remove the BOM from the [gpx file] before processing, then it's okay.

mapillary_tools process G:\2_mapillary\20210612_GPX_BOM_TEST --skip_subfolders --geotag_source_path G:\2_mapillary\20210612_GPX_BOM_TEST\Woerden-Soest_12-06-2021_51km_BOM.gpx --geotag_source gpx --interpolate_directions

Traceback (most recent call last):
  File "C:\Python39\lib\site-packages\gpxpy\parser.py", line 134, in parse
    root = mod_etree.XML(self.xml)
  File "C:\Python39\lib\xml\etree\ElementTree.py", line 1347, in XML
    parser.feed(text)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 1

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Python39\Scripts\mapillary_tools.exe\__main__.py", line 7, in <module>
  File "C:\Python39\lib\site-packages\mapillary_tools\__main__.py", line 123, in main
    args.func(vars(args))
  File "C:\Python39\lib\site-packages\mapillary_tools\commands\process.py", line 238, in run
    descs = process_geotag_properties(
  File "C:\Python39\lib\site-packages\mapillary_tools\process_geotag_properties.py", line 55, in process_geotag_properties
    geotag = geotag_from_gpx_file.GeotagFromGPXFile(
  File "C:\Python39\lib\site-packages\mapillary_tools\geotag\geotag_from_gpx_file.py", line 21, in __init__
    points = get_lat_lon_time_from_gpx(source_path)
  File "C:\Python39\lib\site-packages\mapillary_tools\geotag\geotag_from_gpx_file.py", line 33, in get_lat_lon_time_from_gpx
    gpx = gpxpy.parse(f)
  File "C:\Python39\lib\site-packages\gpxpy\__init__.py", line 39, in parse
    return parser.parse(version)
  File "C:\Python39\lib\site-packages\gpxpy\parser.py", line 146, in parse
    raise mod_gpx.GPXXMLSyntaxException('Error parsing XML: %s' % str(e), e)
gpxpy.gpx.GPXXMLSyntaxException: Error parsing XML: not well-formed (invalid token): line 1, column 1