martin-ueding / geo-activity-playground

Data analysis and visualization based on GPS tracked outdoor activities.
https://martin-ueding.github.io/geo-activity-playground/
MIT License
27 stars, 12 forks

Feature request: skip erroneous .gpx files #29

Status: Closed (pstorch closed 7 months ago)

pstorch commented 7 months ago

Sorry to bombard you with issues. I've now tried single .gpx files exported from OpenTracks (250 files). Some of them have problems, and I still have to investigate why. For now I have to start geo-activity-playground, look at the error, move the erroneous file out of the activities directory, and start again until the next file fails.

Can geo-activity-playground skip erroneous files? It should still log them, but start anyway.

2023-11-26 14:53:37 geo_activity_playground.core.config WARNING Missing a config, some features might be missing.
2023-11-26 14:53:37 geo_activity_playground.importers.directory INFO Loading metadata file …
2023-11-26 14:53:37 geo_activity_playground.importers.directory INFO Parsing activity file Activities/2020-06-25_17_02_40_2020-06-25 19_02.gpx …
Traceback (most recent call last):
  File "/home/peter/.cache/pypoetry/virtualenvs/geo-activity-playground-uIcfKzN9-py3.11/lib/python3.11/site-packages/gpxpy/parser.py", line 134, in parse
    root = mod_etree.XML(self.xml)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/xml/etree/ElementTree.py", line 1338, in XML
    parser.feed(text)
xml.etree.ElementTree.ParseError: junk after document element: line 8931, column 0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/__main__.py", line 71, in main
    options.func(options)
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/__main__.py", line 58, in <lambda>
    func=lambda options: webui_main(make_activity_repository(options.basedir))
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/__main__.py", line 78, in make_activity_repository
    import_from_directory()
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/importers/directory.py", line 36, in import_from_directory
    timeseries = read_activity(path)
                 ^^^^^^^^^^^^^^^^^^^
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/core/activity_parsers.py", line 87, in read_activity
    df = read_gpx_activity(path, open)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/core/activity_parsers.py", line 68, in read_gpx_activity
    gpx = gpxpy.parse(f)
          ^^^^^^^^^^^^^^
  File "/home/peter/.cache/pypoetry/virtualenvs/geo-activity-playground-uIcfKzN9-py3.11/lib/python3.11/site-packages/gpxpy/__init__.py", line 39, in parse
    return parser.parse(version)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/peter/.cache/pypoetry/virtualenvs/geo-activity-playground-uIcfKzN9-py3.11/lib/python3.11/site-packages/gpxpy/parser.py", line 146, in parse
    raise mod_gpx.GPXXMLSyntaxException(f'Error parsing XML: {e}', e)
gpxpy.gpx.GPXXMLSyntaxException: Error parsing XML: junk after document element: line 8931, column 0

pstorch commented 7 months ago

28 of my 250 files had this error.

Nice: after fixing them, geo-activity-playground just read those files and added them, without reparsing all the others :+1:
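For anyone hitting the same `junk after document element` error, a quick pre-check can list all malformed files in one pass instead of finding them one restart at a time. This is only a sketch using the standard library's `xml.etree.ElementTree` (the parser that raised the error in the traceback above). The function name `find_malformed_xml` and the `Activities` directory in the usage line are assumptions for illustration, and the check only covers XML well-formedness, not whether the file is valid GPX.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed_xml(directory: Path, pattern: str = "*.gpx") -> dict[Path, str]:
    """Map each file that fails XML parsing to its parser error message.

    Hypothetical helper, not part of geo-activity-playground.
    """
    failures = {}
    for path in sorted(directory.glob(pattern)):
        try:
            # Only checks that the file is well-formed XML.
            ET.parse(path)
        except ET.ParseError as e:
            failures[path] = str(e)
    return failures


if __name__ == "__main__":
    # "Activities" is an assumed directory name; adjust to your setup.
    for path, message in find_malformed_xml(Path("Activities")).items():
        print(f"{path}: {message}")
```

Each reported message includes the line and column where parsing stopped, which is usually where to start looking in the file.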

martin-ueding commented 7 months ago

That is a very sensible suggestion. Now the output is this:

2023-11-26 17:34:54 geo_activity_playground.importers.directory INFO Loading metadata file …
2023-11-26 17:34:54 geo_activity_playground.importers.directory INFO Parsing activity file Activities/9876448717.tcx.gz …
2023-11-26 17:34:54 geo_activity_playground.importers.directory ERROR Error while parsing file Activities/9876448717.tcx.gz:
Traceback (most recent call last):
  File "/home/mu/Projekte/geo-activity-playground/geo_activity_playground/importers/directory.py", line 38, in import_from_directory
    timeseries = read_activity(path)
  File "/home/mu/Projekte/geo-activity-playground/geo_activity_playground/core/activity_parsers.py", line 85, in read_activity
    raise NotImplementedError(f"Unknown suffix: {path}")
NotImplementedError: Unknown suffix: Activities/9876448717.tcx.gz
2023-11-26 17:34:54 geo_activity_playground.importers.directory INFO Parsing activity file Activities/9890487249.tcx …
2023-11-26 17:34:54 geo_activity_playground.importers.directory ERROR Error while parsing file Activities/9890487249.tcx:
Traceback (most recent call last):
  File "/home/mu/Projekte/geo-activity-playground/geo_activity_playground/importers/directory.py", line 38, in import_from_directory
    timeseries = read_activity(path)
  File "/home/mu/Projekte/geo-activity-playground/geo_activity_playground/core/activity_parsers.py", line 91, in read_activity
    raise NotImplementedError(f"Unknown suffix: {path}")
NotImplementedError: Unknown suffix: Activities/9890487249.tcx
2023-11-26 17:34:54 geo_activity_playground.importers.directory WARNING There were errors while parsing some of the files. These were skipped and tried again next time.
2023-11-26 17:34:54 geo_activity_playground.importers.directory ERROR Activities/9876448717.tcx.gz: Unknown suffix: Activities/9876448717.tcx.gz
2023-11-26 17:34:54 geo_activity_playground.importers.directory ERROR Activities/9890487249.tcx: Unknown suffix: Activities/9890487249.tcx
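
The skip-and-log behaviour visible in this output can be sketched roughly as follows. This is not the project's actual code: `import_skipping_errors` and the `parse` callback are hypothetical names, and catching the broad `Exception` is an assumption made so that both parser errors and `NotImplementedError` are handled the same way.

```python
import logging
from pathlib import Path
from typing import Any, Callable, Iterable

logger = logging.getLogger(__name__)


def import_skipping_errors(
    paths: Iterable[Path], parse: Callable[[Path], Any]
) -> tuple[dict[Path, Any], list[tuple[Path, str]]]:
    """Parse every path, skipping files that raise, and summarise the failures.

    Hypothetical sketch; `parse` stands in for whatever reads one activity file.
    """
    parsed: dict[Path, Any] = {}
    errors: list[tuple[Path, str]] = []
    for path in paths:
        logger.info("Parsing activity file %s …", path)
        try:
            parsed[path] = parse(path)
        except Exception as e:  # assumption: any failure means "skip this file"
            # logger.exception logs at ERROR level and appends the traceback.
            logger.exception("Error while parsing file %s:", path)
            errors.append((path, str(e)))
    if errors:
        logger.warning("Some files could not be parsed and were skipped.")
        for path, message in errors:
            logger.error("%s: %s", path, message)
    return parsed, errors
```

Collecting the errors and repeating them in a short summary at the end keeps the failed files easy to spot even when the tracebacks scroll by among hundreds of successful imports.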

pstorch commented 7 months ago

Thanks :+1: