martin-ueding / geo-activity-playground

Data analysis and visualization based on GPS tracked outdoor activities.
https://martin-ueding.github.io/geo-activity-playground/
MIT License

AttributeError: Can only use .dt accessor with datetimelike values. Did you mean: 'at'? #26

Closed: pstorch closed this issue 6 months ago

pstorch commented 7 months ago

I tried to read .gpx files from OpenTracks. There is the possibility to export all tracks in one file. Then I get this error on startup:

2023-11-26 11:51:59 geo_activity_playground.importers.directory INFO Didn't find a metadata file.
2023-11-26 11:51:59 geo_activity_playground.importers.directory INFO Parsing activity file Activities/OpenTracks-Backup.gpx …
Traceback (most recent call last):
  File "/home/peter/.local/bin/geo-activity-playground", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/peter/.local/pipx/venvs/geo-activity-playground/lib/python3.11/site-packages/geo_activity_playground/__main__.py", line 71, in main
    options.func(options)
  File "/home/peter/.local/pipx/venvs/geo-activity-playground/lib/python3.11/site-packages/geo_activity_playground/__main__.py", line 58, in <lambda>
    func=lambda options: webui_main(make_activity_repository(options.basedir))
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/peter/.local/pipx/venvs/geo-activity-playground/lib/python3.11/site-packages/geo_activity_playground/__main__.py", line 78, in make_activity_repository
    import_from_directory()
  File "/home/peter/.local/pipx/venvs/geo-activity-playground/lib/python3.11/site-packages/geo_activity_playground/importers/directory.py", line 36, in import_from_directory
    timeseries = read_activity(path)
                 ^^^^^^^^^^^^^^^^^^^
  File "/home/peter/.local/pipx/venvs/geo-activity-playground/lib/python3.11/site-packages/geo_activity_playground/core/activity_parsers.py", line 93, in read_activity
    df.time = df.time.dt.tz_convert(None)
              ^^^^^^^^^^
  File "/home/peter/.local/pipx/venvs/geo-activity-playground/lib/python3.11/site-packages/pandas/core/generic.py", line 6204, in __getattr__
    return object.__getattribute__(self, name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/peter/.local/pipx/venvs/geo-activity-playground/lib/python3.11/site-packages/pandas/core/accessor.py", line 224, in __get__
    accessor_obj = self._accessor(obj)
                   ^^^^^^^^^^^^^^^^^^^
  File "/home/peter/.local/pipx/venvs/geo-activity-playground/lib/python3.11/site-packages/pandas/core/indexes/accessors.py", line 608, in __new__
    raise AttributeError("Can only use .dt accessor with datetimelike values")
AttributeError: Can only use .dt accessor with datetimelike values. Did you mean: 'at'?

I don't know if it is really the "multiple tracks" file which causes this error or if just one of the many tracks I exported is responsible for it.

martin-ueding commented 7 months ago

Yes, the GPX format supports multiple tracks with multiple segments. So OpenTracks exports each activity as one track? Are there names for activities included somewhere?

At the moment I just load all points from all segments from all tracks:

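import pathlib

import gpxpy
import pandas as pd


def read_gpx_activity(path: pathlib.Path, open) -> pd.DataFrame:
    # `open` is supplied by the caller and is used instead of the builtin.
    points = []
    with open(path) as f:
        gpx = gpxpy.parse(f)
        # Flatten all tracks and all segments into one list of points.
        for track in gpx.tracks:
            for segment in track.segments:
                for point in segment.points:
                    points.append((point.time, point.latitude, point.longitude))

    return pd.DataFrame(points, columns=["time", "latitude", "longitude"])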

So that part should be fine, you end up with one huge activity. It wouldn't be what you want, but it should work.

It is a bit strange that it crashes. I'd need to take a look at the data types. If you want, you could send me your GPX file at mu@martin-ueding.de and I'll take a look. If you don't want to send me your whole activity database, that's fine as well. I've made a small change to the code so that we get a bit more debug output. Just do a git pull and try again. There should then be some additional output showing the parsed data as a table and the data types of the columns. I would be interested in what the time column is marked as.

pstorch commented 7 months ago

> So OpenTracks exports each activity as one track?

You can configure it. Either export all activities separately or all in one file.

> Are there names for activities included somewhere?

Yes, it's in the "type" element and once more in a custom element "opentracks:typeTranslated" with a translated name of the activity. Both are the same here because my system language is English.

<trk>
<name><![CDATA[2022-07-26T07:54+02]]></name>
<desc><![CDATA[]]></desc>
<type><![CDATA[running]]></type>
<extensions>
<topografix:color>c0c0c0</topografix:color>
<opentracks:trackid>33a42aa4-d022-412e-a8b8-412ab9c8d187</opentracks:trackid>
<opentracks:typeTranslated><![CDATA[running]]></opentracks:typeTranslated>
<gpxtrkx:TrackStatsExtension>
<gpxtrkx:Distance>11203.861328125</gpxtrkx:Distance>
<gpxtrkx:TimerTime>5115</gpxtrkx:TimerTime>
<gpxtrkx:MovingTime>5001</gpxtrkx:MovingTime>
<gpxtrkx:StoppedTime>114</gpxtrkx:StoppedTime>
<gpxtrkx:MaxSpeed>2.5</gpxtrkx:MaxSpeed>
<gpxtrkx:Ascent>111.0</gpxtrkx:Ascent>
<gpxtrkx:Descent>111.0</gpxtrkx:Descent>
</gpxtrkx:TrackStatsExtension>
</extensions>
<trkseg>
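As a rough illustration of how the name and type could be read per track, here is a minimal sketch using only the Python standard library. It assumes the file uses the GPX 1.1 namespace; the helper name `list_track_metadata` and the sample document are made up for this example and are not part of geo-activity-playground.

```python
import xml.etree.ElementTree as ET

# Assumption: the export declares the GPX 1.1 default namespace.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def list_track_metadata(gpx_text: str) -> list[dict]:
    """Return the name and type of every <trk> element (hypothetical helper)."""
    root = ET.fromstring(gpx_text)
    tracks = []
    for trk in root.findall("gpx:trk", GPX_NS):
        tracks.append(
            {
                # CDATA sections are returned as plain text by the parser.
                "name": trk.findtext("gpx:name", default="", namespaces=GPX_NS),
                "type": trk.findtext("gpx:type", default="", namespaces=GPX_NS),
            }
        )
    return tracks


# Shortened, made-up sample modelled on the excerpt above.
SAMPLE = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="example">
<trk>
<name><![CDATA[2022-07-26T07:54+02]]></name>
<type><![CDATA[running]]></type>
<trkseg></trkseg>
</trk>
</gpx>"""

print(list_track_metadata(SAMPLE))
```

With one entry per track, each track could in principle become its own activity instead of being merged into a single one.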

> So that part should be fine, you end up with one huge activity. It wouldn't be what you want, but it should work.

That's fine. I'll export all activities separately now. This was just a test. The .gpx file also gets very large (~150 MB).

If you still want to dig into this error, here is the new log. I anonymized the location a bit: xxxxxx and yyyyyy were normal digits.

2023-11-26 14:40:18 geo_activity_playground.core.config WARNING Missing a config, some features might be missing.
2023-11-26 14:40:18 geo_activity_playground.importers.directory INFO Loading metadata file …
2023-11-26 14:40:18 geo_activity_playground.importers.directory INFO Parsing activity file Activities/OpenTracks-Backup.gpx …
                                    time   latitude  longitude
0       2022-07-26 07:54:58.610000+02:00  50.xxxxxx   9.yyyyyy
1       2022-07-26 07:55:02.577000+02:00  50.xxxxxx   9.yyyyyy
2       2022-07-26 07:55:04.610000+02:00  50.xxxxxx   9.yyyyyy
3       2022-07-26 07:55:06.645000+02:00  50.xxxxxx   9.yyyyyy
4       2022-07-26 07:55:07.648000+02:00  50.xxxxxx   9.yyyyyy
...                                  ...        ...        ...
543494  2023-11-19 14:47:22.174000+01:00  50.xxxxxx   9.yyyyyy
543495  2023-11-19 14:47:27.174000+01:00  50.xxxxxx   9.yyyyyy
543496  2023-11-19 14:47:32.170000+01:00  50.xxxxxx   9.yyyyyy
543497  2023-11-19 14:47:41.153000+01:00  50.xxxxxx   9.yyyyyy
543498  2023-11-19 14:47:41.164000+01:00  50.xxxxxx   9.yyyyyy

[543499 rows x 3 columns]
time          object
latitude     float64
longitude    float64
dtype: object
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/__main__.py", line 71, in main
    options.func(options)
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/__main__.py", line 58, in <lambda>
    func=lambda options: webui_main(make_activity_repository(options.basedir))
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/__main__.py", line 78, in make_activity_repository
    import_from_directory()
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/importers/directory.py", line 36, in import_from_directory
    timeseries = read_activity(path)
                 ^^^^^^^^^^^^^^^^^^^
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/core/activity_parsers.py", line 94, in read_activity
    df.time = df.time.dt.tz_convert(None)
              ^^^^^^^^^^
  File "/home/peter/.cache/pypoetry/virtualenvs/geo-activity-playground-uIcfKzN9-py3.11/lib/python3.11/site-packages/pandas/core/generic.py", line 6204, in __getattr__
    return object.__getattribute__(self, name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/peter/.cache/pypoetry/virtualenvs/geo-activity-playground-uIcfKzN9-py3.11/lib/python3.11/site-packages/pandas/core/accessor.py", line 224, in __get__
    accessor_obj = self._accessor(obj)
                   ^^^^^^^^^^^^^^^^^^^
  File "/home/peter/.cache/pypoetry/virtualenvs/geo-activity-playground-uIcfKzN9-py3.11/lib/python3.11/site-packages/pandas/core/indexes/accessors.py", line 608, in __new__
    raise AttributeError("Can only use .dt accessor with datetimelike values")
AttributeError: Can only use .dt accessor with datetimelike values. Did you mean: 'at'?

martin-ueding commented 7 months ago

Extracting the metadata is something I've put into a separate ticket: https://github.com/martin-ueding/geo-activity-playground/issues/32

The time thing might be fixed on the master version now. I've added a new dependency. Can you do a git pull and a poetry install and try again?

pstorch commented 7 months ago

It looks like this now:


2023-11-26 20:43:26 geo_activity_playground.core.config WARNING Missing a config, some features might be missing.
2023-11-26 20:43:26 geo_activity_playground.importers.directory INFO Loading metadata file …
2023-11-26 20:43:26 geo_activity_playground.importers.directory INFO Parsing activity file Activities/OpenTracks-Backup.gpx …
                                    time   latitude  longitude
0       2022-07-26 07:54:58.610000+02:00  50.xxxxxx   9.yyyyyy
1       2022-07-26 07:55:02.577000+02:00  50.xxxxxx   9.yyyyyy
2       2022-07-26 07:55:04.610000+02:00  50.xxxxxx   9.yyyyyy
3       2022-07-26 07:55:06.645000+02:00  50.xxxxxx   9.yyyyyy
4       2022-07-26 07:55:07.648000+02:00  50.xxxxxx   9.yyyyyy
...                                  ...        ...        ...
543494  2023-11-19 14:47:22.174000+01:00  50.xxxxxx   9.yyyyyy
543495  2023-11-19 14:47:27.174000+01:00  50.xxxxxx   9.yyyyyy
543496  2023-11-19 14:47:32.170000+01:00  50.xxxxxx   9.yyyyyy
543497  2023-11-19 14:47:41.153000+01:00  50.xxxxxx   9.yyyyyy
543498  2023-11-19 14:47:41.164000+01:00  50.xxxxxx   9.yyyyyy

[543499 rows x 3 columns]
time          object
latitude     float64
longitude    float64
dtype: object
2023-11-26 20:43:55 geo_activity_playground.importers.directory ERROR Error while parsing file Activities/OpenTracks-Backup.gpx:
Traceback (most recent call last):
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/core/activity_parsers.py", line 157, in read_activity
    if df.time.dt.tz is not None:
       ^^^^^^^^^^
  File "/home/peter/.cache/pypoetry/virtualenvs/geo-activity-playground-uIcfKzN9-py3.11/lib/python3.11/site-packages/pandas/core/generic.py", line 6204, in __getattr__
    return object.__getattribute__(self, name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/peter/.cache/pypoetry/virtualenvs/geo-activity-playground-uIcfKzN9-py3.11/lib/python3.11/site-packages/pandas/core/accessor.py", line 224, in __get__
    accessor_obj = self._accessor(obj)
                   ^^^^^^^^^^^^^^^^^^^
  File "/home/peter/.cache/pypoetry/virtualenvs/geo-activity-playground-uIcfKzN9-py3.11/lib/python3.11/site-packages/pandas/core/indexes/accessors.py", line 608, in __new__
    raise AttributeError("Can only use .dt accessor with datetimelike values")
AttributeError: Can only use .dt accessor with datetimelike values

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/importers/directory.py", line 39, in import_from_directory
    timeseries = read_activity(path)
                 ^^^^^^^^^^^^^^^^^^^
  File "/home/peter/git/geo-activity-playground/geo_activity_playground/core/activity_parsers.py", line 162, in read_activity
    raise ActivityParseError(
geo_activity_playground.core.activity_parsers.ActivityParseError: It looks like the date parsing has gone wrong.
2023-11-26 20:43:55 geo_activity_playground.importers.directory WARNING There were errors while parsing some of the files. These were skipped and tried again next time.
2023-11-26 20:43:55 geo_activity_playground.importers.directory ERROR Activities/OpenTracks-Backup.gpx: It looks like the date parsing has gone wrong.
 * Serving Flask app 'geo_activity_playground.webui.app'
 * Debug mode: off
2023-11-26 20:43:55 werkzeug INFO WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000
2023-11-26 20:43:55 werkzeug INFO Press CTRL+C to quit

pstorch commented 7 months ago

Maybe it's a memory issue. This one big file contains 1027776 trackpoints, but the log above only reports 543499 rows. I guess it's truncated somewhere.

As this is not a useful use case for geo-activity-playground, feel free to close this ticket as "won't fix". I'll go with the individual files (one file per activity). But maybe it would be good to limit the number of trackpoints parsed and warn the user that the file is too big.
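The suggested limit could look roughly like the sketch below. The function `limit_points` and the value of `MAX_POINTS` are hypothetical and only illustrate the idea; nothing like this is confirmed to exist in the project.

```python
import logging
from typing import Iterable, Iterator, TypeVar

logger = logging.getLogger(__name__)
T = TypeVar("T")

# Hypothetical limit, chosen only for illustration.
MAX_POINTS = 500_000


def limit_points(points: Iterable[T], max_points: int = MAX_POINTS) -> Iterator[T]:
    """Yield at most max_points items and warn once if more are available."""
    for index, point in enumerate(points):
        if index >= max_points:
            logger.warning(
                "File has more than %d trackpoints, the rest is skipped.",
                max_points,
            )
            return
        yield point
```

Wrapping the point loop of a parser in such a generator would stop reading early without changing the rest of the code.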

martin-ueding commented 7 months ago

The issue lies in the date formats in the file. Either all the dates are in some unexpected format, or some are time-zone aware and some are not.
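One way to guard against mixed timestamps is to normalize every value before it reaches the data frame. The log above shows both +02:00 and +01:00 offsets, and a column whose values carry different offsets (or a mix of aware and naive values) is typically stored by pandas with dtype object, which has no .dt accessor. The sketch below uses only the standard library. The helper `to_naive_utc` is hypothetical, treating naive timestamps as UTC is an assumption, and this is not necessarily how the project fixed the problem.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(timestamp: datetime) -> datetime:
    """Convert a timestamp to naive UTC (hypothetical helper).

    Assumption: a timestamp without time zone information is already UTC.
    """
    if timestamp.tzinfo is not None:
        timestamp = timestamp.astimezone(timezone.utc)
    return timestamp.replace(tzinfo=None)


# Made-up values that mimic the offsets seen in the log.
summer = datetime(2022, 7, 26, 7, 54, 58, tzinfo=timezone(timedelta(hours=2)))
winter = datetime(2023, 11, 19, 14, 47, 22, tzinfo=timezone(timedelta(hours=1)))
naive = datetime(2023, 11, 19, 13, 47, 22)

for t in (summer, winter, naive):
    print(to_naive_utc(t))
```

Applying such a function to each `point.time` while collecting the points should give pandas a uniform column. Converting the finished column with `pd.to_datetime(..., utc=True)` might be an alternative.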

Does it work with the individual files?

pstorch commented 7 months ago

@martin-ueding I've sent you the file by email.

martin-ueding commented 6 months ago

I've fixed the date loading, the files work now.