mie-lab / trackintel

trackintel is a framework for spatio-temporal analysis of movement trajectory and mobility data.
MIT License

ENH: overlapping triplegs #607

Closed munterfi closed 6 months ago

munterfi commented 6 months ago

This PR introduces a new method overlap_staypoints for generating triplegs.

Although the default method between_staypoints is suitable for most cases, there are instances where the temporal resolution of the positionfixes is very coarse and the interval between them is irregular, resulting in large gaps between staypoints and triplegs. These gaps are not actually missing data in the record, but rather a consequence of the coarse temporal resolution. Such gaps can complicate further analysis.

Our use case involves GPS data of locomotives with positionfixes at irregular intervals, ranging from a few seconds to several hours when active, and possibly reducing to one positionfix per day when the locomotive is inactive for extended periods.

Adds

Approach

The logic follows the "between_staypoints" method, which already sets the "finished_at" time of the staypoint to the "tracked_at" timestamp of the following positionfix (which is the first positionfix of the next tripleg, and thereby also defines its "started_at" time). The "finished_at" time of the tripleg is now also extended to the "tracked_at" timestamp of the first positionfix of the next staypoint:

[Figure: generate_triplegs_methods]
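The timing difference between the two methods can be sketched in plain Python. This is only an illustration of the rule described above, not the code in this PR: the helper name `tripleg_intervals` and the `(tracked_at, staypoint_id)` tuple layout are assumptions made for the example, and the timestamps are taken from positionfixes 591 to 595.

```python
from datetime import datetime


def tripleg_intervals(fixes, method="between_staypoints"):
    """Return (started_at, finished_at) per tripleg.

    `fixes` is a time-ordered list of (tracked_at, staypoint_id) tuples,
    where staypoint_id is None for fixes that are not part of a staypoint.
    Hypothetical helper for illustration only.
    """
    intervals, run = [], []
    for tracked_at, staypoint_id in fixes:
        if staypoint_id is None:
            run.append(tracked_at)  # still moving: extend the current tripleg
        elif run:
            # A staypoint begins. With "overlap_staypoints" the tripleg is
            # extended to the first fix of that staypoint; otherwise it ends
            # at its own last fix.
            end = tracked_at if method == "overlap_staypoints" else run[-1]
            intervals.append((run[0], end))
            run = []
    if run:  # trailing tripleg without a following staypoint
        intervals.append((run[0], run[-1]))
    return intervals


fixes = [
    (datetime(2008, 10, 23, 9, 55, 16), None),
    (datetime(2008, 10, 23, 9, 55, 21), None),
    (datetime(2008, 10, 23, 9, 55, 26), None),
    (datetime(2008, 10, 23, 9, 55, 31), 0),  # first fix of staypoint 0
    (datetime(2008, 10, 23, 9, 55, 36), 0),
]

between = tripleg_intervals(fixes)
overlap = tripleg_intervals(fixes, method="overlap_staypoints")
```

With these fixes, the default method ends the tripleg at 09:55:26, while the overlapping variant ends it at 09:55:31, the timestamp of the first staypoint fix.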

To ensure a spatial overlap of the geometries, a new column "tripleg_id_geom" had to be introduced for grouping the points of the positionfixes into a linestring.

| id | elevation | tracked_at | user_id | accuracy | staypoint_id | tripleg_id | tripleg_id_geom | longitude | latitude |
|---|---|---|---|---|---|---|---|---|---|
| 591 | 44.196 | 2008-10-23 09:55:16+00:00 | 0 | | | 2 | 2 | 116.321781 | 40.009273 |
| 592 | 46.9392 | 2008-10-23 09:55:21+00:00 | 0 | | | 2 | 2 | 116.321871 | 40.009313 |
| 593 | 48.4632 | 2008-10-23 09:55:26+00:00 | 0 | | | 2 | 2 | 116.321952 | 40.009344 |
| 594 | 49.9872 | 2008-10-23 09:55:31+00:00 | 0 | | 0 | 2 | 2 | 116.322007 | 40.009356 |
| 595 | 49.6824 | 2008-10-23 09:55:36+00:00 | 0 | | 0 | | | 116.322103 | 40.009386 |
| 596 | 50.292 | 2008-10-23 09:55:41+00:00 | 0 | | 0 | | | 116.322121 | 40.009356 |
| 597 | 50.292 | 2008-10-23 09:55:46+00:00 | 0 | | 0 | | | 116.322135 | 40.009353 |
| 598 | 50.292 | 2008-10-23 09:55:51+00:00 | 0 | | 0 | | | 116.322152 | 40.009319 |
| 599 | 51.2064 | 2008-10-23 09:55:56+00:00 | 0 | | 0 | | | 116.322162 | 40.009394 |
| 600 | 51.2064 | 2008-10-23 09:56:01+00:00 | 0 | | 0 | | | 116.322179 | 40.009399 |
| 601 | 50.9016 | 2008-10-23 09:56:06+00:00 | 0 | | 0 | | | 116.32219 | 40.009344 |
| 602 | 50.5968 | 2008-10-23 09:56:11+00:00 | 0 | | 0 | | | 116.322177 | 40.009342 |
| 603 | 49.9872 | 2008-10-23 09:56:16+00:00 | 0 | | 0 | | | 116.32219 | 40.009318 |
| 604 | 49.3776 | 2008-10-23 09:56:21+00:00 | 0 | | 0 | | | 116.322206 | 40.009287 |
| 605 | 50.292 | 2008-10-23 09:56:26+00:00 | 0 | | 0 | | | 116.32222 | 40.00927 |
| 606 | 48.1584 | 2008-10-23 10:02:04+00:00 | 0 | | 0 | | | 116.321916 | 40.009351 |
| 607 | 47.8536 | 2008-10-23 10:02:09+00:00 | 0 | | 0 | | | 116.321838 | 40.009336 |
| 608 | 45.72 | 2008-10-23 10:02:14+00:00 | 0 | | 0 | | | 116.321811 | 40.009331 |
| 609 | 46.3296 | 2008-10-23 10:02:19+00:00 | 0 | | 0 | | | 116.321823 | 40.009314 |
| 610 | 46.0248 | 2008-10-23 10:02:24+00:00 | 0 | | 0 | | | 116.321833 | 40.009314 |
| 611 | 45.72 | 2008-10-23 10:02:29+00:00 | 0 | | 0 | | 3 | 116.32185 | 40.009316 |
| 612 | 42.672 | 2008-10-23 10:03:39+00:00 | 0 | | | 3 | 3 | 116.320888 | 40.009428 |
| 613 | 52.4256 | 2008-10-23 10:03:44+00:00 | 0 | | | 3 | 3 | 116.321493 | 40.008854 |
| 614 | 54.864 | 2008-10-23 10:03:49+00:00 | 0 | | | 3 | 3 | 116.321349 | 40.008848 |
614 54.864 2008-10-23 10:03:49+00:00 0 3 3 116.321349 40.008848

When a staypoint consists of only one positionfix, the previous tripleg will have a spatial overlap with that staypoint. However, the following tripleg will not spatially overlap with the staypoint. Otherwise, duplicating the positionfix of the staypoint would be necessary.

| id | elevation | tracked_at | user_id | accuracy | staypoint_id | tripleg_id | tripleg_id_geom | longitude | latitude |
|---|---|---|---|---|---|---|---|---|---|
| 2683 | 66.4464 | 2008-10-24 00:14:47+00:00 | 1 | | | 13 | 13 | 116.324841 | 39.978782 |
| 2684 | 63.0936 | 2008-10-24 00:14:52+00:00 | 1 | | | 13 | 13 | 116.324875 | 39.978813 |
| 2685 | 62.7888 | 2008-10-24 00:14:57+00:00 | 1 | | | 13 | 13 | 116.325005 | 39.978826 |
| 2686 | 62.7888 | 2008-10-24 00:15:00+00:00 | 1 | | 6 | 13 | 13 | 116.325174 | 39.978897 |
| 2687 | 61.2648 | 2008-10-24 00:20:39+00:00 | 1 | | | 14 | 14 | 116.325483 | 39.97896 |
| 2688 | 61.2648 | 2008-10-24 00:20:44+00:00 | 1 | | | 14 | 14 | 116.325549 | 39.979008 |
| 2689 | 61.2648 | 2008-10-24 00:20:49+00:00 | 1 | | | 14 | 14 | 116.325562 | 39.979019 |
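The assignment rule, including the single-positionfix case, can be pictured with the following sketch. It is one possible way to express the behaviour described here, not the PR's implementation; the function name `assign_geom_ids` and the `first_id` parameter are hypothetical. Because each positionfix holds a single geometry id, a fix already claimed by the previous tripleg is not handed to the next one.

```python
def assign_geom_ids(staypoint_ids, first_id=0):
    """Assign a geometry id per fix for a time-ordered list of staypoint ids.

    `staypoint_ids` holds None for moving fixes and an id for staypoint fixes.
    Hypothetical helper for illustration only.
    """
    geom = [None] * len(staypoint_ids)
    tripleg = first_id - 1
    prev_moving = False
    for i, staypoint_id in enumerate(staypoint_ids):
        moving = staypoint_id is None
        if moving:
            if not prev_moving:
                tripleg += 1  # a new tripleg starts here
                # Pull in the last fix of the preceding staypoint, unless the
                # previous tripleg already claimed it (single-fix staypoint).
                if i > 0 and geom[i - 1] is None:
                    geom[i - 1] = tripleg
            geom[i] = tripleg
        elif prev_moving:
            # First fix of the following staypoint closes the tripleg geometry.
            geom[i] = tripleg
        prev_moving = moving
    return geom


# Staypoint 6 consists of one fix only (id 2686 above).
single = assign_geom_ids([None, None, None, 6, None, None, None], first_id=13)

# Staypoint 0 spans several fixes (shortened version of ids 591 to 614).
multi = assign_geom_ids([None, None, None, 0, 0, 0, 0, None, None], first_id=2)
```

In the single-fix case the staypoint fix ends up with id 13 only, so tripleg 14 starts at the next fix and does not overlap the staypoint.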

Note: I have not explored how selecting the "overlap_staypoints" method impacts the later processing and analysis phases in the trackintel framework.

codecov[bot] commented 6 months ago

Codecov Report

Attention: Patch coverage is 98.93048%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 93.44%. Comparing base (7ee696e) to head (0524ef1).

| Files | Patch % | Lines |
|---|---|---|
| trackintel/preprocessing/positionfixes.py | 98.93% | 0 Missing and 2 partials :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #607      +/-   ##
==========================================
+ Coverage   93.40%   93.44%   +0.04%
==========================================
  Files          33       33
  Lines        2061     2076      +15
  Branches      364      367       +3
==========================================
+ Hits         1925     1940      +15
  Misses        126      126
  Partials       10       10
```


munterfi commented 6 months ago

Thanks for the feedback!

The reason tripleg_id_geom was introduced is that the staypoint ends at the first positionfix of the next tripleg, but the geometry has to include the last positionfix of the staypoint to ensure overlap.

So depending on whether the temporal or the spatial perspective is considered, the tripleg ids would be assigned differently. Unfortunately, using the timestamp of the staypoint's last positionfix as the started_at time of the tripleg would result in a negative duration between staypoint end and tripleg start. We could fix this by altering the end times of the already generated staypoints, but I am not sure whether this complies with the intended process of the framework.

As a solution, the temporal tripleg_id is now replaced with the spatial tripleg_id_geom.
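As a small worked example of the negative duration mentioned above, take the timestamps of positionfixes 611 (last fix of staypoint 0) and 612 (first fix of the following tripleg) from the PR description. The variable names are made up for this illustration, and the "alternative" start time is the hypothetical choice discussed above, not what the code does.

```python
from datetime import datetime, timezone

last_fix_of_staypoint = datetime(2008, 10, 23, 10, 2, 29, tzinfo=timezone.utc)  # id 611
first_fix_of_tripleg = datetime(2008, 10, 23, 10, 3, 39, tzinfo=timezone.utc)   # id 612

# The staypoint's finished_at is the tracked_at of the first fix of the next tripleg.
staypoint_finished_at = first_fix_of_tripleg

# Hypothetical alternative: start the tripleg at the staypoint's last fix.
tripleg_started_at = last_fix_of_staypoint

# Duration between staypoint end and tripleg start.
gap_seconds = (tripleg_started_at - staypoint_finished_at).total_seconds()
```

Here `gap_seconds` comes out as -70.0: the tripleg would start 70 seconds before the staypoint ends.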

hongyeehh commented 6 months ago

Thanks for the PR! I like the idea of overlapping tripleg generation and think the current code is quite well structured.

We need to extend the docstrings to explain our definition of a tripleg's time and geometry, and add more test cases for the new method. But we can do this in a separate PR, as this one is already quite heavy.

Thanks for contributing to trackintel. I will merge this now!