mie-lab / trackintel

trackintel is a framework for spatio-temporal analysis of movement trajectory and mobility data.
MIT License

Add optional keyword argument to __create_new_staypoints #619

Open hengoren opened 4 months ago

hengoren commented 4 months ago

Hello Trackintel team,

My team is currently using your library to detect staypoints from positionfixes and we have made a local patch that we would like to submit as a feature of the trackintel library.

Enhancement Description

We propose adding a new keyword argument, "exact" (would be open to alternate names), to the __create_new_staypoints function. This argument would change the behavior of how new staypoints are created and the geometry that is assigned to new staypoints.

Current Behavior

Currently, a new staypoint is created at the centroid of the unary_union of all positionfixes when a planar projection system is defined; otherwise angle_centroid_multipoints is used.

Proposed Behavior with "exact" flag

When the "exact" flag is set to True, a new staypoint would be placed at the mode (the most frequent coordinates) of all positionfixes, rather than at the centroid.
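
To make the difference concrete, here is a minimal standard-library sketch comparing the two choices on a handful of made-up fixes. It does not use trackintel: the helper names and the sample coordinates are hypothetical, and a plain coordinate mean stands in for the library's centroid computation. It also shows that "mode" can be read two ways, as the most frequent (x, y) pair or as the mode of each axis taken independently, and that the two can disagree.

```python
from collections import Counter
from statistics import mean, mode

# Hypothetical noiseless fixes: the agent dwells at (2.0, 3.0) and passes
# through two other locations on the way in and out.
fixes = [(0.0, 0.0), (2.0, 3.0), (2.0, 3.0), (2.0, 3.0), (5.0, 1.0)]


def centroid(points):
    """Arithmetic mean of the coordinates (stand-in for a planar centroid)."""
    xs, ys = zip(*points)
    return (mean(xs), mean(ys))


def joint_mode(points):
    """Most frequent (x, y) pair."""
    return Counter(points).most_common(1)[0][0]


def per_axis_mode(points):
    """Mode of x and mode of y, each taken independently."""
    xs, ys = zip(*points)
    return (mode(xs), mode(ys))


print(centroid(fixes))       # (2.2, 2.0): not a location the agent visited
print(joint_mode(fixes))     # (2.0, 3.0): the dwell location
print(per_axis_mode(fixes))  # (2.0, 3.0): agrees with the joint mode here

# The two readings of "mode" can differ: x=1.0 and y=2.0 are each the most
# frequent value on their axis, but (1.0, 2.0) occurs only once.
tricky = [(1.0, 1.0), (1.0, 1.0), (1.0, 2.0), (3.0, 2.0), (4.0, 2.0)]
print(joint_mode(tricky))     # (1.0, 1.0)
print(per_axis_mode(tricky))  # (1.0, 2.0)
```

Which reading is intended is a design decision for the proposal; the per-axis variant is cheaper to compute on a coordinate array, while the joint variant always returns a location that was actually recorded.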

Rationale

We are currently working with noiseless simulated data. Because the data is noiseless, the staypoint locations we detect using the mode are the exact locations of the stays. For our downstream algorithms, knowing the precise location an agent stays at is beneficial.

Example

We made the change against an earlier version of trackintel (1.2.4). Here is a code sample (the exact branch obviously isn't compatible with check_gdf_planar, which wasn't relevant in our use case):

# Assumed imports for the aliases used below. check_gdf_planar and
# angle_centroid_multipoints are trackintel helpers that are expected to be
# imported already in the module this function lives in.
import numpy as np
from scipy.stats import mode as sp_mode
from shapely import get_coordinates as shp_get_coordinates
from shapely.geometry import Point


def __create_new_staypoints(start, end, pfs, elevation_flag, geo_col, last_flag=False, exact=True):
    """Create a staypoint with relevant information from start to end pfs."""
    new_sp = {}

    # Here we consider pfs[end] time for stp 'finished_at', but only include
    # pfs[end - 1] for stp geometry and pfs linkage.
    new_sp["started_at"] = pfs["tracked_at"].iloc[start]
    new_sp["finished_at"] = pfs["tracked_at"].iloc[end]

    # if end is the last pfs, we want to include the info from it as well
    if last_flag:
        end = len(pfs)

    if exact:
        # Per-axis mode of the coordinates (x and y are evaluated independently).
        xy = shp_get_coordinates(pfs[geo_col].iloc[start:end])
        # np.ravel copes with SciPy versions that keep the reduced axis.
        xym = np.ravel(sp_mode(xy, axis=0)[0])
        new_sp[geo_col] = Point(xym[0], xym[1])
    else:
        points = pfs[geo_col].iloc[start:end].unary_union
        if check_gdf_planar(pfs):
            new_sp[geo_col] = points.centroid
        else:
            new_sp[geo_col] = angle_centroid_multipoints(points)[0]

    # Elevation and pfs linkage are filled in for both modes.
    if elevation_flag:
        new_sp["elevation"] = pfs["elevation"].iloc[start:end].median()
    new_sp["pfs_id"] = pfs.index[start:end].to_list()

    return new_sp

hongyeehh commented 4 months ago

Hi, and thanks for your interest in trackintel!

Could you provide more context on why you would like to "detect" staypoints if the data is noiseless? Concretely, what is the reason for the spatial aggregation?

A staypoint was originally defined as a set of spatially and temporally aggregated records. The spatial component is necessary because of recording noise, which makes exact coordinates unobtainable. If the data is noiseless, one can simply aggregate based on time, which means passing 0 (or a small value) to dist_threshold.
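
As a rough illustration of what aggregating on time alone amounts to, the sketch below groups consecutive fixes with identical coordinates and keeps the runs that last long enough. It is standard-library Python only and does not call trackintel; the function name, the tuple layout, and the five-minute threshold are assumptions made for the example. The duration is measured from the first to the last fix of a run, which is a simplification.

```python
from datetime import datetime, timedelta
from itertools import groupby


def time_only_staypoints(fixes, time_threshold=timedelta(minutes=5)):
    """Group consecutive fixes with identical coordinates into stays.

    `fixes` is a time-ordered list of (timestamp, x, y) tuples. A run of
    identical coordinates is kept when the time between its first and last
    fix is at least `time_threshold`.
    """
    stays = []
    for (x, y), run in groupby(fixes, key=lambda f: (f[1], f[2])):
        run = list(run)
        started_at, finished_at = run[0][0], run[-1][0]
        if finished_at - started_at >= time_threshold:
            stays.append(
                {"started_at": started_at, "finished_at": finished_at, "x": x, "y": y}
            )
    return stays


# Hypothetical noiseless track: one dwell at (2.0, 3.0) lasting eight minutes.
t0 = datetime(2024, 1, 1, 8, 0)
fixes = [
    (t0, 0.0, 0.0),
    (t0 + timedelta(minutes=1), 2.0, 3.0),
    (t0 + timedelta(minutes=4), 2.0, 3.0),
    (t0 + timedelta(minutes=9), 2.0, 3.0),
    (t0 + timedelta(minutes=10), 5.0, 1.0),
]

stays = time_only_staypoints(fixes)
for stay in stays:
    print(stay["x"], stay["y"], stay["finished_at"] - stay["started_at"])
```

With noiseless data every fix in a run already shares the same coordinates, so the stay location is exact without computing a mode or a centroid.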