e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

AttributeError: 'DataFrame' object has no attribute 'ts' during trip segmentation #814

Open shankari opened 1 year ago

shankari commented 1 year ago

Ran into this error while babysitting the pipeline resets.

Traceback (most recent call last):
  File "/usr/src/app/e-mission-server/emission/pipeline/intake_stage.py", line 73, in run_intake_pipeline
    run_intake_pipeline_for_user(uuid)
  File "/usr/src/app/e-mission-server/emission/pipeline/intake_stage.py", line 122, in run_intake_pipeline_for_user
    eaist.segment_current_trips(uuid)
  File "/usr/src/app/e-mission-server/emission/analysis/intake/segmentation/trip_segmentation.py", line 94, in segment_current_trips
    time_query)
  File "/usr/src/app/e-mission-server/emission/analysis/intake/segmentation/trip_segmentation_methods/dwell_segmentation_dist_filter.py", line 91, in segment_into_trips
    if self.has_trip_ended(lastPoint, currPoint, timeseries):
  File "/usr/src/app/e-mission-server/emission/analysis/intake/segmentation/trip_segmentation_methods/dwell_segmentation_dist_filter.py", line 206, in has_trip_ended
    timeseries, ongoing_motion_in_range):
  File "/usr/src/app/e-mission-server/emission/analysis/intake/segmentation/trip_segmentation_methods/trip_end_detection_corner_cases.py", line 17, in is_huge_invalid_ts_offset
    (filterMethod.transition_df.ts >= lastPoint.ts) &
  File "/root/miniconda-4.12.0/envs/emission/lib/python3.7/site-packages/pandas/core/generic.py", line 5130, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'ts'

shankari commented 1 year ago

Related code is:

We create the transition_df:

        self.transition_df = timeseries.get_data_df("statemachine/transition", time_query)
        if len(self.transition_df) > 0:
            logging.debug("self.transition_df = %s" % self.transition_df[["fmt_time", "transition"]])
        else:
            logging.debug("no transitions found. This can happen for continuous sensing")

We use it:

        if not just_ended and len(self.transition_df) > 0:
            stopped_moving_after_last = self.transition_df[(self.transition_df.ts > currPoint.ts) & (self.transition_df.transition == 2)]
            logging.debug("stopped_moving_after_last = %s" % stopped_moving_after_last[["fmt_time", "transition"]])
            if len(stopped_moving_after_last) > 0:
                logging.debug("Found %d transitions after last point, ending trip..." % len(stopped_moving_after_last))
                segmentation_points.append((curr_trip_start_point, currPoint))
                self.last_ts_processed = currPoint.metadata_write_ts
            else:
                logging.debug("Found %d transitions after last point, not ending trip..." % len(stopped_moving_after_last))

And, at the crash location:

def is_huge_invalid_ts_offset(filterMethod, lastPoint, currPoint, timeseries,
                              motionInRange):
    intermediate_transitions = filterMethod.transition_df[
                                    (filterMethod.transition_df.ts >= lastPoint.ts) &
                                    (filterMethod.transition_df.ts <= currPoint.ts)]

In the other locations, we check the length of transition_df before accessing its columns; here we don't.
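The failure mode can be reproduced in isolation. This is a minimal sketch, assuming that a query with no matching entries yields a DataFrame with no rows and no columns (which is what the traceback suggests, but is not confirmed here); `has_ts_attribute` and the sample values are hypothetical and only for illustration.

```python
import pandas as pd

# Assumption: with no matching entries, the timeseries query returns a
# DataFrame that has no rows and no columns at all.
empty_df = pd.DataFrame()

# Made-up sample data standing in for a non-empty transition_df.
populated_df = pd.DataFrame({"ts": [1.0, 2.0], "transition": [1, 2]})


def has_ts_attribute(df):
    """Hypothetical helper: report whether `df.ts` can be accessed."""
    try:
        df.ts
        return True
    except AttributeError:
        # Attribute-style column access fails when the column does not exist.
        return False
```

With these definitions, `has_ts_attribute(empty_df)` is `False` while `has_ts_attribute(populated_df)` is `True`, which matches the crash only happening when there are no transitions.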

A pretty simple fix would be to guard the lookup: if len(filterMethod.transition_df) is 0, intermediate_transitions should also be a blank dataframe. We don't actually use the intermediate transitions, only check their length.
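A minimal sketch of that guard, not the actual patch: `get_intermediate_transitions` is a hypothetical helper that takes the dataframe and the two timestamps directly instead of `filterMethod`, `lastPoint` and `currPoint`.

```python
import pandas as pd


def get_intermediate_transitions(transition_df, last_ts, curr_ts):
    """Hypothetical helper: transitions with last_ts <= ts <= curr_ts.

    If transition_df is empty, return it unchanged so that callers which
    only check the length of the result still see 0.
    """
    if len(transition_df) == 0:
        # An empty dataframe may not have a `ts` column, so skip the filter.
        return transition_df
    return transition_df[(transition_df.ts >= last_ts) &
                         (transition_df.ts <= curr_ts)]
```

Since only the length of the result is checked downstream, returning the empty dataframe as-is keeps the existing logic unchanged for the non-empty case.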

We also need to see why the transitions are empty, given that this is not continuous data collection, but that shouldn't prevent us from fixing this right now and increasing the robustness of the system.

shankari commented 1 year ago

Tried to reproduce this, and it looks like there were no transitions after the 22nd. There are some location points after the 22nd, but they are all within a meter of the previous detected value. And it looks like the person uninstalled the app or stopped providing data after the 23rd, which is why we don't have any transitions.

So a workaround would be to simply reset the pipeline for this user to the 22nd.