Open shankari opened 1 year ago
One more pipeline investigation + potential fix. When we reset the pipeline, we set the raw_places array in the last cleaned place to empty, with the comment:
Note that we need to reset the raw_place array since it will be repopulated with new squished places when the timeline after the entry to this place is reconstructed
But is that actually correct? We do in fact still have the last raw place retained.
reset_ts, last_cleaned_place, last_raw_place = get_reset_ts(user_id, last_cleaned_place, is_dry_run)
# clear all analysis results after it
del_objects_after(user_id, reset_ts, is_dry_run)
# open the raw and cleaned places
reset_last_place(last_cleaned_place, is_dry_run)
reset_last_place(last_raw_place, is_dry_run)
Should we maintain at least that link?
Note that if we don't maintain that link, then running back-to-back resets fails because there are no raw places, and we have to implement a hack where we find that raw place and fill it in temporarily in memory.
While running with that fix, I encountered another error:
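For illustration, the in-memory workaround described above could look roughly like this. It is a minimal sketch and not the code in reset.py: places are plain dicts, `find_raw_place_for` and `fill_raw_places_in_memory` are hypothetical helpers, and matching on `enter_ts` is an assumption about how the raw place would be located.

```python
import copy
from typing import Optional


def find_raw_place_for(cleaned_place: dict, raw_places: list) -> Optional[dict]:
    """Hypothetical lookup: the raw place with the same enter_ts as the cleaned place."""
    target = cleaned_place["data"].get("enter_ts")
    if target is None:
        # Nothing to match on (e.g. the very first place has no entry time).
        return None
    for raw_place in raw_places:
        if raw_place["data"].get("enter_ts") == target:
            return raw_place
    return None


def fill_raw_places_in_memory(cleaned_place: dict, raw_places: list) -> dict:
    """Return a patched copy with raw_places refilled; nothing is written back."""
    if cleaned_place["data"].get("raw_places"):
        # The link is already present, so there is nothing to patch.
        return cleaned_place
    match = find_raw_place_for(cleaned_place, raw_places)
    if match is None:
        return cleaned_place
    patched = copy.deepcopy(cleaned_place)
    patched["data"]["raw_places"] = [match["_id"]]
    return patched


# Toy data: a cleaned place whose raw_places array was emptied by an earlier reset.
cleaned = {"_id": "c1", "data": {"enter_ts": 100, "raw_places": []}}
raws = [
    {"_id": "r0", "data": {"enter_ts": 50}},
    {"_id": "r1", "data": {"enter_ts": 100}},
]
patched = fill_raw_places_in_memory(cleaned, raws)
```

Working on a copy keeps the hack temporary: the stored cleaned place is untouched, so the question of whether the link should be kept during the reset itself stays open.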
Traceback (most recent call last):
  File "bin/monitor/reset_invalid_pipeline_states.py", line 25, in <module>
    reset_all_invalid_state(args)
  File "bin/monitor/reset_invalid_pipeline_states.py", line 15, in reset_all_invalid_state
    epr.auto_reset(args.dry_run, args.only_calc)
  File "/Users/kshankar/e-mission/nrel-db-connect/emission/pipeline/reset.py", line 357, in auto_reset
    reset_user_to_ts(invalid_state['user_id'], invalid_state['reset_ts'], dry_run)
  File "/Users/kshankar/e-mission/nrel-db-connect/emission/pipeline/reset.py", line 73, in reset_user_to_ts
    reset_ts, last_cleaned_place, last_raw_place = get_reset_ts(user_id, last_cleaned_place, is_dry_run)
  File "/Users/kshankar/e-mission/nrel-db-connect/emission/pipeline/reset.py", line 99, in get_reset_ts
    ending_raw_trip = esda.get_entry(esda.RAW_TRIP_KEY, ending_trip["data"]["raw_trip"])
KeyError: 'raw_trip'
Not sure how we have a cleaned trip without a raw trip; maybe there was an error while saving the cleaned trip? We can work around by just resetting to an earlier time; but let's investigate a bit more before doing that.
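One way to surface this case explicitly, instead of hitting a bare KeyError, is to check for the link before dereferencing it. The sketch below uses plain dicts; `get_raw_trip_id` and `pick_resettable_trip` are hypothetical names, and walking back to an earlier trip is just the "reset to an earlier time" workaround mentioned above, not existing behaviour in reset.py.

```python
from typing import Optional


def get_raw_trip_id(cleaned_trip: dict):
    """Return the raw_trip link of a cleaned trip, or None if it is missing."""
    return cleaned_trip.get("data", {}).get("raw_trip")


def pick_resettable_trip(cleaned_trips_newest_first: list) -> Optional[dict]:
    """Walk back in time until a cleaned trip with a raw_trip link is found."""
    for trip in cleaned_trips_newest_first:
        if get_raw_trip_id(trip) is not None:
            return trip
    # No cleaned trip has a raw trip; the caller has to decide what to do.
    return None


# Toy data: the newest cleaned trip is missing its raw_trip link.
trips = [
    {"_id": "t3", "data": {"end_ts": 300}},
    {"_id": "t2", "data": {"end_ts": 200, "raw_trip": "r2"}},
]
chosen = pick_resettable_trip(trips)
```

This only hides the symptom. It does not explain how a cleaned trip ended up without a raw trip in the first place.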
Another issue with the pipeline: if the sectioning steps fail but the subsequent clean_and_resample passes, we will end up with unknown and uncleaned trips. It didn't actually happen in this case, but it is a concern.
Concretely:
From https://github.com/e-mission/e-mission-docs/issues/806#issuecomment-1269297364
when we delete, we get
and we delete a bunch of entries
Before the reset, the trips were
So 16:51 is an eminently fine place to end.
We will then retain this trip because its start is not greater than the reset timestamp. Then, because of the lte in the query, the retained trip is returned along with the newly created ones, and we get our ever-increasing list.
I can understand why this is not a problem for raw trips, but I am not sure why it is not a problem for cleaned trips.
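The interaction between the two comparisons can be reproduced with a toy, in-memory model. This is only an illustration under stated assumptions: trips are dicts with `start_ts` and `end_ts`, and `delete_after`, `rerun_pipeline` and `query_trips` are hypothetical stand-ins for the delete step, the pipeline re-run and the lte-based query. Which fields the real query compares is not shown in this issue.

```python
def delete_after(trips, reset_ts):
    """Keep every trip whose start is NOT strictly greater than reset_ts."""
    return [t for t in trips if not (t["start_ts"] > reset_ts)]


def rerun_pipeline(trips, reset_ts, horizon_ts):
    """Simulated re-run (assumption): recreate one trip starting at the reset point."""
    return trips + [{"start_ts": reset_ts, "end_ts": horizon_ts}]


def query_trips(trips, start_ts, end_ts):
    """Inclusive range query on start_ts, mimicking a gte/lte pair."""
    return [t for t in trips if start_ts <= t["start_ts"] <= end_ts]


# One trip that starts before the reset point and ends after it.
trips = [{"start_ts": 90, "end_ts": 200}]
reset_ts = 150

counts = []
for _ in range(3):
    trips = delete_after(trips, reset_ts)          # the spanning trip survives
    trips = rerun_pipeline(trips, reset_ts, 200)   # a new trip is created
    counts.append(len(query_trips(trips, 0, 200)))

print(counts)  # [2, 3, 4]
```

In this toy model the list grows by one entry per reset. Changing the delete comparison to `>=` would keep the count at 2, but whether that is the right change for the real pipeline is exactly what still needs investigating.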
Query to get raw trips while cleaning
Query to get cleaned trips for label inference
So I actually commented out the latter part of the test and manually printed out the raw_trip, cleaned_trip and confirmed_trip and they all have the same
Have worked around this for now in https://github.com/e-mission/e-mission-server/pull/879, but need a more principled investigation.