e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

🔙 🐛 P0: Errors while running the new pipeline on old data #898

Closed shankari closed 1 year ago

shankari commented 1 year ago

To test https://github.com/e-mission/e-mission-phone/pull/957, I loaded some "old format" data.

To see it in the new client, I had to re-run the pipeline to create composite trips. Note that this is what will happen when we upgrade the existing production systems to the new master.

While running the pipeline, I get the following error

Error while creating composite objects, timestamp is unchanged
Traceback (most recent call last):
  File "emission/analysis/plotting/composite_trip_creation.py", line 115, in create_composite_objects
    last_done_ts = create_composite_trip(ts, t)
  File "emission/analysis/plotting/composite_trip_creation.py", line 40, in create_composite_trip
    assert statuscheck["curr_confirmed_place_count"] == 0
AssertionError
shankari commented 1 year ago

ok so we do have confirmed places, but only two

2023-04-26 00:09:34,213:DEBUG:140704418879104:Found {'curr_confirmed_place_count': 2, 'curr_composite_trip_count': 0} existing entries before this run

That's because we had some unprocessed data and created confirmed objects just before this step

2023-04-26 00:09:33,891:DEBUG:140704418879104:cleaned timeline has 2 places and 1 trips
2023-04-26 00:09:33,922:DEBUG:140704418879104:Iterating through 1 cleaned trips to create confirmed trips
2023-04-26 00:09:34,024:DEBUG:140704418879104:keys = (analysis/confirmed_place, analysis/confirmed_trip), len(places) = 2, len(trips) = 1

And fortunately, the start confirmed trip does have both enter and exit times

2023-04-26 00:09:34,024:DEBUG:140704418879104:to find start, comparing place 6448ce2dd226a3fcb64bcdbf with trip 6448ce2dd226a3fcb64bcdc0
2023-04-26 00:09:34,026:DEBUG:140704418879104:Starting with element of type place, id 6448ce2dd226a3fcb64bcdbf, details
Entry({'_id': ObjectId('6448ce2dd226a3fcb64bcdbf'),
'metadata': Metadata({'key': 'analysis/confirmed_place',
'enter_fmt_time': '2023-04-22T10:19:20.999886-04:00',
'exit_fmt_time': '2023-04-23T13:31:44.541360-04:00'})

So we cannot assume that there are no confirmed places in the code below. We have to check for each trip individually.

        assert "additions" not in ct["data"]
        logging.info("old-style format, converting from cleaned to confirmed")
        needs_hack = True
        assert statuscheck["curr_confirmed_place_count"] == 0
        convert_cleaned_to_confirmed(ts, ct, keys)
        ct["data"]["additions"] = []
shankari commented 1 year ago

Added

diff --git a/emission/analysis/plotting/composite_trip_creation.py b/emission/analysis/plotting/composite_trip_creation.py
index d9f7d575..7b5541a2 100644
--- a/emission/analysis/plotting/composite_trip_creation.py
+++ b/emission/analysis/plotting/composite_trip_creation.py
@@ -34,10 +34,14 @@ def create_composite_trip(ts, ct):
         convert_cleaned_to_confirmed(ts, ct, keys)
         del ct["data"]["trip_addition"]
     else:
+        import emission.core.get_database as edb
         assert "additions" not in ct["data"]
         logging.info("old-style format, converting from cleaned to confirmed")
         needs_hack = True
-        assert statuscheck["curr_confirmed_place_count"] == 0
+        existing_start_confirmed_place = edb.get_analysis_timeseries_db().find_one({"data.cleaned_place": ct["data"]["start_place"]})
+        existing_end_confirmed_place = edb.get_analysis_timeseries_db().find_one({"data.cleaned_place": ct["data"]["end_place"]})
+        assert existing_start_confirmed_place is None and existing_end_confirmed_place is None,\
+            f"for {ct['_id']} {existing_start_confirmed_place=} and {existing_end_confirmed_place=}"
         convert_cleaned_to_confirmed(ts, ct, keys)
         ct["data"]["additions"] = []

But now getting

Error while creating composite objects, timestamp is unchanged
Traceback (most recent call last):
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/analysis/plotting/composite_trip_creation.py", line 119, in create_composite_objects
    last_done_ts = create_composite_trip(ts, t)
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/analysis/plotting/composite_trip_creation.py", line 43, in create_composite_trip
    assert existing_start_confirmed_place is None and existing_end_confirmed_place is None,\
shankari commented 1 year ago

I'm still getting this after clearing, re-loading data and re-running the pipeline, so it is not just an artifact of the previous failed runs.

Looking more carefully at place creation...

2023-04-26 11:18:59,762:INFO:140704418879104:For stage PipelineStages.CREATE_CONFIRMED_OBJECTS, start_ts = 2023-04-22T14:19:25.999886

2023-04-26 11:18:59,865:DEBUG:140704418879104:cleaned timeline has 2 places and 1 trips

2023-04-26 11:18:59,858:DEBUG:140704418879104:for trip = 64496b135ab8a40aadd1002e, adding start_place 6443ef9de109f0a38c98df2d and end_place 64496b135ab8a40aadd10027

2023-04-26 11:19:00,095:INFO:140704418879104:For stage PipelineStages.CREATE_CONFIRMED_OBJECTS, last_ts_processed = 2023-04-23T18:38:43.000002

existing_start_confirmed_place={'_id': ObjectId('64496b155ab8a40aadd10036')
'metadata': {'key': 'analysis/confirmed_place',
'cleaned_place': ObjectId('643f27a8665ea5d30f22644f'),

But 643f27a8665ea5d30f22644f is not one of the cleaned places that we processed in CREATE_CONFIRMED_OBJECTS. I bet we created it as the end place of a previous trip

Bingo!

2023-04-26 11:19:00,672:INFO:140704418879104:old-style format, converting from cleaned to confirmed
2023-04-26 11:19:01,018:DEBUG:140704418879104:Setting the start_place key to the newly created id 64496b155ab8a40aadd10035
2023-04-26 11:19:01,175:DEBUG:140704418879104:Setting the end_place key to the newly created id 64496b155ab8a40aadd10036
shankari commented 1 year ago

While investigating this further, I ran into a weird issue.

There was an existing confirmed place

AssertionError: For 6443ef9ee109f0a38c98df38 found existing_start_confirmed_place={'_id': ObjectId('64497553eb8b9db3c666b50b'), 
'metadata': {'key': 'analysis/confirmed_place',
'enter_fmt_time': '2023-04-22T10:19:20.999886-04:00',
'exit_fmt_time': '2023-04-23T13:31:44.541360-04:00'}}
'cleaned_place': ObjectId('6443ef9de109f0a38c98df2d'),
'ending_trip': ObjectId('6443ef9ce109f0a38c98df0b')
'starting_trip': ObjectId('64497553eb8b9db3c666b50c')  but no associated composite trip

On looking up the related objects, it looks like the confirmed place has a cleaned trip on one side

>>> edb.get_analysis_timeseries_db().find_one({"_id": boi.ObjectId("6443ef9ce109f0a38c98df0b")})
{'_id': ObjectId('6443ef9ce109f0a38c98df0b'),
'metadata': {'key': 'analysis/cleaned_trip', 
'start_place': ObjectId('6443473cf01764a706cdfffd'),
'end_place': ObjectId('6443ef9de109f0a38c98df2d')}}

The end_place of the cleaned trip is the same as the cleaned place of the confirmed place. Why are we setting this to cleaned place and not confirmed place?

shankari commented 1 year ago

ah, because when we were creating confirmed objects, we read the first cleaned place in the incremental timeline

2023-04-26 14:30:39,988:DEBUG:140704418879104:place id 6443ef9de109f0a38c98df2d is not in the map, searching i
2023-04-26 14:30:39,990:DEBUG:140704418879104:retrieved object 
{'_id': ObjectId('6443ef9de109f0a38c98df2d'), 
'metadata': {'key': 'analysis/cleaned_place',
'ending_trip': ObjectId('6443ef9ce109f0a38c98df0b'), 'starting_trip': ObjectId('644997fec78091492d
) and added to id_map
2023-04-26 14:30:39,991:DEBUG:140704418879104:cleaned timeline has 2 places and 1 trips

Since this was incremental, it had both an ending and a starting trip. There was no last confirmed place found, so we created a new one

2023-04-26 14:30:40,023:DEBUG:140704418879104:no last confirmed place found, created place with id 64499800c78091492d035265

and we linked the trip to the end of this confirmed place

2023-04-26 14:30:40,051:DEBUG:140704418879104:for trip 
Entry({'_id': ObjectId('64499800c78091492d035266'),
'metadata': Metadata({'key': 'analysis/confirmed_trip') start, linking to Entry({'_id':  ObjectId('64499800c78091492d035265'),
'metadata': Metadata({'key': 'analysis/confirmed_place',

But we don't link the previous confirmed trip (created in the previous round) to the ending trip

shankari commented 1 year ago

I will fix this, but it might have been faster to just reset the pipelines instead of this death of a thousand cuts

shankari commented 1 year ago

Yay! Finally ran without any errors

2023-04-26T15:09:16.482916-07:00**********UUID 1858a5d7-d255-4dbe-8f82-df5e03c25f9c: populating expectations**********
2023-04-26T15:09:16.503540-07:00**********UUID 1858a5d7-d255-4dbe-8f82-df5e03c25f9c: creating confirmed objects **********
2023-04-26T15:09:16.718601-07:00**********UUID 1858a5d7-d255-4dbe-8f82-df5e03c25f9c: creating composite objects **********
shankari commented 1 year ago

While trying to re-run from scratch, I got another error.

2023-04-26 16:32:31,572:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
Traceback (most recent call last):
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/analysis/plotting/composite_trip_creation.py", line 134, in create_composite_objects
    last_done_ts = create_composite_trip(ts, t)
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/analysis/plotting/composite_trip_creation.py", line 48, in create_composite_trip
    assert previous_trip is not None,\
AssertionError: For 6443ef9ee109f0a38c98df38 found existing_start_confirmed_place={'_id': ObjectId('6449b47e8cb54312cd3e8f3e'),

The place was a first place and the trip was the trip that ended at it

2023-04-26 16:32:14,578:DEBUG:140704418879104:no last confirmed place found, created place with id 6449b47e8cb54312cd3e8f3e

2023-04-26 16:32:14,706:DEBUG:140704418879104:Found matching_confirmed_trip={
'_id': ObjectId('6443ef9ee109f0a38c98df38'),
'metadata': {'key': 'analysis/confirmed_trip'
'start_place': ObjectId('6443473cf01764a706cdfffd'),
'end_place': ObjectId('6443ef9de109f0a38c98df2d'),
'cleaned_trip': ObjectId('6443ef9ce109f0a38c98df0b'),
for curr_confirmed_start_place.data.ending_trip=ObjectId('6443ef9ce109f0a38c98df0b')

What is surprising is that this is the trip that is ending at 6449b47e8cb54312cd3e8f3e.

But our assertion claims that this is the starting_place.

For 6443ef9ee109f0a38c98df38 found existing_start_confirmed_place={'_id': ObjectId('6449b47e8cb54312cd3e8f3e'),
'cleaned_place': ObjectId('6443ef9de109f0a38c98df2d'),

Even though it clearly is not, according to a direct database query

>>> edb.get_analysis_timeseries_db().find_one({"_id": boi.ObjectId("6443ef9ee109f0a38c98df38")})
'_id': ObjectId('6443ef9ee109f0a38c98df38'),
'start_place': ObjectId('6443473cf01764a706cdfffd'),
'end_place': ObjectId('6443ef9de109f0a38c98df2d'),
'cleaned_trip': ObjectId('6443ef9ce109f0a38c98df0b'),

Wait, the end_place does seem to match up with the cleaned_place of the confirmed_place though. Is this some stupid start/end lookup error?

shankari commented 1 year ago

Yes, and there was another stupid start/end lookup error. But they are all resolved now, and running the pipeline on a freshly loaded instance works without errors

2023-04-26T23:16:05.187683-07:00**********UUID 1858a5d7-d255-4dbe-8f82-df5e03c25f9c: inferring labels**********
2023-04-26T23:16:05.255804-07:00**********UUID 1858a5d7-d255-4dbe-8f82-df5e03c25f9c: populating expectations**********
2023-04-26T23:16:05.341649-07:00**********UUID 1858a5d7-d255-4dbe-8f82-df5e03c25f9c: creating confirmed objects **********
2023-04-26T23:16:05.814577-07:00**********UUID 1858a5d7-d255-4dbe-8f82-df5e03c25f9c: creating composite objects **********
2023-04-26T23:16:25.543376-07:00**********UUID 1858a5d7-d255-4dbe-8f82-df5e03c25f9c: storing views to cache**********
shankari commented 1 year ago

Ran this on the full Durham dataset and found 5 errors

2023-04-27 12:19:08,054:ERROR:140704418879104:Error while creating confirmed objects, timestamp is unchanged
2023-04-27 13:35:37,082:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
2023-04-27 15:00:17,573:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
2023-04-27 15:11:17,732:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
2023-04-27 15:22:50,331:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged

Investigating further...

shankari commented 1 year ago

All of the composite object creation errors are due to database timeout

2023-04-27 13:35:37,082:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
pymongo.errors.CursorNotFound: cursor id 111869280301504251 not found, full error: {'ok': 0.0, 'errmsg': 'cursor id 111869280301504251 not found', 'code': 43, 'codeName': 'CursorNotFound'}

2023-04-27 15:00:17,573:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
pymongo.errors.CursorNotFound: cursor id 4860483662849025561 not found, full error: {'ok': 0.0, 'errmsg': 'cursor id 4860483662849025561 not found', 'code': 43, 'codeName': 'CursorNotFound'}

2023-04-27 15:11:17,732:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
pymongo.errors.CursorNotFound: cursor id 1637737902225172962 not found, full error: {'ok': 0.0, 'errmsg': 'cursor id 1637737902225172962 not found', 'code': 43, 'codeName': 'CursorNotFound'}

2023-04-27 15:22:50,331:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
pymongo.errors.CursorNotFound: cursor id 5764353970537108054 not found, full error: {'ok': 0.0, 'errmsg': 'cursor id 5764353970537108054 not found', 'code': 43, 'codeName': 'CursorNotFound'}

from the backtrace

Traceback (most recent call last):
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/analysis/plotting/composite_trip_creation.py", line 133, in create_composite_objects
    for t in triplikeEntries:
  File "/Users/kshankar/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pymongo/cursor.py", line 1248, in next
    if len(self.__data) or self._refresh():
  File "/Users/kshankar/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pymongo/cursor.py", line 1188, in _refresh
    self.__send_message(g)
  File "/Users/kshankar/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pymongo/cursor.py", line 1052, in __send_message
    response = client._run_operation(
  File "/Users/kshankar/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pymongo/_csot.py", line 105, in csot_wrapper
    return func(self, *args, **kwargs)
  File "/Users/kshankar/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pymongo/mongo_client.py", line 1330, in _run_operation
    return self._retryable_read(
  File "/Users/kshankar/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pymongo/_csot.py", line 105, in csot_wrapper
    return func(self, *args, **kwargs)
  File "/Users/kshankar/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pymongo/mongo_client.py", line 1448, in _retryable_read
    return func(session, server, sock_info, read_pref)
  File "/Users/kshankar/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pymongo/mongo_client.py", line 1326, in _cmd
    return server.run_operation(
  File "/Users/kshankar/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pymongo/server.py", line 134, in run_operation
    _check_command_response(first, sock_info.max_wire_version)
  File "/Users/kshankar/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pymongo/helpers.py", line 179, in _check_command_response
    raise CursorNotFound(errmsg, code, response, max_wire_version)

The related code is:

        triplikeEntries = ts.find_entries([esda.CONFIRMED_TRIP_KEY, esda.CONFIRMED_UNTRACKED_KEY], time_query=time_query)
        last_done_ts = None
        count_created = 0
        for t in triplikeEntries:
            last_done_ts = create_composite_trip(ts, t)
            count_created += 1
        logging.debug("Created %d composite trips" % count_created if count_created > 0
                      else "No new triplike entries to process, no composite trips created")
        epq.mark_composite_object_creation_done(user_id, last_done_ts)

I guess if we have a lot of confirmed trips then, since we create the composite trips while iterating over the cursor, it is possible for the cursor to time out mid-iteration.

A simple fix (at the expense of additional memory consumption) would be to just read all the trips into a list.

This seems like it would not result in too much memory consumption since we read all the trips to build models and to generate the public dashboard outputs.
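That approach can be sketched as follows. Note that the function and parameter names here are illustrative placeholders, not the actual e-mission API; the point is simply to materialize the cursor into a list before the slow per-trip loop, so the server-side Mongo cursor cannot expire.

```python
def create_composite_objects_eagerly(find_entries, create_composite_trip):
    """Sketch of the proposed fix (hypothetical names): read all
    triplike entries eagerly, trading extra memory for immunity to
    server-side cursor timeouts during a long loop."""
    triplike_entries = list(find_entries())  # eager read, cursor exhausted here
    last_done_ts = None
    count_created = 0
    for t in triplike_entries:
        last_done_ts = create_composite_trip(t)
        count_created += 1
    return last_done_ts, count_created
```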

Let's just double-check the time taken in each of the cases before making the change. Yup! The timeout seems to be ~ 10 minutes

2023-04-27 13:25:19,995:INFO:140704418879104:**********UUID ...: creating composite objects **********
2023-04-27 13:35:37,082:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged

2023-04-27 14:50:04,823:INFO:140704418879104:**********UUID ...: creating composite objects **********
2023-04-27 15:00:17,573:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged

2023-04-27 15:00:55,132:INFO:140704418879104:**********UUID ...: creating composite objects **********
2023-04-27 15:11:17,732:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
shankari commented 1 year ago

The confirmed trip creation failure is more complex. It turns out that the reason is that we have a timeline with only a single place, which is the last place of the timeline (it has an enter time but no exit time).

2023-04-27 12:19:08,034:DEBUG:140704418879104:keys = (analysis/cleaned_place, analysis/expected_trip), len(places) = 1, len(trips) = 0
2023-04-27 12:19:08,034:DEBUG:140704418879104:places_or_stops = [ObjectId('633fd8c5beb878b8f772434f')]
2023-04-27 12:19:08,035:DEBUG:140704418879104:trips_or_sections = []
2023-04-27 12:19:08,035:DEBUG:140704418879104:Starting with element of type place, id 633fd8c5beb878b8f772434f, details Entry({'_id': ObjectId('633fd8c5beb878b8f772434f'), 'metadata': {'key': 'analysis/cleaned_place',
'enter_fmt_time': '...',
'raw_places': [], 'ending_trip': ObjectId('633fd8b1beb878b8f772432a')}})
2023-04-27 12:19:08,035:DEBUG:140704418879104:cleaned timeline has 1 places and 0 trips

But then

2023-04-27 12:19:08,054:DEBUG:140704418879104:last place doc = None

so we try to create the confirmed place from the cleaned place

        curr_confirmed_start_place = create_confirmed_entry(ts,
            timeline.first_place(), esda.CONFIRMED_PLACE_KEY, keys)

but the timeline.first_place() is None, which leads to

2023-04-27 12:19:08,054:ERROR:140704418879104:Error while creating confirmed objects, timestamp is unchanged
Traceback (most recent call last):
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/analysis/userinput/matcher.py", line 94, in create_confirmed_objects
    confirmed_tl = create_and_link_timeline(ts, timeline, last_confirmed_place)
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/analysis/userinput/matcher.py", line 121, in create_and_link_timeline
    curr_confirmed_start_place = create_confirmed_entry(ts,
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/analysis/userinput/matcher.py", line 194, in create_confirmed_entry
    confirmed_object_data = copy.copy(tce["data"])
TypeError: 'NoneType' object is not subscriptable

And the timeline's first place is None because there are no trips, only places:

    def first_place(self):
        if self.first_trip() is not None and \
            self.first_trip().data.start_place is not None:
            return self.get_object(self.first_trip().data.start_place)
        else:
            return None
shankari commented 1 year ago

There are two possible solutions for https://github.com/e-mission/e-mission-docs/issues/898#issuecomment-1526862427

  1. fix `first_place` so that it works even when there is only one place in the timeline

  2. skip the confirmed place creation if the cleaned place is None
     a. in this case, we will fill in the confirmed place while creating composite trips, as part of the backwards compat codebase

(2) seems a bit easier, but also hackier. We will remove the backwards compat logic after a while; who will create the corresponding confirmed place then?
     a. arguably, the next run of the pipeline will include trips, so we will handle it then
     b. but what if the pipeline does not run again? What if this is the final run? If we were matching inputs to places, we would miss the inputs to this final place.

(1) seems more principled. If I call `first_place`, I should get the first place of the timeline, if it exists. That will also ensure that we do process all cleaned -> confirmed conversions.

Let's try (1) first and fall back to (2) if (1) doesn't work
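A sketch of what option (1) might look like. This is a simplified stand-in class, not the real `Timeline` wrapper (which resolves ids through `get_object`); the essential change is that `first_place` falls back to the timeline's first place when there are no trips at all.

```python
class TimelineSketch:
    """Hypothetical simplified timeline, for illustration only."""

    def __init__(self, places, trips):
        self.places = places  # assumed: ordered list of place dicts
        self.trips = trips    # assumed: ordered list of trip dicts

    def first_trip(self):
        return self.trips[0] if self.trips else None

    def first_place(self):
        first_trip = self.first_trip()
        if first_trip is not None and first_trip.get("start_place") is not None:
            return first_trip["start_place"]
        # places-only timeline: return its first place instead of None,
        # so the cleaned -> confirmed conversion still runs for it
        return self.places[0] if self.places else None
```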

shankari commented 1 year ago

with these fixes, a new error

2023-04-27 20:01:33,450:DEBUG:140704418879104:cleaned timeline has 1 places and 0 trips
2023-04-27 20:01:33,459:DEBUG:140704418879104:last place doc = {'_id': ObjectId('644ad15c03ce06258b305ebd'), 'metadata': {'key': 'analysis/confirmed_place',
'enter_fmt_time': '...',
'raw_places': [ObjectId('6362a8e45aa370f03d71be8d')],
'ending_trip': ObjectId('6362a97d5aa370f03d71be93'),
'cleaned_place': ObjectId('6362a9835aa370f03d71bec0'),
'user_input': {}, 'additions': []}}
2023-04-27 20:01:33,460:ERROR:140704418879104:Error while creating confirmed objects, timestamp is unchanged
Traceback (most recent call last):
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/analysis/userinput/matcher.py", line 94, in create_confirmed_objects
    confirmed_tl = create_and_link_timeline(ts, timeline, last_confirmed_place)
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/analysis/userinput/matcher.py", line 150, in create_and_link_timeline
    curr_confirmed_start_place["data"]["exit_local_dt"] = matching_cleaned_place.data.exit_local_dt
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/core/wrapper/wrapperbase.py", line 71, in __getattr__
    raise AttributeError(
AttributeError: 'Cleanedplace' instance has no attribute 'exit_local_dt'

So the issue here is that the last cleaned place doesn't have an exit timestamp. That should be pretty common: unless they take trips every hour, users will be at their final place for a while, and if the pipeline runs multiple times during that period, the last place will be the one without an exit timestamp.
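One hedged way to handle this (an illustrative helper, not the project's actual code) is to copy exit-related fields only when the cleaned place actually has them, leaving the still-occupied final place without an exit:

```python
def copy_place_exit_fields(confirmed_data, cleaned_place_data):
    """Hypothetical sketch: copy only the exit fields that exist.
    The last place of a timeline normally has an enter time but no
    exit yet, so blindly reading exit_* attributes raises errors."""
    for field in ("exit_ts", "exit_local_dt", "exit_fmt_time"):
        if field in cleaned_place_data:
            confirmed_data[field] = cleaned_place_data[field]
    return confirmed_data
```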

How do we deal with this in our sample clean code?

shankari commented 1 year ago

In the sample clean code, we are creating cleaned places from scratch (because of the squishing) and don't just copy over the raw information. So the first place code looks like

    if curr_cleaned_start_place is None:
        # If it is not present - maybe this user is getting started for the first
        # time, we create an entry based on the first trip from the timeline
        curr_cleaned_start_place = get_filtered_place(tl.first_place())
        logging.debug("no last cleaned place found, created place with id %s" % curr_cleaned_start_place.get_id())
        # We just created this place here, so lets add it to the created places
        # and insert rather than update it
        cleaned_places.append(curr_cleaned_start_place)
    else:
        logging.debug("Cleaned place %s found, using it" % curr_cleaned_start_place.get_id())

There's not much in the else

shankari commented 1 year ago

After making all the changes in "🔙 ⚡ 🐛 🧠 ✅ Backwards compat, robustness and bug-fixes for migration and incremental pipeline runs", I re-loaded the old data and re-ran the new pipeline.

New error

Error while creating composite objects, timestamp is unchanged
Traceback (most recent call last):
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/analysis/plotting/composite_trip_creation.py", line 136, in create_composite_objects
    last_done_ts = create_composite_trip(ts, t)
  File "/Users/kshankar/e-mission/gis_branch_tests/emission/analysis/plotting/composite_trip_creation.py", line 57, in create_composite_trip
    assert next_trip is not None and next_trip["metadata"]["key"] == "analysis/confirmed_trip" \
AssertionError: for 63594365a720fd515389d185 found 
existing_end_confirmed_place={'_id': ObjectId('644d9601682e00944e4ecd5a'),
'metadata': {'key': 'analysis/confirmed_place',
'data': {'enter_fmt_time': '2022-10-26T10:04:37-04:00',
'ending_trip': ObjectId('63594365a720fd515389d185'),
'cleaned_place': ObjectId('63594364a720fd515389d178'),
'user_input': {}, 'additions': []}} but next_trip=None
Error while creating composite objects, timestamp is unchanged

It seems like this confirmed place does not have a starting trip (so it is the end of the chain), and so the next trip should in fact be None.

We can add a fix for this, but it is not clear why it is not failing for other users. I guess these are the users that had unprocessed data, so these terminating confirmed places were created in CREATE_CONFIRMED_OBJECTS.

Let's verify before fixing.
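The relaxed assertion could look something like this sketch (a standalone predicate with hypothetical names, not the actual `create_composite_trip` code): a confirmed place without a `starting_trip` is the end of the chain, so `next_trip` may legitimately be None there.

```python
def next_trip_is_consistent(end_confirmed_place, next_trip):
    """Hypothetical sketch of the relaxed check. A terminal confirmed
    place has no starting_trip, so nothing should follow it; otherwise
    a confirmed trip must come next."""
    if "starting_trip" not in end_confirmed_place["data"]:
        return next_trip is None  # end of chain: no next trip expected
    return (next_trip is not None
            and next_trip["metadata"]["key"] == "analysis/confirmed_trip")
```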

shankari commented 1 year ago

Reopening to fix https://github.com/e-mission/e-mission-docs/issues/898#issuecomment-1528914638

shankari commented 1 year ago
/var/tmp/intake_0-errors.log:2023-04-29 15:32:26,106:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
/var/tmp/intake_0.log:2023-04-29 15:32:26,106:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
/var/tmp/intake_1-errors.log:2023-04-29 15:30:01,268:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
/var/tmp/intake_1-errors.log:2023-04-29 15:46:39,880:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
/var/tmp/intake_1.log:2023-04-29 15:30:01,268:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged
/var/tmp/intake_1.log:2023-04-29 15:46:39,880:ERROR:140704418879104:Error while creating composite objects, timestamp is unchanged

Let's start with the first of these:

AssertionError: for 633fda30beb878b8f7724399
found existing_end_confirmed_place={'_id': ObjectId('644d98e55f92e1613f9f4926')
'metadata': {'key': 'analysis/confirmed_place'},
'data': {'enter_fmt_time': '2022-10-05T14:00:05.628000-04:00'
'ending_trip': ObjectId('633fda30beb878b8f7724399'),
'cleaned_place': ObjectId('633fd8c5beb878b8f772434f')
'user_input': {}, 'additions': []}} but next_trip=None

Confirmed

2023-04-29 15:23:33,090:INFO:140704418879104:For stage PipelineStages.CREATE_CONFIRMED_OBJECTS, start_ts = 2022-10-05T17:58:29.613000

2023-04-29 15:23:33,246:DEBUG:140704418879104:Considering place 633fd8c5beb878b8f772434f: 2022-10-05T14:00:05.628000-04:00 -> None
2023-04-29 15:23:33,246:DEBUG:140704418879104:keys = (analysis/cleaned_place, analysis/expected_trip), len(places) = 1, len(trips) = 0

2023-04-29 15:23:33,246:DEBUG:140704418879104:cleaned timeline has 1 places and 0 trips
2023-04-29 15:23:33,268:DEBUG:140704418879104:last place doc = None

2023-04-29 15:23:33,391:DEBUG:140704418879104:no last confirmed place found, created place with id 644d98e55f92e1613f9f4926

2023-04-29 15:23:33,450:DEBUG:140704418879104:Found matching_confirmed_trip={'_id': ObjectId('633fda30beb878b8f7724399'),
'metadata': {'key': 'analysis/confirmed_trip'
'data': {
'end_fmt_time': '2022-10-05T13:58:24.613000-04:00'
'start_fmt_time': '2022-10-05T13:52:12.304000-04:00'
'start_place': ObjectId('633fd8c5beb878b8f772434e'), 'end_place': ObjectId('633fd8c5beb878b8f772434f'),
'cleaned_trip': ObjectId('633fd8b1beb878b8f772432a'),
'user_input': {}}} for curr_confirmed_start_place.data.ending_trip=ObjectId('633fd8b1beb878b8f772432a')

2023-04-29 15:23:33,450:DEBUG:140704418879104:Creating confirmed timeline with 1 places and 0 trips

2023-04-29 15:23:33,452:DEBUG:140704418879104:Returning multi_result.inserted_ids = [ObjectId('644d98e55f92e1613f9f4926')]... of length 1

Checking the second

AssertionError: for 63594365a720fd515389d185 found existing_end_confirmed_place={'_id': ObjectId('644d9601682e00944e4ecd5a'),
'metadata': {'key': 'analysis/confirmed_place'
'data': {'enter_fmt_time': '2022-10-26T10:04:37-04:00'
'cleaned_place': ObjectId('63594364a720fd515389d178'),
'user_input': {}, 'additions': []}} but next_trip=None

Confirmed

2023-04-29 15:11:13,107:DEBUG:140704418879104:no last confirmed place found, created place with id 644d9601682e00944e4ecd5a
2023-04-29 15:11:13,170:DEBUG:140704418879104:Found matching_confirmed_trip={'_id': ObjectId('63594365a720fd515389d185')} for curr_confirmed_start_place.data.ending_trip=ObjectId('63594362a720fd515389d165')

2023-04-29 15:11:13,180:DEBUG:140704418879104:Creating confirmed timeline with 1 places and 0 trips
2023-04-29 15:11:13,182:DEBUG:140704418879104:Returning multi_result.inserted_ids = [ObjectId('644d9601682e00944e4ecd5a')]... of length 1
shankari commented 1 year ago

Re-ran the pipeline multiple times on the Durham data.

After restore, the first Durham run completed without any unexpected errors. Without restore, the second Durham run completed without any unexpected errors. Without restore, the third Durham run also completed without any unexpected errors.

shankari commented 1 year ago

Ran the pipeline against the mm_masscec dataset as well and there were no errors. I finally declare this issue resolved.