Closed shankari closed 1 year ago
I can't help but notice that the first place in question displays an end time of 11:59. I don't think that's a coincidence.
Is it possible that this place does not have an exit time?
If so, then it would skip endChecks
and match to any addition after its enter time.
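To make the suspicion concrete, here is a minimal sketch of the failure mode. The function name, argument names and timestamps are hypothetical; this is not the server's matching code, just an illustration of what happens when the end check is skipped for a place with no exit time.

```python
from typing import Optional

DAY = 24 * 60 * 60  # seconds

def addition_matches_place(enter_ts: float, exit_ts: Optional[float],
                           addition_start_ts: float, addition_end_ts: float) -> bool:
    """Hypothetical simplification of a place/addition match check."""
    start_checks = addition_start_ts >= enter_ts
    if exit_ts is None:
        # No exit time recorded: the end check is skipped entirely
        end_checks = True
    else:
        end_checks = addition_end_ts <= exit_ts
    return start_checks and end_checks

# A place that was never given an exit time (illustrative timestamps)
enter_ts = 1000.0
# An addition entered three days after the place was entered
late_start = enter_ts + 3 * DAY
late_end = late_start + 600

# Without an exit time, the late addition still matches the old place
matches_without_exit = addition_matches_place(enter_ts, None, late_start, late_end)
# With an exit time filled in, the same addition is rejected
matches_with_exit = addition_matches_place(enter_ts, enter_ts + 3600, late_start, late_end)
print(matches_without_exit, matches_with_exit)
```

If this is what is happening, any place that is missing its exit time would keep collecting additions from later days.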
Confirmed the above suspicion. Trying to investigate why that is the case.
@shankari I think I need you to investigate because I don't have access to the staging DB to run analysis.
It seems that this confirmed_place has no exit_ts, despite not being the last place. Why? Does the cleaned_place have an exit_ts?
At the time that we run the pipeline, in the CREATE_CONFIRMED_OBJECTS stage, the most recent place has no exit_ts.
We mark the stage as completed with the last place's enter_ts.
On the next pipeline run, we query using last_processed_ts as the start of the next query.
Do we ever go back to fill in the exit_ts of the previous place?
So for the CLEAN_AND_RESAMPLE stage, which is the other stage where we process a mixture of places and trips, we also mark the last_processed_ts with the enter_ts of the last place. So at least that is consistent.
def mark_clean_resampling_done(user_id, last_section_done):
    if last_section_done is None:
        mark_stage_done(user_id, ps.PipelineStages.CLEAN_RESAMPLING, None)
    else:
        mark_stage_done(user_id, ps.PipelineStages.CLEAN_RESAMPLING,
                        last_section_done.data.enter_ts + END_FUZZ_AVOID_LTE)
> Do we ever go back to fill in the exit_ts of the previous place?

For the cleaned place/trip, we do (in link_trip_start).
Not sure if you do so for the confirmed places and/or the composite trips.
After a pipeline stage is marked completed, the last_processed_ts is recorded.
On the next pipeline run, the last_processed_ts is 5 seconds later than was recorded. Why?
(In other words, why do we need END_FUZZ_AVOID_LTE?)
Double checking this against the actual data from database. The first entry that is retrieved is
_id: "642e4556ac2aec7db6dcce56"
cleaned_trip: $oid: "642e445cac2aec7db6dcc367"
confirmed_place: $oid: "642e4515ac2aec7db6dcccf6"
end_confirmed_place: _id: {$oid: "642e4515ac2aec7db6dcccf6"}
cleaned_place: {$oid: "642e447bac2aec7db6dcc539"}
ending_trip: {$oid: "642e445cac2aec7db6dcc367"}
enter_fmt_time: "2023-04-04T18:44:26.137000-07:00"
enter_ts: 1680659066.137
key: "analysis/confirmed_place"
raw_places: [{$oid: "642e3cebac2aec7db6dc7dd9"}, , …] (22)
end_place: $oid: "642e447bac2aec7db6dcc539"
start_place: $oid: "642e447bac2aec7db6dcc538"
The related confirmed place now (in the database)
{'_id': ObjectId('642e4515ac2aec7db6dcccf6'),
'write_fmt_time': '2023-04-05T21:03:07.080342-07:00'
'enter_fmt_time': '2023-04-04T18:44:26.137000-07:00',
'raw_places': [ObjectId('642e3cebac2aec7db6dc7dd9'), ... ObjectId('642e3cf0ac2aec7db6dc7e01')],
'ending_trip': ObjectId('642e445cac2aec7db6dcc367'),
'cleaned_place': ObjectId('642e447bac2aec7db6dcc539'),
'user_input': {}, 'additions': []}}
The related cleaned place
{'_id': ObjectId('642e447bac2aec7db6dcc539'),
'metadata': {'key': 'analysis/cleaned_place',
'write_fmt_time': '2023-04-05T21:03:07.080342-07:00'},
'enter_fmt_time': '2023-04-04T18:44:26.137000-07:00',
'raw_places': [ObjectId('642e3cebac2aec7db6dc7dd9'), ...ObjectId('642e3cf0ac2aec7db6dc7e03')],
'ending_trip': ObjectId('642e445cac2aec7db6dcc367'),
'starting_trip': ObjectId('642f6f0f071330001746036a'),
'exit_fmt_time': '2023-04-06T16:54:09.469119-07:00',
'duration': 166183.33211874962}}
> Do we ever go back to fill in the exit_ts of the previous place?
> For the cleaned place/trip, we do (in link_trip_start). Not sure if you do so for the confirmed places and/or the composite trips

I think to implement this properly, I need confirmed trips to have start_confirmed_place. We should flesh out the linking between confirmed objects to the same extent that cleaned objects are linked.
Since Sebastian is already working on that task, I can collaborate with him on it to expedite the resolution of this critical issue.
> (In other words, why do we need END_FUZZ_AVOID_LTE?)

I don't remember the details now, but I know that I wrote a really long commit message when I added it. We can look at the blame to figure it out. But I don't think this is the underlying issue.

> I think to implement this properly, I need confirmed trips to have start_confirmed_place. We should flesh out the linking between confirmed objects to the same extent that cleaned objects are linked.

Yes, while processing a trip, you would need to find the place before it and "complete it" with the trip information.
@JGreenlee I don't think we actually need confirmed trips to have start_confirmed_place. The change is even simpler: while processing confirmed objects, if the first cleaned place object has the fields filled in, copy them over.
I can then also fix the other fields (e.g. write_fmt_time and ending_trip), which should be different between cleaned and confirmed trips.
I will fix right now and reset the pipelines before heading out for the weekend.
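A rough sketch of the "copy them over" idea, using plain dicts in place of the real entry wrappers. The helper name and the list of fields are assumptions for illustration. The exit_fmt_time and duration values are the ones from the cleaned place shown earlier in this thread, and exit_ts is derived here as enter_ts + duration rather than read from the database.

```python
# Fields assumed to be safe to copy as-is; links such as starting_trip are
# deliberately left out because they should point into the confirmed timeline
EXIT_FIELDS = ("exit_ts", "exit_fmt_time", "duration")

def copy_exit_info(confirmed_place_data: dict, cleaned_place_data: dict) -> bool:
    """Copy exit fields from the cleaned place into the confirmed place.

    Returns True if the confirmed place was modified (hypothetical helper).
    """
    changed = False
    for field in EXIT_FIELDS:
        if field in cleaned_place_data and \
                confirmed_place_data.get(field) != cleaned_place_data[field]:
            confirmed_place_data[field] = cleaned_place_data[field]
            changed = True
    return changed

cleaned = {
    "enter_ts": 1680659066.137,
    "exit_ts": 1680659066.137 + 166183.33211874962,  # derived for illustration
    "exit_fmt_time": "2023-04-06T16:54:09.469119-07:00",
    "duration": 166183.33211874962,
    "starting_trip": "642f6f0f071330001746036a",  # cleaned trip id, not copied
}
confirmed = {"enter_ts": 1680659066.137, "user_input": {}, "additions": []}

first_pass_changed = copy_exit_info(confirmed, cleaned)
second_pass_changed = copy_exit_info(confirmed, cleaned)  # nothing left to copy
print(first_pass_changed, second_pass_changed, confirmed["exit_fmt_time"])
```

Returning whether anything changed makes it easy to decide if the confirmed place needs to be written back, and keeps a re-run from rewriting identical data.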
@shankari If I understand correctly, that approach won't work with the FUZZ (5 seconds added) when marking the confirmed object creation stage as completed, which is why I was asking about it.
Without the fuzz, the last place of pipeline run #1 should be the first place of run #2.
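A small sketch of that point, assuming the next run selects places with an inclusive lower bound on enter_ts. The query helper and the place list are made up for illustration; only the 5 second value comes from this thread.

```python
END_FUZZ_AVOID_LTE = 5  # seconds, per the "5 seconds added" above

def places_for_next_run(places, query_start_ts):
    """Hypothetical stand-in for the query: places entered at or after the start."""
    return [p for p in places if p["enter_ts"] >= query_start_ts]

places = [
    {"id": "A", "enter_ts": 100.0},
    {"id": "B", "enter_ts": 200.0},  # last place seen by run #1
    {"id": "C", "enter_ts": 300.0},  # arrives before run #2
]
last_place_run_1 = places[1]

# Without the fuzz, run #2 starts exactly at the last place of run #1
without_fuzz = places_for_next_run(places, last_place_run_1["enter_ts"])
# With the fuzz, the query start moves past it, so run #2 never sees place B
with_fuzz = places_for_next_run(
    places, last_place_run_1["enter_ts"] + END_FUZZ_AVOID_LTE)

print([p["id"] for p in without_fuzz], [p["id"] for p in with_fuzz])
```

Under this assumption, the place that needs its exit information filled in is exactly the one that the fuzz skips, so it has to be looked up separately.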
With the fuzz, you use timeline.fill_start_end_places to fill in the first and last place of the timeline as needed. We use it in emission//analysis/intake/cleaning/clean_and_resample.py, save_cleaned_segments_for_ts, for example.
there are a couple of options of dealing with this:
Let's see what we do in the CLEAN_AND_RESAMPLE code since we know that works.
In CLEAN_AND_RESAMPLE (save_cleaned_segments_for_timeline), we:
What we do in create_and_link_timeline is (2): iterate over trips; update the cleaned start place and create a new cleaned end place.
I am pretty sure I can get this to work with approach (1) as well, and it even seems cleaner, but given the current timeframe, going with the tried and true here.
So high level pseudocode is:
One challenge with following the approach above is that we have cleaned_untracked objects, but not confirmed_untracked objects.
I am not even sure how we can avoid confirmed_untracked objects. What will the starting and confirmed end place of the untracked time link to? It seems cleaner to create a new confirmed_untracked and link it to the confirmed timeline, which will also support labels for the untracked time down the road.
Until now the place links were not updated, so it was moot. But having the links not be updated was incorrect.
Let's add confirmed_untracked and fix all the links properly.
Ok I think I am basically done EXCEPT for thinking about the user input matching for the first trip in the timeline for each run of the pipeline. Currently, the user input matching happens in create_confirmed_entry. But the first place in the timeline was created on the last run (and was the last place then).
Let's see how the place matching works.
ok, this is even worse. I took a trip with the test phone today and of course all the inputs got pushed up to the server, and now they have all disappeared. There is something seriously broken wrt server side trip addition processing. @sebastianbarry have you noticed this as well?
Place still ends at midnight | No matching additions | No matching additions |
---|---|---|
investigating briefly tonight
we received a bunch of data from the phone
2023-04-08 01:11:07,619:DEBUG:139892232456000:Returning multi_result.inserted_ids = [ObjectId('6430bcef80ea0c19db61646b'), ObjectId('6430bcef80ea0c19db61646c'), ObjectId('6430bcef80ea0c57f112c3b4'), ObjectId('6430bcef80ea0c57f112c3b5'), ObjectId('6430bcef80ea0c19db61646d'), ObjectId('6430bcef80ea0c19db61646e'), ObjectId('6430bcef80ea0c57f112c3b6'), ObjectId('6430bcef80ea0c57f112c3b7'), ObjectId('6430bcef80ea0c57f112c3b8'), ObjectId('6430bcef80ea0c57f112c3b9')]... of length 292
we tried to process them and got 5 user inputs
2023-04-08 01:11:07,662:DEBUG:139892232456000:finished querying values for ['manual/mode_confirm', 'manual/purpose_confirm', 'manual/replaced_mode', 'manual/trip_user_input', 'manual/trip_addition_input', 'manual/place_addition_input'], count = 5
there was a match for at least the first user input
2023-04-08 01:11:07,981:DEBUG:139892232456000:Comparing user input 1 Voluntary Work, : 2023-04-06T17:46:33.983000-07:00 -> 2023-04-06T17:47:46.695139-07:00, trip 2023-04-06T17:46:33.983000-07:00 -> 2023-04-06T17:47:46.695139-07:00, start checks are (True && True) and end checks are (True || True)
2023-04-08 01:11:07,984:DEBUG:139892232456000:sorted candidates are [{'write_fmt_time': '2023-04-06T18:17:11.788968-07:00', 'detail': '2023-04-06T17:46:33.983000-07:00'}]
2023-04-08 01:11:07,984:DEBUG:139892232456000:most recent entry is 2023-04-06T18:17:11.788968-07:00, 2023-04-06T17:46:33.983000-07:00
2023-04-08 01:11:07,985:DEBUG:139892232456000:Saving entry Entry({'_id': ObjectId('642f6f230713300017460414'),
'metadata': {'key': 'analysis/confirmed_place', 'write_fmt_time': '2023-04-06T18:17:11.788968-07:00'}, '
data': {'source': 'DwellSegmentationTimeFilter',
'enter_fmt_time': '2023-04-06T17:46:33.983000-07:00',
'exit_fmt_time': '2023-04-06T17:47:46.695139-07:00',
'additions': [Entry({'_id': ObjectId('6430bcef80ea0c57f112c3c9'),
'metadata': {'key': 'manual/place_addition_input',
]}}) into timeseries
Similarly, we save
Saving entry Entry({'_id': ObjectId('642f6f230713300017460416'),
'metadata': {'key': 'analysis/confirmed_place',
'enter_fmt_time': '2023-04-06T17:54:03.967000-07:00'
'exit_fmt_time': '2023-04-06T17:56:44.991424-07:00',
'additions': [Entry({'_id': ObjectId('6430bcef80ea0c57f112c3ca'),
and
Saving entry Entry({'_id': ObjectId('642f6f230713300017460418'),
'metadata': {'key': 'analysis/confirmed_place',
'write_fmt_time': '2023-04-06T18:17:11.792570-07:00'
'enter_fmt_time': '2023-04-06T18:07:08.985000-07:00',
'additions': [Entry({'_id': ObjectId('6430bcef80ea0c57f112c3cb'),
Ah of course; we copy over the confirmed place into the composite trip, but we don't recreate it when the confirmed place changes with the addition of matches.
edb.get_analysis_timeseries_db().find_one({"_id": boi.ObjectId("642f6f230713300017460416")})
{'_id': ObjectId('642f6f230713300017460416'),
'additions': [{'_id': ObjectId('6430bcef80ea0c57f112c3ca'),
But
>>> edb.get_analysis_timeseries_db().find_one({"data.end_confirmed_place._id": boi.ObjectId("642f6f230713300017460416")})
{'_id': ObjectId('642f6f25071330001746041c'),
'metadata': {'key': 'analysis/composite_trip',
'origin_key': 'analysis/confirmed_trip'
'end_confirmed_place': {'_id': ObjectId('642f6f230713300017460416'),
'metadata': {'key': 'analysis/confirmed_place',
'data': {'enter_fmt_time': '2023-04-06T17:54:03.967000-07:00',
'exit_fmt_time': '2023-04-06T17:56:44.991424-07:00',
'duration': 161.02442359924316,
'cleaned_place': ObjectId('642f6f170713300017460402'),
'user_input': {}, 'additions': []}}}
So we will need to update the related composite trip as well when we update the confirmed place
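A minimal sketch of keeping the embedded copy in sync, using in-memory dicts as a stand-in for the analysis timeseries. The helper name is hypothetical and a real fix would go through the database; the ids are the ones from the queries above.

```python
import copy

def update_confirmed_place(confirmed_place, updates, composite_trips):
    """Apply updates to a confirmed place and refresh any composite trip
    that embeds it as end_confirmed_place (hypothetical helper).

    Returns the ids of the composite trips that were refreshed.
    """
    confirmed_place["data"].update(updates)
    refreshed = []
    for ct in composite_trips:
        embedded = ct["data"].get("end_confirmed_place")
        if embedded is not None and embedded["_id"] == confirmed_place["_id"]:
            # Replace the stale embedded copy with the current place
            ct["data"]["end_confirmed_place"] = copy.deepcopy(confirmed_place)
            refreshed.append(ct["_id"])
    return refreshed

place = {"_id": "642f6f230713300017460416",
         "data": {"user_input": {}, "additions": []}}
composite_trips = [
    {"_id": "642f6f25071330001746041c",
     "data": {"end_confirmed_place": copy.deepcopy(place)}},
]

refreshed_ids = update_confirmed_place(
    place, {"additions": [{"_id": "6430bcef80ea0c57f112c3ca"}]}, composite_trips)
embedded_additions = composite_trips[0]["data"]["end_confirmed_place"]["data"]["additions"]
print(refreshed_ids, len(embedded_additions))
```

The deep copy keeps the embedded place independent of the original object, which mirrors the fact that the composite trip stores its own copy rather than a reference.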
wrt
> will all the trip addition inputs match that last place on the server as well?

Certainly seems like it from the code.
if start_checks and not end_checks:
    logging.debug("Handling corner case where start check matches, but end check does not")
    next_entry_obj = _get_next_cleaned_timeline_entry(ts, tl_entry)
    if next_entry_obj is not None:
        next_entry_end = end_of(next_entry_obj)
        if next_entry_end is None: # the last place will not have an exit_ts
            end_checks = True # so we will just skip the end check
        else:
Let's test it, and then finish up that fix
While testing, found another issue. In https://github.com/shankari/e-mission-server/blob/add_trip_place_additions/emission/analysis/plotting/composite_trip_creation.py#L17, we added a hack to "fill in" the confirmed place for confirmed trips.
The hack was arguably incorrect to begin with and is now even more incorrect.
- There is no confirmed_place key in confirmed_trip (https://github.com/shankari/e-mission-server/blob/add_trip_place_additions/emission/core/wrapper/confirmedtrip.py), so we shouldn't have tried to use it in the first place.
- start_place and end_place should be filled in with the correct objects for the appropriate timeline. For example, raw trips will have raw start/end places, clean trips will have clean start/end places and confirmed trips will have confirmed start/end places.

However, there was also a weirdness in confirmed_trip from the previous implementation in that we created confirmed_trips by copying over from the corresponding cleaned_trip, so the start_place and the end_place are filled in, but point to the corresponding cleaned places instead.
>>> edb.get_analysis_timeseries_db().find_one({"metadata.key": "analysis/confirmed_trip"})
{'_id': ObjectId('64341826ebf368c478d5482e'),
'metadata': {'key': 'analysis/confirmed_trip',
'data': {'start_place': ObjectId('64337da27423ef6528092f28'), 'end_place': ObjectId('64341826ebf368c478d54825'),
'user_input': {}, 'trip_addition': []}}
>>> edb.get_analysis_timeseries_db().find_one({"_id": boi.ObjectId("64341826ebf368c478d54825")})
{'_id': ObjectId('64341826ebf368c478d54825'),
'metadata': {'key': 'analysis/cleaned_place')}}
So we can change the hack to check whether the start place/end place are cleaned places and replace them with the corresponding confirmed place instead.
That will require a DB call to determine whether the place is of the correct type, though which I would like to avoid. Is there a simple check in the format of the confirmed_trip object that we can use instead?
Here's a potential hack:
- Neither additions nor trip_addition: This was created from an old build (before place matches), so they will be cleaned places.
- trip_addition but not additions: This was created from an old build (before place matches), so they will be cleaned places. Fix and remove the trip_addition field so we don't run into this again.
- additions: This was created from a new build, should already be correct. Ignore.

Testing done:
Entry({'_id': ObjectId('643427314fcc9197e202a0ad'),
'metadata': {'key': 'analysis/confirmed_place',
'data': {'enter_fmt_time': '2016-08-04T16:38:38.348000-10:00',
'user_input': {}, 'additions': []}})
END 2023-04-10 22:24:27.100300 POST /usercache/put 2f012dd4-7b47-43aa-b38f-3d0c6d6e8f3c 4.096529960632324
>>> last_confirmed_place = esdp.get_last_place_entry("analysis/confirmed_place", test_uuid)
>>> last_confirmed_place
Entry({'_id': ObjectId('643427314fcc9197e202a0ad'),
'metadata': {'key': 'analysis/confirmed_place',
'enter_fmt_time': '2016-08-04T16:38:38.348000-10:00',
'additions': [...]
>>> len(last_confirmed_place['data']['additions'])
5
The original last place now has an exit_ts
{'_id': ObjectId('643427314fcc9197e202a0ad'),
'metadata': {'key': 'analysis/confirmed_place',
'enter_fmt_time': '2016-08-04T16:38:38.348000-10:00',
'exit_fmt_time': '2016-08-04T16:38:38.348000-10:00',
starting_trip: ObjectId('6434f12ba6c4a675cc9eb77f')
But it still has 5 additions
>>> len(orig_last_place['data']['additions'])
5
New last confirmed place does not have exit_ts, does not have additions
>>> last_confirmed_place = esdp.get_last_place_entry("analysis/confirmed_place", test_uuid)
Entry({'_id': ObjectId('6434f12ba6c4a675cc9eb784'),
'metadata': {'key': 'analysis/confirmed_place',
'enter_fmt_time': '2016-08-05T17:25:52.895000-07:00',
'additions': []
The starting place for that original last trip was some untracked time, and it matches one of the inputs. Note that this input has not been removed from the previous last confirmed place.
{'_id': ObjectId('6434f12ba6c4a675cc9eb77f')
'metadata': {'key': 'analysis/confirmed_untracked',
'start_fmt_time': '2016-08-04T16:38:38.348000-10:00'
'end_fmt_time': '2016-08-05T04:53:24.886000-07:00'
additions: [{'metadata': 'key': 'manual/place_addition_input',}]
The end place for the untracked time is
'_id': ObjectId('6434f12ba6c4a675cc9eb780'),
'metadata': {'key': 'analysis/confirmed_place',
'enter_fmt_time': '2016-08-05T04:53:24.886000-07:00'
'exit_fmt_time': '2016-08-05T05:11:41.493000-07:00'
'duration': 1096.6070001125336
'starting_trip': ObjectId('6434f12ba6c4a675cc9eb781'),
'additions': []
The next trip is below. Seems like it should have matched at least the breakfast "personal care"; probably didn't match because it was a place input.
{'_id': ObjectId('6434f12ba6c4a675cc9eb781'),
'end_fmt_time': '2016-08-05T08:48:09-07:00'
'start_fmt_time': '2016-08-05T05:11:41.493000-07:00'
'end_place': ObjectId('6434f12ba6c4a675cc9eb782'),
'additions': []
The next place also doesn't match any additions because the place addition starts an hour before the place starts.
{'_id': ObjectId('6434f12ba6c4a675cc9eb782'),
'metadata': {'key': 'analysis/confirmed_place'
'enter_fmt_time': '2016-08-05T08:48:09-07:00'
'exit_fmt_time': '2016-08-05T17:21:35.725313-07:00'
'additions': []
Final UI screenshots:
All entries are still here, except that the display time on the last two entries has changed due to the timezone change | Duplicate entries in a later place | Untracked time overlaps with the next place |
---|---|---|
Pending issues/bugs:
Matching-related questions:
Thanks to a suggestion from @JGreenlee, changed the expected key for untracked time to fix the overlap.
Also added the enketo-notes-list
to the untracked item directive and it automagically worked after changing all the variables passed in
<enketo-notes-list ng-if="triplike.additionsList.length" timeline-entry="triplike" addition-entries="triplike.additionsList"></enketo-notes-list>
Gives me hope for the more modular future!
Next, we need to fix https://github.com/e-mission/e-mission-docs/issues/880#issuecomment-1500799602. Not sure why that didn't show up in the previous reproduction; when we reload from the UI, the composite trip should have lost the matching places.
Let's see why that didn't happen.
We have three inputs
>>> all_inputs = list(ts.find_entries(["manual/place_addition_input"]))
>>> len(all_inputs)
3
They are matched to the places
>>> all_places = list(ts.find_entries(["analysis/confirmed_place"]))
>>> pd.json_normalize(all_places)["data.additions"]
0 []
1 []
2 []
3 []
4 []
5 []
6 [{'_id': 64358bf0f5622167bcba53c8, 'user_id': ...
7 [{'_id': 64358bf0f5622167bcba533e, 'user_id': ...
8 [{'_id': 64358bf0f5622167bcba527c, 'user_id': ...
9 []
But they don't show up in the composite trips
>>> all_ct = list(ts.find_entries(["analysis/composite_trip"]))
>>> pd.json_normalize(all_ct)["data.end_confirmed_place.data.additions"]
0 []
1 []
2 []
3 []
4 []
5 []
6 []
7 []
8 []
But they do still show up on reload in the UI. Let's see why.
It's because the manual result map has three entries. And that is because we are retrieving them remotely.
[Log] DEBUG:About to dedup localResult = 0remoteResult = 3 (cordova.js, line 1413)
[Log] DEBUG:Deduped list = 3 (cordova.js, line 1413)
Why are we still retrieving them remotely although they have been processed? Ah, I think it is because of the mismatch between the time the data was collected and the time that it was labeled. We collected the data in 2016, so the pipeline range is
{'user_id': UUID('3b5121f7-32b0-41c1-ae63-c7e2fd5e3e43'), '$or': [{'metadata.key': 'manual/place_addition_input'}], 'metadata.write_ts': {'$lte': 1681231955, '$gte': 1470364708.348}}
which is from 2016 to 2023
>>> arrow.get(1470364708.348).to("America/Los_Angeles")
<Arrow [2016-08-04T19:38:28.348000-07:00]>
>>> arrow.get(1681231955).to("America/Los_Angeles")
<Arrow [2023-04-11T09:52:35-07:00]>
In a normal pipeline, the pipeline would move ahead, so we would not keep getting the entries. That is when they will disappear.
To fix https://github.com/e-mission/e-mission-docs/issues/880#issuecomment-1500799602, we need to ensure that whenever we update the confirmed_place object, we also update the corresponding composite trip (if any). This should happen for any update, not just the user input or addition; when we update the timestamps of the last place, that should also be reflected in the composite trip, for example.
Before running the pipeline a second time
>>> pd.json_normalize(all_ct)["data.end_confirmed_place.data.exit_fmt_time"]
0 2016-08-04T10:41:32.136385-10:00
1 2016-08-04T13:10:38.739684-10:00
2 2016-08-04T13:40:36.959000-10:00
3 2016-08-04T13:46:26.561801-10:00
4 2016-08-04T14:06:52.592000-10:00
5 2016-08-04T14:18:35.840464-10:00
6 2016-08-04T14:39:38.288795-10:00
7 2016-08-04T16:34:45.744782-10:00
8 NaN
Load data from the 5th, re-run pipeline
>>> all_places = list(ts.find_entries(["analysis/confirmed_place"]))
>>> pd.json_normalize(all_places)["data.exit_fmt_time"]
0 2016-08-04T10:03:51.235000-10:00
1 2016-08-04T10:41:32.136385-10:00
2 2016-08-04T13:10:38.739684-10:00
3 2016-08-04T13:40:36.959000-10:00
4 2016-08-04T13:46:26.561801-10:00
5 2016-08-04T14:06:52.592000-10:00
6 2016-08-04T14:18:35.840464-10:00
7 2016-08-04T14:39:38.288795-10:00
8 2016-08-04T16:34:45.744782-10:00
9 2016-08-04T16:38:38.348000-10:00
10 2016-08-05T05:11:41.493000-07:00
11 2016-08-05T17:21:35.725313-07:00
12 NaN
>>> pd.json_normalize(all_ct)["data.end_confirmed_place.data.exit_fmt_time"]
0 2016-08-04T10:41:32.136385-10:00
1 2016-08-04T13:10:38.739684-10:00
2 2016-08-04T13:40:36.959000-10:00
3 2016-08-04T13:46:26.561801-10:00
4 2016-08-04T14:06:52.592000-10:00
5 2016-08-04T14:18:35.840464-10:00
6 2016-08-04T14:39:38.288795-10:00
7 2016-08-04T16:34:45.744782-10:00
8 NaN
9 NaN
10 2016-08-05T17:21:35.725313-07:00
11 NaN
While writing the automated tests, I ran into the issue that the trips don't seem to show inputs (either user_input or additions).
I first thought that this was a server issue, but everything seemed to be working fine on the server - the confirmed trips were updated, and then the composite trips were updated. But then I noticed that the timestamps on the server were different from the timestamps on the phone, aka I couldn't match up the entries saved on the server with the values displayed on the phone.
So I retrieved the data and added a breakpoint and they are still not matching up!
Here is the set of trips that have matching inputs
Here's the visualization around that time frame on the phone. Note that none of the timestamps match up! There are no trips that start between 13:46 and 14:18, and in fact, there is a gap in the timeline right there.
Ok so I looked through the retrieved list Very Carefully, and it is clear that trips with inputs are not displayed.
ctList: Array (9)
0 {end_loc: {type: "Point", coordinates: [-155.0397394, 19.6218661]}, source: "DwellSegmentationTimeFilter", start_loc: {type: "Point", coordinates: [-154.9029399, 19.5461465]}, user_input: {}, duration: 1880.7650001049042, …}
1 {end_loc: {type: "Point", coordinates: [-155.9108361, 19.4228153]}, source: "DwellSegmentationTimeFilter", start_loc: {type: "Point", coordinates: [-155.0397394, 19.6218661]}, user_input: {}, duration: 8056.584614753723, …}
2 {end_loc: {type: "Point", coordinates: [-155.9109609, 19.4213896]}, source: "DwellSegmentationTimeFilter", start_loc: {type: "Point", coordinates: [-155.9108361, 19.4228153]}, user_input: {}, duration: 605.260315656662, …}
3 Object <----- displayed
_id: "64373344796eba348c2149dc"
additions: [] (0)
end_fmt_time: "2016-08-04T13:42:52.709000-10:00"
start_fmt_time: "2016-08-04T13:40:36.959000-10:00"
user_input: {}
4 Object <--------- not displayed
_id: "64373344796eba348c2149dd"
additions: [Object] (1)
end_fmt_time: "2016-08-04T13:58:52-10:00"
start_fmt_time: "2016-08-04T13:46:26.561801-10:00"
user_input: {trip_user_input: Object}
Trips before and after the set that I labeled are displayed. This is probably a UI fix...
Doh! This is the "To Label" screen, so trips with labels have moved to "All Trips". Maybe we should change this functionality for ENKETO instead of MULTILABEL.
Unit test results
No inputs | All inputs matched to last place |
---|---|
Spread out inputs 1 | Spread out inputs 2 |
Trip inputs matching 1 | Trip inputs matching 2 |
Final test is for the hack to fill in the confirmed places.
- join_redirect_to_static_2023-01-25--40-39, which is from Jan 25, so fairly close to https://github.com/shankari/e-mission-server/commit/fcf5a9c57ae978428fa2e923b3253f34cc46012b
- trip_place_additions_master_2023_03_10_1, which is from Mar 10, so fairly close to https://github.com/shankari/e-mission-server/commit/c51aee943f82cce80960f2cfa58d1023115c1ca6

So the checks are:
Checked out to the first commit and ran the pipeline
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_trip", "user_id": self.testUUID})
9
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_place", "user_id": self.testUUID})
0
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/composite_trip", "user_id": self.testUUID})
0
Checked out the most recent commit and re-ran the pipeline
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_trip", "user_id": self.testUUID})
9
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_place", "user_id": self.testUUID})
18
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/composite_trip", "user_id": self.testUUID})
9
Why do we have 18 places? We should have 10
Ah, it's because we create start and end places for each trip. But the end place of one trip is the start place of the next. We should handle that properly in the hack.
Fixed by checking to see if there was a related confirmed place before creating one
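The arithmetic behind the 18 versus 10 can be sketched as follows. This is an illustration with made-up place ids and a hypothetical helper, not the code in the hack: 9 consecutive trips share their intermediate places, so creating two places per trip gives 18, while reusing an existing place gives 9 + 1 = 10.

```python
def create_confirmed_places(confirmed_trips):
    """Create one confirmed place per distinct cleaned place id,
    reusing a place that was already created for an adjacent trip."""
    places_by_cleaned_id = {}
    for trip in confirmed_trips:
        for link in ("start_place", "end_place"):
            cleaned_id = trip[link]
            # Only create a place if we have not already created one for this id
            if cleaned_id not in places_by_cleaned_id:
                places_by_cleaned_id[cleaned_id] = {
                    "key": "analysis/confirmed_place",
                    "cleaned_place": cleaned_id,
                }
    return list(places_by_cleaned_id.values())

# 9 consecutive trips: the end place of trip i is the start place of trip i + 1
trips = [{"start_place": f"place_{i}", "end_place": f"place_{i + 1}"}
         for i in range(9)]

naive_count = 2 * len(trips)  # a start and an end place for every trip
deduped_places = create_confirmed_places(trips)
print(naive_count, len(deduped_places))
```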
Now we have
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_trip", "user_id": self.testUUID})
9
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_place", "user_id": self.testUUID})
10
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/composite_trip", "user_id": self.testUUID})
9
And the output matches our test case (which runs only the new code)
>>> with open(dataFile_1+".before-user-inputs.expected_composite_trips") as expectation:
... expected_trips = json.load(expectation, object_hook = bju.object_hook)
... print(len(composite_trips), len(expected_trips))
... for i in range(len(composite_trips)):
... print(composite_trips[i]["data"]["end_confirmed_place"]["data"]["enter_fmt_time"], expected_trips[i]["data"]["end_confirmed_place"]["data"]["enter_fmt_time"])
...
9 9
2016-08-04T10:35:12-10:00 2016-08-04T10:35:12-10:00
2016-08-04T12:55:48.721000-10:00 2016-08-04T12:55:48.721000-10:00
2016-08-04T13:20:44-10:00 2016-08-04T13:20:44-10:00
2016-08-04T13:42:52.709000-10:00 2016-08-04T13:42:52.709000-10:00
2016-08-04T13:58:52-10:00 2016-08-04T13:58:52-10:00
2016-08-04T14:12:04.251000-10:00 2016-08-04T14:12:04.251000-10:00
2016-08-04T14:34:12.571000-10:00 2016-08-04T14:34:12.571000-10:00
2016-08-04T16:18:54.709000-10:00 2016-08-04T16:18:54.709000-10:00
2016-08-04T16:38:38.348000-10:00 2016-08-04T16:38:38.348000-10:00
Checked out the second commit and ran the pipeline
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_trip", "user_id": self.testUUID})
9
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_place", "user_id": self.testUUID})
9
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/composite_trip", "user_id": self.testUUID})
9
re-running ends up with the same values
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_trip", "user_id": self.testUUID})
9
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_place", "user_id": self.testUUID})
9
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/composite_trip", "user_id": self.testUUID})
9
That's because the entries have addition instead of trip_addition.
Although I checked, and the production servers do have trip_addition.
Let's double check the commit by launching a container with the image
It is
$ docker run -it shankari/e-mission-server:trip_place_additions_master_2023_03_10_1 /bin/bash
# git log
commit f64f5be9d31acdd0cb5454eb08057b6b8b7c6a6b (HEAD -> add_trip_place_additions_new_master)
Merge: c806bcb4 1a374276
Author: Shankari <shankari@eecs.berkeley.edu>
Date: Fri Mar 10 20:37:00 2023 -0800
Merge branch 'master' of https://github.com/e-mission/e-mission-server into add_trip_place_additions_new_master
commit 1a374276d35acc1e07ad50915cc7cb670afb65a3 (upstream/master)
Merge: 5b839e11 4492edff
Author: shankari <shankari@eecs.berkeley.edu>
Date: Fri Mar 10 20:35:26 2023 -0800
Merge pull request #902 from swastis10/server_upgrade
Configuring PYTHON_LEGACY UUID representation in cfc_webapp.py
...
commit 9aed65d5948e61ba284e5a3ee29b4f9bc2b81290 (origin/add_trip_place_additions, add_trip_place_additions)
Author: Shankari <shankari@eecs.berkeley.edu>
Date: Thu Mar 9 18:08:29 2023 -0800
:bug: Read the match ID from 'data' instead of directly from the entry
To be consistent with what the phone is actually sending
+ also fix all the test cases
Testing done:
$ ./e-mission-py.bash emission//tests/analysisTests/userInputTests/TestUserInputFakeData.py
----------------------------------------------------------------------
Ran 6 tests in 0.196s
OK
This fixes
https://github.com/e-mission/e-mission-docs/issues/861
Ok so using commit 9aed65d5948e61ba284e5a3ee29b4f9bc2b81290 instead, we get
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_trip", "user_id": self.testUUID})
9
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_place", "user_id": self.testUUID})
0
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/composite_trip", "user_id": self.testUUID})
0
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/confirmed_place", "user_id": self.testUUID})
10
>>> edb.get_analysis_timeseries_db().count_documents({"metadata.key": "analysis/composite_trip", "user_id": self.testUUID})
9
And the resulting values are consistent
>>> with open(dataFile_1+".before-user-inputs.expected_composite_trips") as expectation:
... expected_trips = json.load(expectation, object_hook = bju.object_hook)
... print(len(composite_trips), len(expected_trips))
... for i in range(len(composite_trips)):
... print(composite_trips[i]["data"]["end_confirmed_place"]["data"]["enter_fmt_time"], expected_trips[i]["data"]["end_confirmed_place"]["data"]["enter_fmt_time"])
...
9 9
2016-08-04T10:35:12-10:00 2016-08-04T10:35:12-10:00
2016-08-04T12:55:48.721000-10:00 2016-08-04T12:55:48.721000-10:00
2016-08-04T13:20:44-10:00 2016-08-04T13:20:44-10:00
2016-08-04T13:42:52.709000-10:00 2016-08-04T13:42:52.709000-10:00
2016-08-04T13:58:52-10:00 2016-08-04T13:58:52-10:00
2016-08-04T14:12:04.251000-10:00 2016-08-04T14:12:04.251000-10:00
2016-08-04T14:34:12.571000-10:00 2016-08-04T14:34:12.571000-10:00
2016-08-04T16:18:54.709000-10:00 2016-08-04T16:18:54.709000-10:00
2016-08-04T16:38:38.348000-10:00 2016-08-04T16:38:38.348000-10:00
It's a wrap!
Checked display on both android and iOS
Created child issues:
I spent all day filling in time entries and they all disappeared when I took a trip in the evening.
Missing additions | Place additions work | Trip additions work |
---|---|---|
This is in the last 5 trips.
>>> last_5_list = list(edb.get_analysis_timeseries_db().find({"metadata.key": "analysis/composite_trip"}).sort("data.start_ts", -1).limit(5))
>>> import pandas as pd
>>> last_5_df = pd.json_normalize(last_5_list)
>>> last_5_df["data.start_fmt_time"]
0 2023-04-14T20:16:35.992728-07:00
1 2023-04-14T19:47:31.480693-07:00
2 2023-04-14T18:48:22.422325-07:00
3 2023-04-13T20:10:58.404540-07:00
4 2023-04-13T19:31:59.983000-07:00
And it does not have any additions
>>> last_5_df.iloc[3]["_id"]
ObjectId('6438d316e2fd7ac823955632')
>>> last_5_df.iloc[3]["data.start_fmt_time"]
'2023-04-13T20:10:58.404540-07:00'
>>> last_5_df.iloc[3]["data.end_fmt_time"]
'2023-04-13T20:22:52.978000-07:00'
>>> last_5_df.iloc[3]["data.end_confirmed_place.data.additions"]
[]
The related confirmed place object is shown below, and it doesn't have additions either. Why didn't the additions match?
>>> edb.get_analysis_timeseries_db().find_one({"_id": cpeid})
{'_id': ObjectId('6438d313e2fd7ac82395562b'),
'metadata': {'key': 'analysis/confirmed_place',
'write_fmt_time': '2023-04-13T21:14:11.429961-07:00'},
'enter_fmt_time': '2023-04-13T20:22:52.978000-07:00',
'enter_ts': 1681442572.978
'user_input': {},
'additions': [],
'exit_fmt_time': '2023-04-14T18:48:22.422325-07:00',
'exit_ts': 1681523302.4223254,
}}
Hm. there are three entries with the same enter_ts
>>> edb.get_analysis_timeseries_db().count_documents({"data.enter_ts": 1681442572.978})
3
And they are the places in the three timelines
>>> edb.get_analysis_timeseries_db().find({"data.enter_ts": 1681442572.978}).distinct("metadata.key")
['analysis/cleaned_place', 'analysis/confirmed_place', 'segmentation/raw_place']
We do find the last confirmed place and set it into the composite trip, but it has zero additions
2023-04-14T21:14:24.689-07:00 | 2023-04-15 04:14:24,689:DEBUG:140635200833344:last confirmed_place 6438d313e2fd7ac82395562b was already in database, updating with linked trip info... and 0 additions
looking further upstream...
2023-04-15 04:14:24,338:DEBUG:140635200833344:Found existing last confirmed place, setting exit information to 2023-04-14T18:48:22.422325-07:00, and trimming additions to 0
We need to find the match incoming step for it. It looks like we do save an entry with additions earlier
2023-04-15 04:12:28,649:DEBUG:140635200833344:Saving entry Entry({'_id': ObjectId('6438d313e2fd7ac82395562b'), 'user_id': UUID('9c084ef4-2f97-4196-bd37-950c17938ec6'), 'metadata': {'key': 'analysis/confirmed_place', 'platform': 'server', 'write_ts': 1681445651.4299612, 'time_zone': 'America/Los_Angeles', 'write_local_dt': {'year': 2023, 'month': 4, 'day': 13, 'hour': 21, 'minute': 14, 'second': 11, 'weekday': 3, 'timezone': 'America/Los_Angeles'}, 'write_fmt_time': '2023-04-13T21:14:11.429961-07:00'}, 'data': {'source': 'DwellSegmentationTimeFilter', 'enter_ts': 1681442572.978, 'enter_local_dt': {'year': 2023, 'month': 4, 'day': 13, 'hour': 20, 'minute': 22, 'second': 52, 'weekday': 3, 'timezone': 'America/Los_Angeles'}, 'enter_fmt_time': '2023-04-13T20:22:52.978000-07:00', 'location': {'type': 'Point', 'coordinates': [-122.0863953, 37.391031]}, 'raw_places': [ObjectId('6438d2efe2fd7ac823955574')], 'ending_trip': ObjectId('6438d313e2fd7ac82395562a'), 'cleaned_place': ObjectId('6438d30be2fd7ac823955607'), 'user_input': {}, 'additions': [Entry({'_id': ObjectId('643a1a4080ea0c46334769f1'), 'user_id': UUID('9c084ef4-2f97-4196-bd37-950c17938ec6'), 'metadata': {'key': 'manual/place_addition_input', 'platform': 'android', 'read_ts': 0, 'time_zone': 'America/Los_Angeles', 'type': 'message', 'write_ts': 1681446697.777, 'write_local_dt': {'year': 2023, 'month': 4, 'day': 13, 'hour': 21, 'minute': 31, 'second': 37, 'weekday': 3, 'timezone': 'America/Los_Angeles'}, 'write_fmt_time': '2023-04-13T21:31:37.777000-07:00'}, 'data': {'label': '1 Domestic, ', 'name': 'TimeUseSurvey', 'version': 9, 'xmlResponse': '<a88RxBtE3jwSar3cwiZTdn xmlns:jr="http://openrosa.org/javarosa" xmlns:orx="http://openrosa.org/xforms" id="a88RxBtE3jwSar3cwiZTdn">\n <start>2023-04-13T21:31:04.570-07:00</start>\n <end>2023-04-13T21:31:04.570-07:00</end>\n <group_hg4zz25>\n <Date>2023-04-13</Date>\n <Start_time>20:22:52.978-07:00</Start_time>\n <End_time>21:00:00.000-07:00</End_time>\n 
<Activity_Type>domestic_activities</Activity_Type>\n <Personal_Care_activities/>\n <Employment_related_a_Education_activities/>\n <Domestic_activities>preparing_meals_or_snacks</Domestic_activities>\n <Recreation_and_leisure/>\n <Voluntary_work_and_care_activities/>\n <Other/>\n </group_hg4zz25>\n <meta>\n <instanceID>uuid:2c4df962-617d-424a-839f-75f4f3147226</instanceID>\n </meta>\n </a88RxBtE3jwSar3cwiZTdn>', 'jsonDocResponse': {'a88RxBtE3jwSar3cwiZTdn': {'attr': {'xmlns:jr': 'http://openrosa.org/javarosa', 'xmlns:orx': 'http://openrosa.org/xforms', 'id': 'a88RxBtE3jwSar3cwiZTdn'}, 'start': '2023-04-13T21:31:04.570-07:00', 'end': '2023-04-13T21:31:04.570-07:00', 'group_hg4zz25': {'attr': {}, 'Date': '2023-04-13', 'Start_time': '20:22:52.978-07:00', 'End_time': '21:00:00.000-07:00', 'Activity_Type': 'domestic_activities', 'Personal_Care_activities': '', 'Employment_related_a_Education_activities': '', 'Domestic_activities': 'preparing_meals_or_snacks', 'Recreation_and_leisure': '', 'Voluntary_work_and_care_activities': '', 'Other': ''}, 'meta': {'attr': {}, 'instanceID': 'uuid:2c4df962-617d-424a-839f-75f4f3147226'}}}, 'start_ts': 1681442572.978, 'end_ts': 1681444800, 'match_id': '30427f9d-8faf-4448-bdbb-50378daaf644', 'start_local_dt': {'year': 2023, 'month': 4, 'day': 13, 'hour': 20, 'minute': 22, 'second': 52, 'weekday': 3, 'timezone': 'America/Los_Angeles'}, 'start_fmt_time': '2023-04-13T20:22:52.978000-07:00', 'end_local_dt': {'year': 2023, 'month': 4, 'day': 13, 'hour': 21, 'minute': 0, 'second': 0, 'weekday': 3, 'timezone': 'America/Los_Angeles'}, 'end_fmt_time': '2023-04-13T21:00:00-07:00'}})]}}) into timeseries
After the user input matching, we end up with
2023-04-15 04:12:34,053:DEBUG:140635200833344:Saving entry Entry({'_id': ObjectId('6438d313e2fd7ac82395562b'), 'user_id': UUID('9c084ef4-2f97-4196-bd37-950c17938ec6'), 'metadata': {'key': 'analysis/confirmed_place', 'platform': 'server', 'write_ts': 1681445651.4299612, 'time_zone': 'America/Los_Angeles', 'write_local_dt': {'year': 2023, 'month': 4, 'day': 13, 'hour': 21, 'minute': 14, 'second': 11, 'weekday': 3, 'timezone': 'America/Los_Angeles'}, 'write_fmt_time': '2023-04-13T21:14:11.429961-07:00'}, 'data': {'source': 'DwellSegmentationTimeFilter', 'enter_ts': 1681442572.978, 'enter_local_dt': {'year': 2023, 'month': 4, 'day': 13, 'hour': 20, 'minute': 22, 'second': 52, 'weekday': 3, 'timezone': 'America/Los_Angeles'}, 'enter_fmt_time': '2023-04-13T20:22:52.978000-07:00', 'location': {'type': 'Point', 'coordinates': [-122.0863953, 37.391031]}, 'raw_places': [ObjectId('6438d2efe2fd7ac823955574')], 'ending_trip': ObjectId('6438d313e2fd7ac82395562a'), 'cleaned_place': ObjectId('6438d30be2fd7ac823955607'), 'user_input': {}, 'additions': [{}, {}, {}, {}, {}, {}, {}, {}, {}, Entry()]}}) into timeseries
So why do all those additions get deleted later?
In CREATE_CONFIRMED_OBJECTS, when we read the last place doc, it has all the additions, so it is not our read-after-write inconsistency
2023-04-15 04:14:24,088:DEBUG:140635200833344:last place doc = {'_id': ObjectId('6438d313e2fd7ac82395562b'), 'user_id': UUID('9c084ef4-2f97-4196-bd37-950c17938ec6'), 'metadata': {'key': 'analysis/confirmed_place', 'platform': 'server', 'write_ts': 1681445651.4299612, 'time_zone': 'America/Los_Angeles', 'write_local_dt': {'year': 2023, 'month': 4, 'day': 13, 'hour': 21, 'minute': 14, 'second': 11, 'weekday': 3, 'timezone': 'America/Los_Angeles'}, 'write_fmt_time': '2023-04-13T21:14:11.429961-07:00'}, 'data': {'source': 'DwellSegmentationTimeFilter', 'enter_ts': 1681442572.978, 'enter_local_dt': {'year': 2023, 'month': 4, 'day': 13, 'hour': 20, 'minute': 22, 'second': 52, 'weekday': 3, 'timezone': 'America/Los_Angeles'}, 'enter_fmt_time': '2023-04-13T20:22:52.978000-07:00', 'location': {'type': 'Point', 'coordinates': [-122.0863953, 37.391031]}, 'raw_places': [ObjectId('6438d2efe2fd7ac823955574')], 'ending_trip': ObjectId('6438d313e2fd7ac82395562a'), 'cleaned_place': ObjectId('6438d30be2fd7ac823955607'), 'user_input': {}, 'additions': [, , , , , , , , , , , , , , ]}}
We then try to find matches but there are none, so we trim the additions to zero.
2023-04-15 04:14:24,310:DEBUG:140635200833344:curr_query = {'user_id': UUID(...'), '$or': [{'metadata.key': 'manual/trip_addition_input'}, {'metadata.key': 'manual/place_addition_input'}], 'data.enter_ts': {'$lte': 1681523302.4223254, '$gte': 1681442572.978}}, sort_key = data.enter_ts
2023-04-15 04:14:24,326:DEBUG:140635200833344:finished querying values for ['manual/trip_addition_input', 'manual/place_addition_input'], count = 0
2023-04-15 04:14:24,327:DEBUG:140635200833344:orig_ts_db_matches = 0, analysis_ts_db_matches = 0
2023-04-15 04:14:24,337:DEBUG:140635200833344:in get_not_deleted_candidates, no candidates, returning []
2023-04-15 04:14:24,338:DEBUG:140635200833344:Found existing last confirmed place, setting exit information to 2023-04-14T18:48:22.422325-07:00, and trimming additions to 0
Why are there no matches? What do the place additions look like?
{'_id': ObjectId('643a1a4080ea0c46334769f1'), 'user_id': UUID('...'),
'metadata': {'key': 'manual/place_addition_input',
'data': 'meta': {'attr': {}, 'instanceID': 'uuid:2c4df962-617d-424a-839f-75f4f3147226'}}},
'start_ts': 1681442572.978, 'end_ts': 1681444800,
'match_id': '30427f9d-8faf-4448-bdbb-50378daaf644',
'start_local_dt': {'year': 2023, 'month': 4, 'day': 13, 'hour': 20, 'minute': 22, 'second': 52, 'weekday': 3, 'timezone': 'America/Los_Angeles'},
'start_fmt_time': '2023-04-13T20:22:52.978000-07:00',
'end_local_dt': {'year': 2023, 'month': 4, 'day': 13, 'hour': 21, 'minute': 0, 'second': 0, 'weekday': 3, 'timezone': 'America/Los_Angeles'},
'end_fmt_time': '2023-04-13T21:00:00-07:00'}}
It has start_ts and end_ts, not enter_ts and exit_ts.
The composite place creation code calls get_additions_for_timeline_entry_object, which in turn calls
def get_time_query_for_timeline_entry(timeline_entry):
    begin_of_entry = begin_of(timeline_entry)
    end_of_entry = end_of(timeline_entry)
    timeType = "data.start_ts" if "start_ts" in timeline_entry.data else "data.enter_ts"
    if end_of_entry is None:
        # the last place (user's current place) will not have an exit_ts, so
        # every input from its enter_ts onward is fair game
        end_of_entry = EPOCH_MAXIMUM
    return estt.TimeQuery(timeType, begin_of_entry, end_of_entry)
So since this is matching to a place, we search for enter and exit matches, while the phone uses start/end for all additions.
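The mismatch can be reproduced without a database. The following is a minimal sketch, not the server code: `get_nested` and `matches_range` are hypothetical helpers standing in for the range part of the MongoDB query, and the two dicts carry only the timestamp fields copied from the place and addition shown above.

```python
def get_nested(doc, dotted_key):
    """Follow a dotted key such as 'data.start_ts'; return None if any part is missing."""
    curr = doc
    for part in dotted_key.split("."):
        if not isinstance(curr, dict) or part not in curr:
            return None
        curr = curr[part]
    return curr


def matches_range(doc, time_type, begin_ts, end_ts):
    """Hypothetical stand-in for {time_type: {'$gte': begin_ts, '$lte': end_ts}}."""
    value = get_nested(doc, time_type)
    return value is not None and begin_ts <= value <= end_ts


# Timestamps copied from the confirmed place and the place addition above
confirmed_place = {"data": {"enter_ts": 1681442572.978, "exit_ts": 1681523302.4223254}}
place_addition = {"data": {"start_ts": 1681442572.978, "end_ts": 1681444800}}

# Same selection rule as get_time_query_for_timeline_entry: a place has no start_ts
time_type = "data.start_ts" if "start_ts" in confirmed_place["data"] else "data.enter_ts"
begin_ts = confirmed_place["data"]["enter_ts"]
end_ts = confirmed_place["data"]["exit_ts"]

# Querying the addition by the place's field name finds nothing ...
found_with_place_field = matches_range(place_addition, time_type, begin_ts, end_ts)
# ... while querying the same range on data.start_ts does match
found_with_start_ts = matches_range(place_addition, "data.start_ts", begin_ts, end_ts)
print(time_type, found_with_place_field, found_with_start_ts)
```

The addition falls inside the place's time range, so the empty result comes purely from the field name, not from the timestamps.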
So how did this ever work (e.g. in https://github.com/e-mission/e-mission-docs/issues/880#issuecomment-1507272460)? Should re-run the test case and see if we can figure it out. It was a zero duration place anyway, so maybe we weren't expecting it to match a lot
Tomorrow:
Ok I think I understood what happened.
get_additions_for_timeline_entry_object is only called when we have a confirmed object and need to match additions to it.
So it is typically not called when we enter values after the confirmed trip/place has already been created.
The only times it is called are:
We didn't find this before since:
Basically, the inputs always have start/end ts and don't match the timeline object's field names, so we should remove the timeline-object-specific functionality and convert everything to just search for start_ts and end_ts
@JGreenlee @MaliheTabasi for visibility
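One way the proposed change could look is sketched below. This is an assumption-laden illustration, not the actual fix: `entry_begin_ts`, `entry_end_ts` and `addition_time_query` are hypothetical names, the query is returned as a plain MongoDB-style dict instead of an `estt.TimeQuery`, and the value of `EPOCH_MAXIMUM` here is a placeholder for whatever the server defines.

```python
EPOCH_MAXIMUM = 2**31 - 1  # placeholder value (assumption); the server has its own constant


def entry_begin_ts(entry_data):
    """Trips carry start_ts, places carry enter_ts (hypothetical helper)."""
    return entry_data["start_ts"] if "start_ts" in entry_data else entry_data["enter_ts"]


def entry_end_ts(entry_data):
    """Trips carry end_ts; places carry exit_ts, which the last place lacks."""
    if "end_ts" in entry_data:
        return entry_data["end_ts"]
    return entry_data.get("exit_ts")


def addition_time_query(entry_data):
    """Range query for additions; always on data.start_ts, whatever the entry type."""
    end_ts = entry_end_ts(entry_data)
    if end_ts is None:
        # last place: every input from its enter_ts onward is a candidate
        end_ts = EPOCH_MAXIMUM
    return {"data.start_ts": {"$gte": entry_begin_ts(entry_data), "$lte": end_ts}}


place_query = addition_time_query({"enter_ts": 1681442572.978, "exit_ts": 1681523302.4223254})
last_place_query = addition_time_query({"enter_ts": 1681442572.978})
print(place_query)
print(last_place_query)
```

The begin and end of the range still come from the timeline entry's own fields; only the field queried on the addition side is fixed to start_ts.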
Confirmed that the addition inputs always have only the start_ts filled in and never the enter_ts
>>> edb.get_timeseries_db().count_documents({"metadata.key": "manual/trip_addition_input", "data.start_ts": {"$exists": True}})
111
>>> edb.get_timeseries_db().count_documents({"metadata.key": "manual/trip_addition_input", "data.enter_ts": {"$exists": True}})
0
>>> edb.get_timeseries_db().count_documents({"metadata.key": "manual/place_addition_input", "data.start_ts": {"$exists": True}})
90
>>> edb.get_timeseries_db().count_documents({"metadata.key": "manual/place_addition_input", "data.enter_ts": {"$exists": True}})
0
Reproducing the issue, we have all the inputs match the final place, but they are all deleted after we get more inputs and re-run the pipeline
All the matches with the last place | All entries gone |
---|---|
Couple of quick things to investigate before going on:
There should be matches for the trip (18:45 to 20:22)
metadata.write_fmt_time ... data.jsonDocResponse.a88RxBtE3jwSar3cwiZTdn.group_hg4zz25.Start_time
25 2023-04-13T15:18:44.595000-07:00 ... 15:29:17.206-07:00
24 2023-04-13T15:19:11.299000-07:00 ... 16:30:00.000-07:00
23 2023-04-13T15:21:20.298000-07:00 ... 17:30:00.000-07:00
22 2023-04-13T15:21:41.249000-07:00 ... 18:29:00.000-07:00
21 2023-04-13T15:22:28.623000-07:00 ... 19:29:00.000-07:00
20 2023-04-13T18:11:42.594000-07:00 ... NaN
19 2023-04-13T18:12:24.638000-07:00 ... 18:03:29.356-07:00
18 2023-04-13T18:13:10.923000-07:00 ... 18:36:59.350-07:00
17 2023-04-13T18:13:49.804000-07:00 ... 17:14:04.360-07:00
16 2023-04-13T18:14:06.490000-07:00 ... 18:00:29.356-07:00
15 2023-04-13T18:14:32.465000-07:00 ... NaN
14 2023-04-13T18:15:21.962000-07:00 ... NaN
13 2023-04-13T18:15:39.699000-07:00 ... NaN
12 2023-04-13T21:25:48.619000-07:00 ... NaN
11 2023-04-13T21:26:01.901000-07:00 ... 18:45:43.402-07:00
10 2023-04-13T21:26:24.069000-07:00 ... NaN
9 2023-04-13T21:26:42.917000-07:00 ... 18:58:47.497-07:00
8 2023-04-13T21:27:53.347000-07:00 ... 19:37:31.972-07:00
7 2023-04-13T21:28:22.735000-07:00 ... NaN
6 2023-04-13T21:29:39.628000-07:00 ... NaN
5 2023-04-13T21:30:03.779000-07:00 ... 19:21:58.659-07:00
4 2023-04-13T21:30:30.951000-07:00 ... 19:28:59.983-07:00
3 2023-04-13T21:30:50.064000-07:00 ... NaN
2 2023-04-13T21:31:02.171000-07:00 ... 20:10:58.404-07:00
1 2023-04-13T21:31:37.777000-07:00 ... 20:22:52.978-07:00
0 2023-04-13T21:32:00.367000-07:00 ... 21:00:00.000-07:00
are the earlier entries getting matched properly (e.g. trip details and the non-last-place time use)?
Yes they are. I just needed to load the entries properly in the test case: load entries, setup, load entries, setup; not load, load, setup.
@JGreenlee, please see attached screenshots