Open shankari opened 8 years ago
Errors in the logs. There is only one error.
Found error single positional indexer is out-of-bounds while processing trip Entry({u'dat
a': {u'distance': 110.86757169217913, u'end_place': ObjectId('57f8742688f66319fbcf2717'),
u'start_loc': {u'type': u'Point', u'coordinates': [-122.2685346, 37.8722395]}, u'end_ts'
: 1475890565.191, u'start_ts': 1475890411.122, u'start_fmt_time': u'2016-10-07T18:33:31.1
22000-07:00', u'end_loc': {u'type': u'Point', u'coordinates': [-122.2686232, 37.8712449]}
, u'source': u'DwellSegmentationTimeFilter', u'start_place': ObjectId('57f852b188f6631614
0f13d4'), u'end_fmt_time': u'2016-10-07T18:36:05.191000-07:00', u'end_local_dt': {u'hour'
: 18, u'month': 10, u'second': 5, u'weekday': 4, u'year': 2016, u'timezone': u'America/Lo
s_Angeles', u'day': 7, u'minute': 36}, u'duration': 154.06900000572205, u'start_local_dt'
: {u'hour': 18, u'month': 10, u'second': 31, u'weekday': 4, u'year': 2016, u'timezone': u
'America/Los_Angeles', u'day': 7, u'minute': 33}}, u'_id': ObjectId('57f8742688f66319fbcf
2716'), u'user_id': UUID('365fc61a-033c-49ec-8f47-6fc5fbc36151'), u'metadata': {u'write_f
mt_time': u'2016-10-07T21:20:54.879732-07:00', u'write_ts': 1475900454.879732, u'time_zon
e': u'America/Los_Angeles', u'platform': u'server', u'write_local_dt': {u'hour': 21, u'mo
nth': 10, u'second': 54, u'weekday': 4, u'year': 2016, u'timezone': u'America/Los_Angeles
', u'day': 7, u'minute': 20}, u'key': u'segmentation/raw_trip'}})
Traceback (most recent call last):
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resam
ple.py", line 101, in save_cleaned_segments_for_timeline
filtered_trip = get_filtered_trip(ts, trip)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 167, in get_filtered_trip
(filtered_section_entry, point_list) = get_filtered_section(filtered_trip_entry, section)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 270, in get_filtered_section
with_speeds_df = get_filtered_points(section, filtered_section_data)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 396, in get_filtered_points
resampled_loc_df = resample(filtered_loc_df, interval=30)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 758, in resample
tz_new = [_get_timezone(ts, tz_ranges_df) for ts in ts_new]
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 776, in _get_timezone
return sel_entry.timezone.iloc[0]
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1286, in __getitem__
return self._getitem_axis(key, axis=0)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1573, in _getitem_axis
self._is_valid_integer(key, axis)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1487, in _is_valid_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
Two issues with the new pipeline
task_state
2016-10-11 19:14:41,691:DEBUG:140474316433216:after running mapping function <function give_points at 0x7fc298e78b90> for task f8459f4a-b5de-47db-8ce1-cb1bded53099, new_state = None
2016-10-11 19:14:41,696:ERROR:140474316433216:While processing task f8459f4a-b5de-47db-8ce1-cb1bded53099, found error 'task_state'
This is because there is no task_state for the user entry right after the switch
There were no points awarded, ever. This is because if we had no entry, we returned None, if we got None, we read data from the current timestamp, and return None. But then on the next run, we read None and again tried to read data from the current timestamp. This means that we never would read data and establish the first set of state.
2016-10-11 21:08:20,021:DEBUG:140498152335168:Metrics response: {'last_ts_processed': None, 'result': [[]]}
2016-10-11 21:08:20,022:DEBUG:140498152335168:Returning None
2016-10-11 21:08:20,022:DEBUG:140498152335168:after running mapping function <function give_points at 0x7fc825a25c08> for task f8459f4a-b5de-47db-8ce1-cb1bded53099, new_state = None
....
2016-10-11 22:00:36,357:DEBUG:140158525577024:Metrics response: {'last_ts_processed': Non
e, 'result': [[]]}
2016-10-11 22:00:36,358:DEBUG:140158525577024:Returning None
2016-10-11 22:00:36,358:DEBUG:140158525577024:after running mapping function <function gi
ve_points at 0x7f79124d4c08> for task f8459f4a-b5de-47db-8ce1-cb1bded53099, new_state = N
one
2016-10-11 15:40:48,487:DEBUG:140735236476928:------------------------------2016-10-07T18:36:05.191000-07:00------------------------------
2016-10-11 15:40:48,492:DEBUG:140735236476928:last5MinsDistances = [ 93.67122832 97.59160352 110.86757169 67.63591944 61.4205331
21.06217581 27.26482303 29.0666801 29.0666801 13.72040883] with length 10
2016-10-11 15:40:48,496:DEBUG:140735236476928:last10PointsDistances = [ 97.59160352 110.86757169 67.63591944 61.4205331 21.06217581
27.26482303 29.0666801 29.0666801 13.72040883 0. ] with length 10, shape (10,)
2016-10-11 15:40:48,499:DEBUG:140735236476928:len(last10PointsDistances) = 10, len(last5MinsDistances) = 10
2016-10-11 15:40:48,499:DEBUG:140735236476928:last5MinsTimes.max() = 184.088999987, time_threshold = 300
2016-10-11 15:40:48,502:DEBUG:140735236476928:In range 1475890555.27 -> 1475890565.19 found no transitions
2016-10-11 15:40:48,504:DEBUG:140735236476928:Found 1 motion_activity entries in range 1475890555.27 -> 1475890565.19
2016-10-11 15:40:48,504:DEBUG:140735236476928:sample activities are [{u'user_id': UUID('f8e005f9-22a8-4ec0-8107-ce6f34d5c532'), u'_id': ObjectId('57fd493bf6858f48fbffea17'), u'data': {u'type': 4, u'confidence': 37, u'local_dt': {u'hour': 18, u'month': 10, u'second': 2, u'weekday': 4, u'year': 2016, u'timezone': u'America/Los_Angeles', u'day': 7, u'minute': 36}, u'ts': 1475890562.528, u'fmt_time': u'2016-10-07T18:36:02.528000-07:00'}, u'metadata': {u'write_fmt_time': u'2016-10-07T18:36:02.528000-07:00', u'write_ts': 1475890562.528, u'time_zone': u'America/Los_Angeles', u'platform': u'android', u'write_local_dt': {u'hour': 18, u'month': 10, u'second': 2, u'weekday': 4, u'year': 2016, u'timezone': u'America/Los_Angeles', u'day': 7, u'minute': 36}, u'key': u'background/motion_activity', u'read_ts': 0, u'type': u'sensor-data'}}]
2016-10-11 15:40:48,505:DEBUG:140735236476928:prev_point.ts = 1475890555.27, curr_point.ts = 1475890565.19, time gap = 9.92100000381 (vs 300), distance_gap = 13.7204088326 (vs 100), speed_gap = 1.38296631664 (vs 0.333333333333) continuing trip
2016-10-11 15:40:48,505:DEBUG:140735236476928:Too few points to make a decision, continuing
2016-10-11 15:40:48,677:DEBUG:140735236476928:------------------------------2016-10-07T18:38:02-07:00------------------------------
2016-10-11 15:40:48,684:DEBUG:140735236476928:last5MinsDistances = [ 77.69906784 86.01345884 42.94160414 47.31793066 45.32683329
51.93555911 48.86130476 48.86130476 37.26272483 24.85742633
37.89416763 28.2278195 24.94993399 16.60382664 22.57186772
22.814868 25.14761009 16.60382664] with length 18
2016-10-11 15:40:48,688:DEBUG:140735236476928:last10PointsDistances = [ 24.85742633 37.89416763 28.2278195 24.94993399 16.60382664
22.57186772 22.814868 25.14761009 16.60382664 0. ] with length 10, shape (10,)
2016-10-11 15:40:48,693:DEBUG:140735236476928:len(last10PointsDistances) = 10, len(last5MinsDistances) = 18
2016-10-11 15:40:48,694:DEBUG:140735236476928:last5MinsTimes.max() = 296.0, time_threshold = 300
2016-10-11 15:40:48,699:DEBUG:140735236476928:prev_point.ts = 1475890675.9, curr_point.ts = 1475890682.0, time gap = 6.09599995613 (vs 300), distance_gap = 16.6038266448 (vs 100), speed_gap = 2.72372486291 (vs 0.333333333333) continuing trip
2016-10-11 15:40:48,701:DEBUG:140735236476928:Appending last_trip_end_point AttrDict({u'local_dt_hour': 18, u'local_dt_timezone': u'America/Los_Angeles', u'local_dt_second': 55, u'local_dt_year': 2016, u'local_dt_month': 10, u'local_dt_weekday': 4, u'elapsedRealtimeNanos': 533542207000000, u'loc': {u'type': u'Point', u'coordinates': [-122.2687119, 37.8711433]}, 'user_id': UUID('f8e005f9-22a8-4ec0-8107-ce6f34d5c532'), u'altitude': 0.0, u'ts': 1475890555.27, u'latitude': 37.8711433, u'sensed_speed': 0.0, u'accuracy': 23.359000000000002, u'fmt_time': u'2016-10-07T18:35:55.270000-07:00', u'local_dt_minute': 35, 'metadata_write_ts': 1475890555.45, u'longitude': -122.2687119, u'filter': u'time', '_id': ObjectId('57fd493bf6858f48fbffea16'), u'heading': 0, u'local_dt_day': 7}) with index 2776
2016-10-11 15:40:48,701:INFO:140735236476928:Found trip end at 2016-10-07T18:35:55.270000-07:00
2016-10-11 15:40:51,576:DEBUG:140735236476928:Inserting entry Entry({'data': Rawtrip({'end_place': ObjectId('57fd6a73f6858f4ee61cc5c3')
'start_ts': 1475890381.102
'start_place': ObjectId('57fd6a73f6858f4ee61cc5c1')
'duration': 174.16799998283386
'distance': 100.35387353364963
'start_loc': {"coordinates": [-122.2691549 37.8719753] "type": "Point"}
'end_ts': 1475890555.27
'start_fmt_time': u'2016-10-07T18:33:01.102000-07:00'
'end_loc': {"coordinates": [-122.2687119 37.8711433] "type": "Point"}
'source': 'DwellSegmentationTimeFilter'
'end_fmt_time': u'2016-10-07T18:35:55.270000-07:00'
'_id': ObjectId('57fd6a73f6858f4ee61cc5c2')
'user_id': UUID('f8e005f9-22a8-4ec0-8107-ce6f34d5c532')
'key': 'segmentation/raw_trip'})}) into timeseries
2016-10-11 15:40:52,804:DEBUG:140735236476928:Considering trip 57fd6a73f6858f4ee61cc5c2: 2016-10-07T18:33:01.102000-07:00 -> 2016-10-07T18:35:55.270000-07:00
2016-10-11 15:40:54,426:DEBUG:140735236476928:Filtering trip Entry({u'data': {u'distance': 100.35387353364963
u'end_place': ObjectId('57fd6a73f6858f4ee61cc5c3')
u'end_ts': 1475890555.27
u'start_ts': 1475890381.102
u'start_fmt_time': u'2016-10-07T18:33:01.102000-07:00'
u'source': u'DwellSegmentationTimeFilter'
u'start_place': ObjectId('57fd6a73f6858f4ee61cc5c1')
u'end_fmt_time': u'2016-10-07T18:35:55.270000-07:00'
u'duration': 174.16799998283386
u'_id': ObjectId('57fd6a73f6858f4ee61cc5c2')
u'user_id': UUID('f8e005f9-22a8-4ec0-8107-ce6f34d5c532')
u'key': u'segmentation/raw_trip'}})
2016-10-11 15:40:54,459:DEBUG:140735236476928:keys = (segmentation/raw_stop, segmentation/raw_section), len(places) = 0, len(trips) = 1
2016-10-11 15:40:54,464:DEBUG:140735236476928:curr_query = {'$or': [{'metadata.key': 'background/filtered_location'}], 'user_id': UUID('f8e005f9-22a8-4ec0-8107-ce6f34d5c532'), 'data.ts': {'$lte': 1475890555.27, '$gte': 1475890381.102}}, sort_key = data.ts
2016-10-11 15:40:54,468:DEBUG:140735236476928:deleting 5 points from section points
2016-10-11 15:40:54,473:DEBUG:140735236476928:Found 10 results
2016-10-11 15:40:54,479:DEBUG:140735236476928:After filtering, rows went from 10 -> 5, cols from 22 -> 21
2016-10-11 15:40:54,556:DEBUG:140735236476928:tz_ranges = [{'timezone': u'America/Los_Angeles', 'end_ts': 1475890470.5999999, 'start_ts': 1475890375.4049511}]
2016-10-11 15:40:54,559:DEBUG:140735236476928:tz_ranges_df = end_ts start_ts timezone
0 1.475890e+09 1.475890e+09 America/Los_Angeles
2016-10-11 15:40:54,560:DEBUG:140735236476928:After resampling, using 5 points from 1475890375.4 -> 1475890470.6
2016-10-11 22:02:58,676:DEBUG:140158525577024:Considering trip 57f8742688f66319fbcf2716: 2016-10-07T18:33:31.122000-07:00 -> 2016-10-07T18:36:05.191000-07:00
2016-10-11 22:02:58
780:DEBUG:140158525577024:Filtering trip Entry({u'data': {u'distance' : 110.86757169217913
u'end_place': ObjectId('57f8742688f66319fbcf2717')
u'start_loc': { u'type': u'Point' u'coordinates': [-122.2685346 37.8722395]}
u'end_ts': 1475890565.191
u'start_ts': 1475890411.122
u'start_fmt_time': u'2016-10-07T18:33:31.122000-07:00'
u' end_loc': {u'type': u'Point' u'coordinates': [-122.2686232 37.8712449]}
u'source': u'D wellSegmentationTimeFilter'
u'start_place': ObjectId('57f852b188f66316140f13d4')
u'end_ fmt_time': u'2016-10-07T18:36:05.191000-07:00'
u'_id': ObjectId('57f8742688f66319fbcf2716')
u'user_id': UUID('365fc61a-033c-49ec-8f47-6fc5fbc36151')
u'key': u'segmentation/raw_trip'}})
2016-10-11 22:02:59,396:DEBUG:140158525577024:deleting 6 points from section points
2016-10-11 22:02:59,403:DEBUG:140158525577024:Found 9 results
2016-10-11 22:02:59,412:DEBUG:140158525577024:After filtering, rows went from 9 -> 3, cols from 22 -> 21
2016-10-11 22:02:59,578:DEBUG:140158525577024:Resampling data of size 4
2016-10-11 22:02:59,586:DEBUG:140158525577024:Resampling entry list
fmt_time ts longitude latitude
0 2016-10-07T18:36:01.102000-07:00 1.475891e+09 -122.269155 37.871975
1 2016-10-07T18:33:31.122000-07:00 1.475890e+09 -122.268535 37.872239
2 2016-10-07T18:34:01-07:00 1.475890e+09 -122.268510 37.871847
3 2016-10-07T18:34:30.600000-07:00 1.475890e+09 -122.268100 37.871612 of size 4
2016-10-11 22:02:59,588:DEBUG:140158525577024:tz_ranges = [{'timezone': u'America/Los_Ang
eles', 'end_
2016-10-11 22:02:59,413:DEBUG:140158525577024:Found first section, may need to extrapolate start point
2016-10-11 22:02:59,443:DEBUG:140158525577024:Adding distance 61.8670596282 to original 88.2432234135 to extend section start from [-122.2685346, 37.8722395] to [-122.2691549, 37.8719753]
2016-10-11 22:02:59,446:DEBUG:140158525577024:Much extrapolation! new_start_ts 1475890368.86 < prev_enter_ts 1475890381.1
2016-10-11 22:02:59,446:DEBUG:140158525577024:changed new_start_ts to 1475890561.1
2016-10-11 22:02:59,448:DEBUG:140158525577024:After subtracting time 42.2597604891 from original 59.4779999256 to cover additional distance 61.8670596282 at speed 1.46397090074, new_start_ts = 1475890561.1
So basically, we tried to extrapolate to 18:32, but that was too early so we dialed it back to 18:36. But that is after the current first point (start of the raw trip) was at 18:33:31.
In [2]: arrow.get(1475890368.86).to("America/Los_Angeles")
Out[2]: <Arrow [2016-10-07T18:32:48.860000-07:00]>
In [3]: arrow.get(1475890561.1).to("America/Los_Angeles")
Out[3]: <Arrow [2016-10-07T18:36:01.100000-07:00]>
In [4]: arrow.get(1475890411.122).to("America/Los_Angeles")
Out[4]: <Arrow [2016-10-07T18:33:31.122000-07:00]>
Let us see the place before this.
2016-10-11 22:02:58
723:DEBUG:140158525577024:retrieved object Entry({u'data':
{u'enter_fmt_time': u'2016-10-07T18:33:01.102000-07:00'
u'exit_fmt_time': u'2016-10-07T18:33:31.122000-07:00'
u'ending_trip': ObjectId('57f852b188f66316140f13d3')
u'starting_trip': ObjectId('57f8742688f66319fbcf2716')
u'source': u'DwellSegmentationTimeFilter'
u'location': {u'type': u'Point' u'coordinates': [-122.2691549 37.8719753]}
u'enter_ts': 1475890381.102
u'duration': 30.019999980926514
u'exit_ts': 1475890411.122}
u'_id': ObjectId('57f852b188f66316140f13d4')
u'user_id': UUID('365fc61a-033c-49ec-8f47-6fc5fbc36151')
u'key': u'segmentation/raw_place'}}) and added to id_map
So the trip duration was not 0, but it was very short (30 secs). Why did we end up setting up a trip that was so short? While attempting to reproduce locally, we had a continuous raw trip from 18:33:01 to 18:35:55.
2016-10-11 15:40:48,335:DEBUG:140735236476928:------------------------------2016-10-07T18:33:01.102000-07:00------------------------------
2016-10-11 15:40:48,336:DEBUG:140735236476928:Setting new trip start point AttrDict({u'local_dt_hour': 18, u'local_dt_timezone': u'America/Los_Angeles', u'local_dt_second': 1, u'local_dt_year': 2016, u'local_dt_month': 10, u'local_dt_weekday': 4, u'elapsedRealtimeNanos': 533368039000000, u'loc': {u'type': u'Point', u'coordinates': [-122.2691549, 37.8719753]}, 'user_id': UUID('f8e005f9-22a8-4ec0-8107-ce6f34d5c532'), u'altitude': 0.0, u'ts': 1475890381.102, u'latitude': 37.8719753, u'sensed_speed': 0.0, u'accuracy': 19.762, u'fmt_time': u'2016-10-07T18:33:01.102000-07:00', u'local_dt_minute': 33, 'metadata_write_ts': 1475890381.305, 'idx': 2767, u'longitude': -122.2691549, u'filter': u'time', '_id': ObjectId('57fd493bf6858f48fbffe9fa'), u'heading': 0, u'local_dt_day': 7}) with idx 2767
2016-10-11 15:40:48,341:DEBUG:140735236476928:------------------------------2016-10-07T18:33:06-07:00------------------------------
2016-10-11 15:40:48,350:DEBUG:140735236476928:prev_point.ts = 1475890381.1, curr_point.ts = 1475890386.0, time gap = 4.89800000191 (vs 300), distance_gap = 4.1789205142 (vs 100), speed_gap = 0.853189161406 (vs 0.333333333333) continuing trip
2016-10-11 15:40:48,351:DEBUG:140735236476928:Too few points to make a decision, continuing
2016-10-11 15:40:48,351:DEBUG:140735236476928:------------------------------2016-10-07T18:33:31.122000-07:00------------------------------
2016-10-11 15:40:48,374:DEBUG:140735236476928:prev_point.ts = 1475890386.0, curr_point.ts = 1475890411.12, time gap = 25.121999979 (vs 300), distance_gap = 60.6238559933 (vs 100), speed_gap = 2.41317793344 (vs 0.333333333333) continuing trip
2016-10-11 15:40:48,375:DEBUG:140735236476928:Too few points to make a decision, continuing
2016-10-11 15:40:48,375:DEBUG:140735236476928:------------------------------2016-10-07T18:34:01-07:00------------------------------
2016-10-11 15:40:48,387:DEBUG:140735236476928:prev_point.ts = 1475890411.12, curr_point.ts = 1475890441.0, time gap = 29.878000021 (vs 300), distance_gap = 43.740522603 (vs 100), speed_gap = 1.46397090074 (vs 0.333333333333) continuing trip
2016-10-11 15:40:48,387:DEBUG:140735236476928:Too few points to make a decision, continuing
2016-10-11 15:40:48,388:DEBUG:140735236476928:------------------------------2016-10-07T18:34:30.600000-07:00------------------------------
2016-10-11 15:40:48,401:DEBUG:140735236476928:prev_point.ts = 1475890441.0, curr_point.ts = 1475890470.6, time gap = 29.5999999046 (vs 300), distance_gap = 44.5027008105 (vs 100), speed_gap = 1.50346962682 (vs 0.333333333333) continuing trip
2016-10-11 15:40:48,401:DEBUG:140735236476928:Too few points to make a decision, continuing
2016-10-11 15:40:48,402:DEBUG:140735236476928:------------------------------2016-10-07T18:34:49.825000-07:00------------------------------
2016-10-11 15:40:48,416:DEBUG:140735236476928:prev_point.ts = 1475890470.6, curr_point.ts = 1475890489.83, time gap = 19.2250001431 (vs 300), distance_gap = 80.7918261915 (vs 100), speed_gap = 4.2024356614 (vs 0.333333333333) continuing trip
2016-10-11 15:40:48,416:DEBUG:140735236476928:Too few points to make a decision, continuing
2016-10-11 15:40:48,417:DEBUG:140735236476928:------------------------------2016-10-07T18:35:02-07:00------------------------------
2016-10-11 15:40:48,431:DEBUG:140735236476928:prev_point.ts = 1475890489.83, curr_point.ts = 1475890502.0, time gap = 12.1749999523 (vs 300), distance_gap = 7.2072111651 (vs 100), speed_gap = 0.591968065161 (vs 0.333333333333) continuing trip
2016-10-11 15:40:48,431:DEBUG:140735236476928:Too few points to make a decision, continuing
2016-10-11 15:40:48,432:DEBUG:140735236476928:------------------------------2016-10-07T18:35:19.871000-07:00------------------------------
2016-10-11 15:40:48,447:DEBUG:140735236476928:prev_point.ts = 1475890502.0, curr_point.ts = 1475890519.87, time gap = 17.8710000515 (vs 300), distance_gap = 19.5725854655 (vs 100), speed_gap = 1.09521489615 (vs 0.333333333333) continuing trip
2016-10-11 15:40:48,447:DEBUG:140735236476928:Too few points to make a decision, continuing
2016-10-11 15:40:48,448:DEBUG:140735236476928:------------------------------2016-10-07T18:35:35.244000-07:00------------------------------
....
And because we only check to see if the place duration is zero, we don't compare the new point with the start of the raw trip.
if raw_start_place.data.duration == 0:
...
else:
new_start_ts = raw_start_place.data.enter_ts + 3 * 60
It looks like we can do two fixes.
new_start_ts = raw_start_place.data.enter_ts + raw_start_place.data.duration / 2
This will workaround the issue
2016-10-08T04:19:08.542150+00:00**********UUID ec0291bb-bee3-40c0-9dbd-a9fNo user defined
overrides for ec0291bb-bee3-40c0-9dbd-a9fbc1db2d9b, early return
IndexError: single positional indexer is out-of-bounds
No user defined overrides for 365fc61a-033c-49ec-8f47-6fc5fbc36151, early return
No user defined overrides for 365fc61a-033c-49ec-8f47-6fc5fbc36151, early return
No user defined overrides for 365fc61a-033c-49ec-8f47-6fc5fbc36151, early return
bc1db2d9b: storing views to cache**********
2016-10-08T04:19:29.603463+00:00**********UUID 9adb6994-2be2-4b7d-a122-db1fb6b65b62: moving to long term**********
UUID('365fc61a-033c-49ec-8f47-6fc5fbc36151'), UUID('2d60d46f-a9b6-4d4b-ad58-5449bd2da41b'), UUID('2f491eaf-bdb6-4746-9ccd-82f5bde935dc'), UUID('5f1b3054-4da2-4d69-ab68-1548dfbf04ee')]
2016-10-08 04:00:09,227:INFO:139756977829696:Created process <Process(Process-3, initial)> to process 2 list of size 80
Looking in the logs for process 2 for the appropriate day, we find that this was the first point, so it was the start of a trip. But why wasn't this merged with the previous partially completed trip.
2016-10-08 04:20:53,409:INFO:139756977829696:Last ts processed = None
2016-10-08 04:20:53,410:DEBUG:139756977829696:------------------------------2016-10-07T18:33:31.122000-07:00------------------------------
2016-10-08 04:20:53,410:DEBUG:139756977829696:Appending currPoint because the current start point is None
2016-10-08 04:20:53
410:DEBUG:139756977829696:Setting new trip start point AttrDict({
u'loc': {u'type': u'Point' u'coordinates': [-122.2685346 37.8722395]}
'user_id': UUID('365fc61a-033c-49ec-8f47-6fc5fbc36151')
u'ts': 1475890411.122
u'latitude': 37.8722395
u'accuracy': 24.683
u'fmt_time': u'2016-10-07T18:33:31.122000-07:00'
u'longitude': -122.2685346
u'filter': u'time'
We started looking at points from the last processed timestamp
2016-10-08 04:20:53,096:INFO:139756977829696:For stage PipelineStages.TRIP_SEGMENTATION, start_ts = 2016-10-08T01:33:11.660000
2016-10-08 04:20:53,098:DEBUG:139756977829696:curr_query = {'$or': [{'metadata.key': 'background/filtered_location'}], 'metadata.write_ts': {'$lte': 1475900448.097118, '$gte': 1475890391.66}, 'user_id': UUID('365fc61a-033c-49ec-8f47-6fc5fbc36151')}, sort_key = metadata.write_ts
which was 18:33:11.
In [5]: arrow.get(1475890391.66).to("America/Los_Angeles")
Out[5]: <Arrow [2016-10-07T18:33:11.660000-07:00]>
We have a 5 second fuzz factor, so the last processed point should have been 18:33:06. So we now have to work backwards and see why the last_processed_ts was set to it.
2016-10-11 15:40:48,335:DEBUG:140735236476928:------------------------------2016-10-07T18:33:01.102000-07:00------------------------------
2016-10-11 15:40:48,341:DEBUG:140735236476928:------------------------------2016-10-07T18:33:06-07:00------------------------------
2016-10-11 15:40:48,351:DEBUG:140735236476928:------------------------------2016-10-07T18:33:31.122000-07:00------------------------------
Previous run was in process 1 logs
UUID('365fc61a-033c-49ec-8f47-6fc5fbc36151'), UUID('9b49bcf4-9152-4851-bfca-4a51f16b8eb9'), UUID('9adb6994-2be2-4b7d-a122-db1fb6b65b62'), UUID('9be6059c-4034-406b-9df0-4ec22175361e'), UUID('d97d6bea-8e60-4629-965a-614f148a82e7'), UUID('ee787674-ec13-4e72-939a-bc72d41c4cb2'), UUID('7161343e-551e-4213-be75-3b82e1ce2448'), UUID('0ace713f-a070-4fe0-9794-87c67850a17c')]
2016-10-08 03:00:07,021:INFO:139868991977280:Created process <Process(Process-2, initial)> to process 1 list of size 80
last_ts_processed is set to 18:33:06 + 5 secs
i2016-10-08 01:58:09,043:INFO:140142893401920:For stage PipelineStages.TRIP_SEGMENTATION, last_ts_processed = 2016-10-08T01:33:11.660000
2016-10-08 01:58:08,851:DEBUG:140142893401920:------------------------------2016-10-07T18:33:01.102000-07:00------------------------------
2016-10-08 01:58:08
852:DEBUG:140142893401920:Setting new trip start point AttrDict({
u'loc': {u'type': u'Point' u'coordinates': [-122.2691549 37.8719753]}
'user_id': UUID('365fc61a-033c-49ec-8f47-6fc5fbc36151')
u'ts': 1475890381.102
u'fmt_time': u'2016-10-07T18:33:01.102000-07:00'
u'local_dt_minute': 33
2016-10-08 01:58:08,858:DEBUG:140142893401920:------------------------------2016-10-07T18
:33:06-07:00------------------------------
2016-10-08 01:58:08,868:DEBUG:140142893401920:prev_point.ts = 1475890381.1, curr_point.ts = 1475890386.0, time gap = 4.89800000191 (vs 300), distance_gap = 4.1789205142 (vs 100), speed_gap = 0.853189161406 (vs 0.333333333333) continuing trip
2016-10-08 01:58:08,863:DEBUG:140142893401920:len(last10PointsDistances) = 2, len(last5MinsDistances) = 1
2016-10-08 01:58:08,863:DEBUG:140142893401920:last5MinsTimes.max() = 4.89800000191, time_threshold = 300
2016-10-08 01:58:08,868:DEBUG:140142893401920:Too few points to make a decision, continuing
2016-10-08 01:58:08,870:DEBUG:140142893401920:trip end transition, so last 10 points are RangeIndex(start=324, stop=326, step=1)
2016-10-08 01:58:08,872:DEBUG:140142893401920:Found trip end at 2016-10-07T18:33:01.102000-07:00
2016-10-08 01:58:08,884:DEBUG:140142893401920:segmentation_point_list has length 3
The check that appears to be triggered is
if not just_ended and len(transition_df) > 0:
stopped_moving_after_last = transition_df[(transition_df.ts > currPoint.ts) & (transition_df.transition == 2)]
if len(stopped_moving_after_last) > 0:
(unused, last_trip_end_point) = self.get_last_trip_end_point(filtered_points_df,
last10Points_df, None)
segmentation_points.append((curr_trip_start_point, last_trip_end_point))
logging.debug("Found trip end at %s" % last_trip_end_point.fmt_time)
# We have processed everything up to the trip end by marking it as a completed trip
self.last_ts_processed = currPoint.metadata_write_ts
But we see that len(transition_df) == 0
2016-10-08 01:58:08,864:DEBUG:140142893401920:orig_ts_db_keys = ['statemachine/transition'], analysis_ts_db_keys = []
2016-10-08 01:58:08,866:DEBUG:140142893401920:In range 1475890381.1 -> 1475890386.0 found no transitions
It is because of
2016-10-11 20:36:16,152:DEBUG:140735236476928:Iterated over all points, just_ended = False, len(transition_df) = 380
Let us see why that if check was structured in that way.
Because of https://github.com/e-mission/e-mission-server/issues/352#issuecomment-240526292 Basically, we look to see if there are any transitions. But shouldn't we check to see if there is a transition after the current end?
Looking at the transitions, we see a single geofence exit and then multiple STOPPED transitions, some of which are clearly less than 5 mins apart. This is probably the mismatch issue, although it is not clear why calling STOPPED_MOVING does not turn off location tracking and create a geofence.
373 2016-10-07T18:16:30.760000-07:00 1
374 2016-10-07T18:18:50.127000-07:00 2
375 2016-10-07T18:21:55.374000-07:00 2
376 2016-10-07T18:26:38.847000-07:00 2
377 2016-10-07T18:27:36.861000-07:00 2
378 2016-10-07T18:28:57.186000-07:00 2
379 2016-10-07T18:33:06.826000-07:00 2
I'm going to see if this is the case for either my phone or one of the test phones. If not, I'm going to have to ask the person for their logs.
Spot checking through the logs, and ignoring any df that has transitions 13 and 14, we get:
2016-10-11 16:19:20,273:DEBUG:139839091201856:transition_df = fmt_time transition
0 2016-10-11T16:20:32.888000+02:00 1
1 2016-10-11T16:53:47.884000+02:00 2
2016-10-11 17:22:18,031:DEBUG:139780715370304:transition_df = fmt_time transition
0 2016-10-11T08:58:01.383000-07:00 1
1 2016-10-11T09:09:06.293000-07:00 2
2 2016-10-11T09:11:24.084000-07:00 1
3 2016-10-11T09:23:47.410000-07:00 2
4 2016-10-11T09:24:28.527000-07:00 1
5 2016-10-11T09:27:19.426000-07:00 2
2016-10-11 17:23:53,999:DEBUG:139780715370304:transition_df = fmt_time transition
0 2016-10-11T09:23:28.708000-07:00 1
1 2016-10-11T09:24:21.279000-07:00 2
2 2016-10-11T09:25:23.202000-07:00 1
3 2016-10-11T09:26:18.472000-07:00 2
4 2016-10-11T09:27:32.182000-07:00 1
5 2016-10-11T09:28:27.221000-07:00 2
6 2016-10-11T09:29:41.238000-07:00 1
7 2016-10-11T09:35:34.699000-07:00 2
Don't see a lot of such behavior. Asking for logs/repo.
Spot checking through the logs, and ignoring any df that has transitions 13 and 14, we get:
2016-10-11 16:19:20,273:DEBUG:139839091201856:transition_df = fmt_time transition
0 2016-10-11T16:20:32.888000+02:00 1
1 2016-10-11T16:53:47.884000+02:00 2
2016-10-11 17:22:18,031:DEBUG:139780715370304:transition_df = fmt_time transition
0 2016-10-11T08:58:01.383000-07:00 1
1 2016-10-11T09:09:06.293000-07:00 2
2 2016-10-11T09:11:24.084000-07:00 1
3 2016-10-11T09:23:47.410000-07:00 2
4 2016-10-11T09:24:28.527000-07:00 1
5 2016-10-11T09:27:19.426000-07:00 2
2016-10-11 17:23:53,999:DEBUG:139780715370304:transition_df = fmt_time transition
0 2016-10-11T09:23:28.708000-07:00 1
1 2016-10-11T09:24:21.279000-07:00 2
2 2016-10-11T09:25:23.202000-07:00 1
3 2016-10-11T09:26:18.472000-07:00 2
4 2016-10-11T09:27:32.182000-07:00 1
5 2016-10-11T09:28:27.221000-07:00 2
6 2016-10-11T09:29:41.238000-07:00 1
7 2016-10-11T09:35:34.699000-07:00 2
2016-10-11 17:27:29,085:DEBUG:139780715370304:transition_df =
fmt_time transition
0 2016-10-11T09:28:37.943000-07:00 1
1 2016-10-11T09:42:56.879000-07:00 2
2016-10-11 17:29:22,875:DEBUG:139780715370304:transition_df = fmt_time transition
0 2016-10-11T16:55:01.096000+02:00 1
1 2016-10-11T18:19:42.405000+02:00 2
2016-10-11 18:27:47,211:DEBUG:140283503507264:transition_df = fmt_time transition
0 2016-10-11T09:17:07.073000-07:00 1
1 2016-10-11T10:09:02.505000-07:00 2
2 2016-10-11T10:19:49.515000-07:00 1
3 2016-10-11T10:47:46.691000-07:00 2
2016-10-11 19:23:07,306:DEBUG:140474316433216:transition_df = fmt_time transition
0 2016-10-11T11:06:47.462000-07:00 1
1 2016-10-11T11:11:54.334000-07:00 2
2 2016-10-11T11:12:22.214000-07:00 1
3 2016-10-11T11:22:48.002000-07:00 2
4 2016-10-11T11:24:36.053000-07:00 1
5 2016-10-11T11:43:09.527000-07:00 2
6 2016-10-11T11:43:40.053000-07:00 1
7 2016-10-11T11:55:03.966000-07:00 2
8 2016-10-11T11:58:53.812000-07:00 1
9 2016-10-11T12:01:10.352000-07:00 2
2016-10-11 19:24:04,475:DEBUG:140474316433216:transition_df = fmt_time transition
0 2016-10-11T10:47:46.691000-07:00 2
1 2016-10-11T10:59:50.512000-07:00 1
2 2016-10-11T11:31:33.337000-07:00 2
2016-10-11 19:25:24,193:DEBUG:140474316433216:transition_df =
fmt_time transition
0 2016-10-11T09:29:43.405000-07:00 1
1 2016-10-11T09:32:52.815000-07:00 2
2 2016-10-11T10:39:10.623000-07:00 1
3 2016-10-11T10:43:14.425000-07:00 2
4 2016-10-11T10:50:49.197000-07:00 1
5 2016-10-11T10:53:58.969000-07:00 2
6 2016-10-11T11:38:24.280000-07:00 1
7 2016-10-11T11:49:14.770000-07:00 2
8 2016-10-11T11:49:41.938000-07:00 1
9 2016-10-11T12:05:20.786000-07:00 2
10 2016-10-11T12:06:02.368000-07:00 1
11 2016-10-11T12:07:17.133000-07:00 2
12 2016-10-11T12:09:22.025000-07:00 1
13 2016-10-11T12:10:33.176000-07:00 2
2016-10-11 19:26:17,679:DEBUG:140474316433216:transition_df = fmt_time transition
0 2016-10-11T10:11:28.701000-07:00 1
1 2016-10-11T10:27:20.536000-07:00 2
2 2016-10-11T11:41:39.071000-07:00 1
3 2016-10-11T11:50:35.783000-07:00 2
2016-10-11 21:23:01,655:DEBUG:140498152335168:transition_df = fmt_time transition
0 2016-10-11T12:13:33.675000-07:00 1
1 2016-10-11T12:21:51.572000-07:00 2
2 2016-10-11T12:57:23.736000-07:00 1
3 2016-10-11T13:00:30.532000-07:00 2
4 2016-10-11T14:15:49.315000-07:00 1
5 2016-10-11T14:18:52.949000-07:00 2
2016-10-11 23:27:01,490:DEBUG:139721316005696:transition_df = fmt_time transition
0 2016-10-11T14:03:18.156000-07:00 1
1 2016-10-11T14:07:21.908000-07:00 2
2 2016-10-11T14:07:46.806000-07:00 1
3 2016-10-11T14:11:49.008000-07:00 2
4 2016-10-11T14:12:08.574000-07:00 1
5 2016-10-11T14:17:35.369000-07:00 2
6 2016-10-11T14:36:33.483000-07:00 1
7 2016-10-11T14:43:02.052000-07:00 2
8 2016-10-11T14:43:15.398000-07:00 1
9 2016-10-11T14:52:59.833000-07:00 2
10 2016-10-11T15:05:02.485000-07:00 1
11 2016-10-11T15:16:18.888000-07:00 2
12 2016-10-11T15:16:39.414000-07:00 1
13 2016-10-11T15:20:39.087000-07:00 2
14 2016-10-11T15:21:03.554000-07:00 1
15 2016-10-11T15:28:23.638000-07:00 2
I'm going to implement the workaround and then ask the user for logs.
First, ensure that all production deployments have been addressed
Not yet merged
Unmerged change is around creating pipeline state even if it doesn't exist.