Deleted all analysis data.
ubuntu@ip-10-203-173-119:/mnt/e-mission/e-mission-server$ ./e-mission-py.bash bin/reset_analysis_database.py
ERROR:root:nominatim not configured either, place decoding must happen on the client
Traceback (most recent call last):
File "/mnt/e-mission/e-mission-server/emission/core/our_geocoder.py", line 15, in <module>
nominatim_file = open("conf/net/ext_service/nominatim.json")
IOError: [Errno 2] No such file or directory: 'conf/net/ext_service/nominatim.json'
{u'ok': 1, u'n': 1420137}
{u'ok': 1, u'n': 561}
{u'ok': 1, u'n': 811}
{u'ok': 1, u'n': 539}
Reset all habitica users
ubuntu@ip-10-203-173-119:/mnt/e-mission/e-mission-server$ ./e-mission-py.bash bin/ext_service/reset_all_habitica_users.py
DEBUG:root:Processing emission user id 74f2a65f-ef5a-3648-8239-23d24f245f49
DEBUG:root:For user 74f2a65f-ef5a-3648-8239-23d24f245f49, about to proxy POST method /api/v3/user/reset with args {}
DEBUG:root:auth_headers = {'x-api-user': u'e26ef1c3-d45e-464a-998c-729d107ecc42', 'x-api-key': u'5669eb83-3e36-468d-95f4-6af15801d6cc'}
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): 54.159.38.241
DEBUG:requests.packages.urllib3.connectionpool:"POST /api/v3/user/reset HTTP/1.1" 200 None
DEBUG:root:result = <Response [200]>
DEBUG:root:reset result for 789a6ea5-7d20-3bf7-8127-b7756ffe46eb = <Response [200]>
DEBUG:root:Processing emission user id 0763de67-f61e-3f5d-90e7-518e69793954
...
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): 54.159.38.241
DEBUG:requests.packages.urllib3.connectionpool:"POST /api/v3/user/reset HTTP/1.1" 200 None
DEBUG:root:result = <Response [200]>
DEBUG:root:reset result for e471711e-bd14-3dbe-80b6-9c7d92ecc296 = <Response [200]>
Now, copy over all pending usercache stuff. First, reconfigure intake.conf from intake.conf.sample to send logs to our log partition instead of /var/tmp/. Delete all existing intake logs, and change our backup count to 6 since we plan to run two pipelines in parallel.
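For reference, a minimal sketch of the kind of handler change involved; the path, rotation size, and config format here are assumptions, and the real intake.conf may look quite different:

```python
import logging
import logging.handlers

# Assumed illustration only: point the rotating intake log at the log
# partition (instead of /var/tmp/) and keep 6 backups, since two pipelines
# will be writing logs in parallel.
handler = logging.handlers.RotatingFileHandler(
    "/mnt/logs/emission/intake.log",   # hypothetical target path
    maxBytes=1024 * 1024 * 1024,       # rotate at ~1 GB
    backupCount=6,
    delay=True)                        # don't open the file until first write
logging.getLogger().addHandler(handler)
```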
Error while running the formatters
Traceback (most recent call last):
File "/mnt/e-mission/e-mission-server/emission/net/usercache/builtin_usercache_handler.py", line 70, in moveToLongTerm
unified_entry = enuf.convert_to_common_format(entry)
File "/mnt/e-mission/e-mission-server/emission/net/usercache/formatters/formatter.py", line 6, in convert_to_common_format
return format_fn(entry)
File "/mnt/e-mission/e-mission-server/emission/net/usercache/formatters/ios/transition.py", line 55, in format
data.transition = transition_map[entry.data.transition].value
KeyError: None
Got error None while saving entry AttrDict({u'user_id': UUID('0237572d-9cb8-3c30-96bb-8005f6106161'), u'_id': ObjectId('57ded24e0a3eb8b13019ce81'), u'data': {u'transition': None, u'currState': u'STATE_ONGOING_TRIP'}, u'metadata': {u'plugin': u'none', u'write_ts': 1474185580.054358, u'time_zone': u'America/Los_Angeles', u'platform': u'ios', u'key': u'statemachine/transition', u'read_ts': 0, u'type': u'message'}}) -> None
Seems to happen consistently even if not particularly frequently on iOS.
Got error None while saving entry AttrDict({u'user_id': UUID('04915299-f90b-3d22-9c4e-349267af5130'), u'_id': ObjectId('5769ff494a9bba50e3fb6620'), u'data': {u'transition': None, u'currState': u'STATE_ONGOING_TRIP'}, u'metadata': {u'plugin': u'none', u'write_ts': 1466562923.042579, u'time_zone': u'America/Los_Angeles', u'platform': u'ios', u'key': u'statemachine/transition', u'read_ts': 0, u'type': u'message'}}) -> None
...
Got error None while saving entry AttrDict({u'user_id': UUID('04915299-f90b-3d22-9c4e-349267af5130'), u'_id': ObjectId('576c50204a9bba50e3fc0725'), u'data': {u'transition': None, u'currState': u'STATE_ONGOING_TRIP'}, u'metadata': {u'plugin': u'none', u'write_ts': 1466714035.526686, u'time_zone': u'America/Los_Angeles', u'platform': u'ios', u'key': u'statemachine/transition', u'read_ts': 0, u'type': u'message'}}) -> None
Hit the error 184 times.
/mnt/e-mission/e-mission-server$ bunzip2 -c /mnt/logs/emission/intake-errors.log.gz | grep "KeyError" | wc -l
184
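A minimal sketch of the kind of guard that would avoid this KeyError; the names mirror the traceback, but the surrounding formatter code is assumed:

```python
import logging

def safe_map_transition(raw_transition, transition_map):
    """Map an iOS transition string to its server-side enum value.

    iOS occasionally sends statemachine/transition messages whose
    data.transition is None (184 occurrences above); returning None lets the
    caller skip the entry instead of dying with KeyError: None.
    """
    if raw_transition is None:
        logging.warning("transition is None, skipping entry")
        return None
    return transition_map[raw_transition].value
```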
Now, moving on to running the two copies of the actual pipeline
Second pipeline (which had more recent users with little history) is done. First pipeline is still processing my data. This is why static scheduling is bad!
Also, the habitica reset didn't actually work. We did reset the users, but we didn't reset the timestamp in the database, so we only restored points from after the last time that the pipeline ran. That is so sad.
2016-09-19 14:17:49,146:DEBUG:139656908773184:curr_query = {'$or': [{'metadata.key': 'analysis/cleaned_section'}], 'user_id': UUID('789a6ea5-7d20-3bf7-8127-b7756ffe46eb'), 'data.start_ts': {'$lte': 1474294669, '$gte': 1474105338}}, sort_key = data.start_ts
2016-09-19 14:17:49,146:DEBUG:139656908773184:orig_ts_db_keys = [], analysis_ts_db_keys = ['analysis/cleaned_section']
2016-09-19 14:17:49,147:DEBUG:139656908773184:finished querying values for []
2016-09-19 14:17:49,147:DEBUG:139656908773184:finished querying values for ['analysis/cleaned_section']
2016-09-19 14:17:49,149:DEBUG:139656908773184:Found 0 results
2016-09-19 14:17:49,149:DEBUG:139656908773184:Returning entry with length 0 result
2016-09-19 14:17:49,149:INFO:139656908773184:Found no entries for user 789a6ea5-7d20-3bf7-8127-b7756ffe46eb, time_query <emission.storage.timeseries.timequery.TimeQuery object at 0x7f0468677990>
2016-09-19 14:17:49,150:DEBUG:139656908773184:Metrics response: {'last_ts_processed': None, 'result': []}
Sent out https://github.com/e-mission/e-mission-server/pull/409 for the fix - can re-run it to make sure the points are correct
Crashed with
Traceback (most recent call last):
File "bin/historical/before_bic2cal/intake_stage_1.py", line 46, in <module>
eaist.segment_current_trips(uuid)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/segmentation/trip_segmentation.py", line 74, in segment_current_trips
filter_methods)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/segmentation/trip_segmentation.py", line 108, in get_combined_segmentation_points
time_query.startTs = loc_df[loc_df["filter"] == curr_filter].head(1).iloc[0].ts
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1286, in __getitem__
return self._getitem_axis(key, axis=0)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1573, in _getitem_axis
self._is_valid_integer(key, axis)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1487, in _is_valid_integer
raise IndexError("single positional indexer is out-of-bounds")
This was because
2016-09-20 05:58:52,278:DEBUG:140114953025344:Filters in the dataframe = [u'time' nan]
Added a dropna() hotfix.
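Roughly, the hotfix amounts to dropping the NaN filter values before iterating over them; a self-contained sketch, with the column names taken from the log line above:

```python
import pandas as pd

def get_filters_to_process(loc_df):
    # Filters in the dataframe can be [u'time' nan]; drop the NaN so that
    # loc_df[loc_df["filter"] == curr_filter].head(1).iloc[0] never runs on
    # an empty selection.
    return loc_df["filter"].dropna().unique().tolist()

# tiny usage example
loc_df = pd.DataFrame({"filter": ["time", None, "time"], "ts": [1, 2, 3]})
print(get_filters_to_process(loc_df))  # ['time']
```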
Need to start pipeline again, and this time, I'm manually going to set the UUID list to try and balance the length of time.
Failed for user UUID('34b64c82-2fe8-36bd-8ac3-55496692e215'). First user in pipeline 2 = UUID('789a6ea5-7d20-3bf7-8127-b7756ffe46eb')
Full intake_1 first run list =
[UUID('0763de67-f61e-3f5d-90e7-518e69793954'),
UUID('f955cff1-8fb8-3b42-8776-6d8874e5b90a'),
UUID('b0d937d0-70ef-305e-9563-440369012b39'),
UUID('6a488797-5be1-38aa-9700-5532be978ff9'),
UUID('34b64c82-2fe8-36bd-8ac3-55496692e215'),
None,
UUID('993f14f0-79f0-33fa-bdbc-049b88056fbf'),
UUID('e5edfe4c-dc67-3ef4-8d3a-7042ce3c1a13'),
UUID('8184b253-df6f-3f0b-aac4-54d65a368110'),
UUID('788f46af-9e6d-300b-93e1-981ba9b3390b'),
UUID('fa5c7f7c-3ab0-314b-989f-5042ab5cd946'),
UUID('78cf23c1-04b4-370a-801f-a86d6d7701b3'),
UUID('87b2a145-69e3-372b-8a66-801051fb11d1'),
UUID('fe08f85c-a49b-3ed4-b2d2-fa7d8c055713'),
UUID('5679434e-f352-31ea-a3cb-aa0cbda8304b'),
UUID('273efe85-937e-3622-9b34-19cb64653a9f'),
UUID('36901462-4987-3425-b263-e9bb6c0bd516'),
UUID('04915299-f90b-3d22-9c4e-349267af5130'),
UUID('f6ed12b8-ba13-3cfe-83af-d018dfd32f19'),
UUID('f7711fa2-d2bd-374e-b377-ef2ecc9afa60'),
UUID('4754956b-e415-318a-b586-a8c95751ed7b'),
UUID('d6ee41de-5796-357c-a7d1-c226e35e104d'),
UUID('9a9883f4-bde7-3ce5-beec-e1d6b8bf8f56'),
UUID('e0509a4e-9842-3c73-8050-875b6383c097'),
UUID('1a0bd056-283c-370e-8349-35aa07d936ca'),
UUID('74f2a65f-ef5a-3648-8239-23d24f245f49'),
UUID('ef015d3d-50de-3b9e-b9a5-a40554ecb18b'),
UUID('06f0779c-7957-3631-9ee5-7d59487c7758'),
UUID('d635a545-b44e-33f0-89ab-57d37359dc69'),
UUID('c39c8281-c4c0-3504-b83c-b797dd0c4f13')]
Let's split it up like so:
intake1
[UUID('34b64c82-2fe8-36bd-8ac3-55496692e215'),
UUID('e5edfe4c-dc67-3ef4-8d3a-7042ce3c1a13'),
UUID('788f46af-9e6d-300b-93e1-981ba9b3390b'),
UUID('78cf23c1-04b4-370a-801f-a86d6d7701b3'),
UUID('fe08f85c-a49b-3ed4-b2d2-fa7d8c055713'),
UUID('273efe85-937e-3622-9b34-19cb64653a9f'),
UUID('04915299-f90b-3d22-9c4e-349267af5130'),
UUID('f7711fa2-d2bd-374e-b377-ef2ecc9afa60'),
UUID('9a9883f4-bde7-3ce5-beec-e1d6b8bf8f56'),
UUID('e0509a4e-9842-3c73-8050-875b6383c097'),
UUID('1a0bd056-283c-370e-8349-35aa07d936ca'),
UUID('d635a545-b44e-33f0-89ab-57d37359dc69')]
intake2
[UUID('993f14f0-79f0-33fa-bdbc-049b88056fbf'),
UUID('8184b253-df6f-3f0b-aac4-54d65a368110'),
UUID('fa5c7f7c-3ab0-314b-989f-5042ab5cd946'),
UUID('87b2a145-69e3-372b-8a66-801051fb11d1'),
UUID('5679434e-f352-31ea-a3cb-aa0cbda8304b'),
UUID('36901462-4987-3425-b263-e9bb6c0bd516'),
UUID('f6ed12b8-ba13-3cfe-83af-d018dfd32f19'),
UUID('4754956b-e415-318a-b586-a8c95751ed7b'),
UUID('d6ee41de-5796-357c-a7d1-c226e35e104d'),
UUID('74f2a65f-ef5a-3648-8239-23d24f245f49'),
UUID('ef015d3d-50de-3b9e-b9a5-a40554ecb18b'),
UUID('06f0779c-7957-3631-9ee5-7d59487c7758'),
UUID('c39c8281-c4c0-3504-b83c-b797dd0c4f13')]
The old pipeline finished running at around noon today
-rw-rw-r-- 1 ubuntu ubuntu 780M Sep 20 19:12 intake_2.log.gz
ubuntu@ip-10-203-173-119:/mnt/e-mission/e-mission-server$ date
Tue Sep 20 23:41:10 UTC 2016
and I've merged a ton of scalability/robustness fixes in the past two days (#409, #411, #413, #414, #415, #416).
Once I pull all these to the server, I need to start doing some more fixup.
Babysit to see how it works!
So the steps are:
$ kill -9 (supervisord)
$ kill -9 (cfc_webapp.py)
$ git pull origin master
$ git checkout emission/analysis/intake/segmentation/trip_segmentation.py
$ git checkout bin/debug/load_timeline_for_day_and_user.py
$ git pull origin master
$ cp conf/log/webserver.conf.sample conf/log/webserver.conf
< edit to point to correct path >
< edit intake pipeline to store 5 backup files >
Now, while the webserver is still down and we cannot get any more incoming messages, reset the uuids
$ ./e-mission-py.bash bin/historical/before_bic2cal/reset_uuids.py 2>&1 | tee /path/to/uuid_reset.log
....
DEBUG:root:{'updatedExisting': False, u'nModified': 0, u'ok': 1, u'n': 0}
DEBUG:root:Resetting edb.get_timeseries_error_db...
DEBUG:root:{'updatedExisting': False, u'nModified': 0, u'ok': 1, u'n': 0}
DEBUG:root:Resetting edb.get_usercache_db...
DEBUG:root:{'updatedExisting': True, u'nModified': 6104, u'ok': 1, u'n': 6104}
Next, reset habitica users. This will provide some initial verification that the uuid reset worked. I currently have 58 points on habitica.
First, plain reset to ensure that it works.
$ ./e-mission-py.bash bin/ext_service/reset_all_habitica_users.py 2>&1 | tee /path/to/reset_habitica_1.log
Then reset with restore
$ ./e-mission-py.bash --restore bin/ext_service/reset_all_habitica_users.py 2>&1 | tee /path/to/reset_habitica_1.log
Worked! I am now at level 33!
Almost done, don't screw up now! Next, we want to turn on the webserver again.
$ /home/ubuntu/anaconda/bin/supervisord
$ ps -aef | grep python
/home/ubuntu/anaconda/bin/python /home/ubuntu/anaconda/bin/supervisord
/home/ubuntu/anaconda/bin/python emission/net/api/cfc_webapp.py
Check that the logs are working. Oops, wrong path configured.
root 12247 12246 1 01:04 ? 00:00:01 /home/ubuntu/anaconda/bin/python emission/net/api/cfc_webapp.py
$ sudo kill -9 12247
root 12414 12413 94 01:07 ? 00:00:00 /home/ubuntu/anaconda/bin/python emission/net/api/cfc_webapp.py
Now logs are in the right location.
2016-09-21 01:07:19,353:DEBUG:139967128368960:This should go to the log file
Now, at long last, restart the intake pipeline to process data for the past two days. And then.... time to turn it back on forever...
Several issues while running the pipeline.
First, there are several people who have no entries in the timeseries database. Previously, we used to filter them out because we only looked at people who were in the long term database, but now that we look at everybody, that doesn't work any more.
In [279]: edb.get_timeseries_db().find({"user_id": UUID('eede2768-df3f-4f59-9555-445d9aee7d70')}).count()
Out[279]: 0
We should really delete such people, but I am currently really wary of deleting anything. Instead, I will add a check to the intake pipeline to weed them out.
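A sketch of that check, reusing the same query as above; where it plugs into the intake launcher is an assumption:

```python
import emission.core.get_database as edb

def has_timeseries_entries(user_id):
    # Users like eede2768-... above have zero entries in the timeseries;
    # running the pipeline for them is pointless.
    return edb.get_timeseries_db().find({"user_id": user_id}).count() > 0

# hypothetical filtering in the intake launcher:
# uuid_list = [u for u in uuid_list if u is not None and has_timeseries_entries(u)]
```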
Second, all three processes have crashed in the USERCACHE stage. Unclear why - need to investigate.
[{u'_id': ObjectId('57dfd58588f663018ccbdc77'),
u'curr_run_ts': 1474421642.952858,
u'last_processed_ts': 1458230276.237236,
u'last_ts_run': 1474286976.366866,
u'pipeline_stage': 0,
u'user_id': UUID('e9cd5530-719d-4ca3-992e-ff35636e06e8')},
{u'_id': ObjectId('57dfd8ff88f663018ccbdcbe'),
u'curr_run_ts': 1474420687.125368,
u'last_processed_ts': 1462576429.402939,
u'last_ts_run': 1474287866.051583,
u'pipeline_stage': 0,
u'user_id': UUID('7aab862f-fa38-4484-909d-1da0a95f594d')},
{u'_id': ObjectId('57e1e21988f66330f265846d'),
u'curr_run_ts': 1474421268.311108,
u'last_processed_ts': None,
u'last_ts_run': None,
u'pipeline_stage': 0,
u'user_id': UUID('99b29da2-989d-4e54-a211-284bde1d362d')}]
At least one of the crashes is
File "/home/ubuntu/anaconda/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/ubuntu/anaconda/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/mnt/e-mission/e-mission-server/emission/pipeline/intake_stage.py", line 52, in run_intake_pipeline
uh.moveToLongTerm()
File "/mnt/e-mission/e-mission-server/emission/net/usercache/builtin_usercache_handler.py", line 85, in moveToLongTerm
ts.bulk_insert(unified_entry_list, etsa.EntryType.DATA_TYPE)
File "/mnt/e-mission/e-mission-server/emission/storage/timeseries/builtin_timeseries.py", line 269, in bulk_insert
ts_enum_map[data_type].insert(entries)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pymongo/collection.py", line 410, in insert
_check_write_command_response(results)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pymongo/helpers.py", line 202, in _check_write_command_response
raise DuplicateKeyError(error.get("errmsg"), 11000, error)
DuplicateKeyError: insertDocument :: caused by :: 11000 E11000 duplicate key error index: Stage_database.Stage_timeseries.$_id_ dup key: { : ObjectId('56fe93e91268798022708b00') }
We used to get DuplicateKeyErrors before as well - we handled them with try/catch. Now the insert is outside the try/catch. We need to use continue_on_error=True:
http://stackoverflow.com/questions/10388540/insert-or-ignore-multiple-documents-in-mongodb/12268247
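With the pymongo 2.x API, that would look roughly like this; exactly where it goes inside bulk_insert is an assumption:

```python
import logging
import pymongo

def insert_ignoring_duplicates(collection, entries):
    # continue_on_error=True makes mongod keep inserting the rest of the
    # batch after a duplicate _id; the DuplicateKeyError is still raised at
    # the end, so catch and log it instead of aborting the stage.
    try:
        collection.insert(entries, continue_on_error=True)
    except pymongo.errors.DuplicateKeyError as e:
        logging.info("ignoring duplicate key error %s" % e)
```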
wrt https://github.com/e-mission/e-mission-server/issues/407#issuecomment-248486809, this is the actual error. We should really make this more robust as well.
2016-09-21 01:27:53,125:ERROR:140282532489024:Cleaning and resampling failed for user 1c8a8834-60d3-4dac-8e35-e97c73d434ca
Traceback (most recent call last):
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 72, in clean_and_resample
last_raw_place = save_cleaned_segments_for_ts(user_id, time_query.startTs, time_query.endTs)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 90, in save_cleaned_segments_for_ts
return save_cleaned_segments_for_timeline(user_id, tl)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 116, in save_cleaned_segments_for_timeline
(last_cleaned_place, filtered_tl) = create_and_link_timeline(tl, user_id, trip_map)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 779, in create_and_link_timeline
curr_cleaned_start_place = get_filtered_place(tl.first_place())
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 208, in get_filtered_place
_copy_non_excluded(old_data=raw_place.data,
AttributeError: 'NoneType' object has no attribute 'data'
Found error while processing trip Entry({u'data': {u'distance': 1161.6474633112616, u'end_place': ObjectId('57e01dd688f66307854b681e'), u'raw_trip': ObjectId('57e01dc888f66307854b669e'), u'start_loc': {u'type': u'Point', u'coordinates': <redacted>}, u'end_ts': 1473664937.272, u'start_ts': 1473664226.7945209, u'start_fmt_time': u'2016-09-12T09:10:26.794521', u'end_loc': {u'type': u'Point', u'coordinates': <redacted>}, u'source': u'DwellSegmentationTimeFilter', u'start_place': ObjectId('57e01dd688f66307854b681d'), u'end_fmt_time': u'2016-09-12T09:22:17.272000', u'end_local_dt': {u'hour': 9, u'month': 9, u'second': 17, u'weekday': 0, u'year': 2016, u'timezone': <redacted>, u'day': 12, u'minute': 22}, u'duration': 710.4774792194366, u'start_local_dt': {u'hour': 9, u'month': 9, u'second': 26, u'weekday': 0, u'year': 2016, u'timezone': u'', u'day': 12, u'minute': 10}}, u'_id': ObjectId('57e01dd588f66307854b67db'), u'user_id': UUID(<redacted>), u'metadata': {u'write_fmt_time': u'2016-09-19T10:18:13.132269-07:00', u'write_ts': 1474305493.132269, u'time_zone': u'America/Los_Angeles', u'platform': u'server', u'write_local_dt': {u'hour': 10, u'month': 9, u'second': 13, u'weekday': 0, u'year': 2016, u'timezone': u'America/Los_Angeles', u'day': 19, u'minute': 18}, u'key': u'analysis/cleaned_trip'}})
Traceback (most recent call last):
File "/mnt/e-mission/e-mission-server/emission/analysis/plotting/geojson/geojson_feature_converter.py", line 221, in get_geojson_for_timeline
trip_geojson = trip_to_geojson(trip, tl)
File "/mnt/e-mission/e-mission-server/emission/analysis/plotting/geojson/geojson_feature_converter.py", line 184, in trip_to_geojson
section_gj = section_to_geojson(section, tl)
File "/mnt/e-mission/e-mission-server/emission/analysis/plotting/geojson/geojson_feature_converter.py", line 117, in section_to_geojson
assert(False)
AssertionError
Traceback (most recent call last):
File "/home/ubuntu/anaconda/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/ubuntu/anaconda/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/mnt/e-mission/e-mission-server/emission/pipeline/intake_stage.py", line 58, in run_intake_pipeline
uh.moveToLongTerm()
File "/mnt/e-mission/e-mission-server/emission/net/usercache/builtin_usercache_handler.py", line 85, in moveToLongTerm
ts.bulk_insert(unified_entry_list, etsa.EntryType.DATA_TYPE, continue_on_error=True)
TypeError: bulk_insert() got an unexpected keyword argument 'continue_on_error'
Found error single positional indexer is out-of-bounds while processing trip Entry({u'data': {u'distance': 4.959556067689948, u'end_place': ObjectId('57e00f3288f66307854b2542'), u'start_loc': {u'type': u'Point', u'coordinates': }, u'end_ts': 1471814458.878665, u'start_ts': 1471813523.073008, u'start_fmt_time': u'2016-08-21T18:05:23.073008', u'end_loc': {u'type': u'Point', u'coordinates': [-46.47651544614448, -23.42357263594097]}, u'source': u'DwellSegmentationDistFilter', u'start_place': ObjectId('57e00f3288f66307854b2540'), u'end_fmt_time': u'2016-08-21T18:20:58.878665', u'end_local_dt': {u'hour': 18, u'month': 8, u'second': 58, u'weekday': 6, u'year': 2016, u'timezone': u'', u'day': 21, u'minute': 20}, u'duration': 935.8056569099426, u'start_local_dt': {u'hour': 18, u'month': 8, u'second': 23, u'weekday': 6, u'year': 2016, u'timezone': u'', u'day': 21, u'minute': 5}}, u'_id': ObjectId('57e00f3288f66307854b2541'), u'user_id': UUID(''), u'metadata': {u'write_fmt_time': u'2016-09-19T09:15:46.486968-07:00', u'write_ts': 1474301746.486968, u'time_zone': u'America/Los_Angeles', u'platform': u'server', u'write_local_dt': {u'hour': 9, u'month': 9, u'second': 46, u'weekday': 0, u'year': 2016, u'timezone': u'America/Los_Angeles', u'day': 19, u'minute': 15}, u'key': u'segmentation/raw_trip'}})
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 101, in save_cleaned_segments_for_timeline
filtered_trip = get_filtered_trip(ts, trip)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 166, in get_filtered_trip
section_map[section.get_id()] = get_filtered_section(filtered_trip_entry, section)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 255, in get_filtered_section
with_speeds_df = get_filtered_points(section, filtered_section_data)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 337, in get_filtered_points
filtered_loc_df = _add_start_point(filtered_loc_df, raw_start_place, ts)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 439, in _add_start_point
curr_first_point = filtered_loc_df.iloc[0]
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1286, in __getitem__
return self._getitem_axis(key, axis=0)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1573, in _getitem_axis
self._is_valid_integer(key, axis)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1487, in _is_valid_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
Found error single positional indexer is out-of-bounds while processing trip Entry({u'data': {u'distance': 605.5213502554404, u'end_place': ObjectId('57e16c3a88f6631d412f5d78'), u'start_loc': {u'type': u'Point', u'coordinates':}, u'end_ts': 1466656863, u'start_ts': 1466651357, u'start_fmt_time': u'2016-06-22T20:09:17', u'end_loc': {u'type': u'Point', u'coordinates':}, u'source': u'DwellSegmentationTimeFilter', u'start_place': ObjectId('57e16c3a88f6631d412f5d76'), u'end_fmt_time': u'2016-06-22T21:41:03-07:00', u'end_local_dt': {u'hour': 21, u'month': 6, u'second': 3, u'weekday': 2, u'year': 2016, u'timezone': u'America/Los_Angeles', u'day': 22, u'minute': 41}, u'duration': 5506, u'start_local_dt': {u'hour': 20, u'month': 6, u'second': 17, u'weekday': 2, u'year': 2016, u'timezone': u'America/Los_Angeles', u'day': 22, u'minute': 9}}, u'_id': ObjectId('57e16c3a88f6631d412f5d77'), u'user_id': UUID(''), u'metadata': {u'write_fmt_time': u'2016-09-20T10:04:58.877571-07:00', u'write_ts': 1474391098.877571, u'time_zone': u'America/Los_Angeles', u'platform': u'server', u'write_local_dt': {u'hour': 10, u'month': 9, u'second': 58, u'weekday': 1, u'year': 2016, u'timezone': u'America/Los_Angeles', u'day': 20, u'minute': 4}, u'key': u'segmentation/raw_trip'}})
Traceback (most recent call last):
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 101, in save_cleaned_segments_for_timeline
filtered_trip = get_filtered_trip(ts, trip)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 166, in get_filtered_trip
section_map[section.get_id()] = get_filtered_section(filtered_trip_entry, section)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 255, in get_filtered_section
with_speeds_df = get_filtered_points(section, filtered_section_data)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 365, in get_filtered_points
resampled_loc_df = resample(filtered_loc_df, interval=30)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 680, in resample
tz_new = [_get_timezone(ts, tz_ranges_df) for ts in ts_new]
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 698, in _get_timezone
return sel_entry.timezone.iloc[0]
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1286, in __getitem__
return self._getitem_axis(key, axis=0)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1573, in _getitem_axis
self._is_valid_integer(key, axis)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1487, in _is_valid_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
wrt https://github.com/e-mission/e-mission-server/issues/407#issuecomment-248493235, moved it to the timeseries.
wrt https://github.com/e-mission/e-mission-server/issues/407#issuecomment-248494724, we have a check for loc_df at the beginning of resample
def resample(filtered_loc_df, interval):
    if len(loc_df) == 0 or len(loc_df) == 1:
        logging.debug("len(loc_df) = %s, early return" % (len(loc_df)))
        return loc_df
We should be able to handle missing points in add_first_point etc., and we can extrapolate to the end of the last trip. The only question is - what should we use as the time if we have no points to determine speed from?
But there is no such check at the beginning of _add_start_point.
2016-09-21 01:18:46,479:DEBUG:140282532489024:keys = (segmentation/raw_stop, segmentation/raw_section), len(places) = 0, len(trips) = 1
...
2016-09-21 01:18:46,494:DEBUG:140282532489024:deleting 5 points from section points
2016-09-21 01:18:46,498:DEBUG:140282532489024:Found 5 results
...
2016-09-21 01:18:46,503:DEBUG:140282532489024:After filtering, rows went from 5 -> 0, cols from 23 -> 22
...
2016-09-21 01:18:46,507:DEBUG:140282532489024:Raw place is 57e00f3288f66307854b2540 and corresponding location is 57bb92410a3eb8b1300477c3
2016-09-21 01:18:46,507:ERROR:140282532489024:Found error single positional indexer is out-of-bounds
What does it even mean to delete every single point? That's a zero length section, that is. But we don't have the concept of squishing sections, only trips and the squished trips gave us enough headaches.
In general, if we try to do something very fancy, we will fail. We need to find something simple. One thing to note is that the associated section is very mysterious. Although it shows up in intake log 0, and there is only one instance of that log, the first time that it shows up is in the cleaning stage.
Shouldn't it have shown up in the section segmentation stage?
$ bunzip2 -c intake_0.log.gz | grep 57e00f4288f66307854b25f6
2016-09-21 01:18:46,480:DEBUG:140282532489024:trips_or_sections = [ObjectId('57e00f4288f66307854b25f6')]
ubuntu@ip-10-203-173-119:/mnt/logs/emission$ ls *intake_0*
intake_0-errors.log.gz intake_0.log.gz
Yup, this time, SECTION_SEGMENTATION didn't do anything because it had already run. Basically, we got a bunch of errors in CLEAN_RESAMPLE earlier and now we are stuck because we are trying to reuse bad values. The good news is that we probably don't need to resegment, since it is either the usercache or the CLEAN_RESAMPLE that is messed up.
2016-09-21 01:18:14,125:INFO:140282532489024:For stage PipelineStages.SECTION_SEGMENTATION, start_ts = 2016-09-16T20:28:45.362935
2016-09-21 01:18:14,126:DEBUG:140282532489024:for user 202c2eca-b079-43e3-aeec-4a1562ebefa9, returning timeseries <emission.storage.timeseries.builtin_timeseries.BuiltinTimeSeries object at 0x7f95f1e0f710>
2016-09-21 01:18:14,126:DEBUG:140282532489024:curr_query = {'$or': [{'metadata.key': 'segmentation/raw_trip'}], 'data.end_ts': {'$lte': 1474420689.125559, '$gte': 1474057725.362935}, 'user_id': UUID('202c2eca-b079-43e3-aeec-4a1562ebefa9')}, sort_key = data.end_ts
2016-09-21 01:18:14,126:DEBUG:140282532489024:orig_ts_db_keys = [], analysis_ts_db_keys = ['segmentation/raw_trip']
2016-09-21 01:18:14,126:DEBUG:140282532489024:finished querying values for []
2016-09-21 01:18:14,127:DEBUG:140282532489024:finished querying values for ['segmentation/raw_trip']
2016-09-21 01:18:14,392:DEBUG:140282532489024:Returning entry with length 0 result
2016-09-21 01:18:14,393:INFO:140282532489024:For stage PipelineStages.SECTION_SEGMENTATION, last_ts_processed is unchanged
Let's see what the outcome of the investigation into the big walk sections is.
There might have been a similar crash earlier, let's grep the prior logs.
wrt https://github.com/e-mission/e-mission-server/issues/407#issuecomment-248480746, that seems pretty high. Looking back at the metrics, I found several days with suspicious trips.
ModeStatTimeSummary({'nUsers': 1, 'ON_FOOT': 51625.38093002352, 'local_dt': LocalDate({'hour': 21, 'month': 9, 'second': 16, 'weekday': 2, 'year': 2015, 'timezone': 'UTC', 'day': 30, 'minute': 33}), 'ts': 1443648796, 'fmt_time': '2015-09-30T21:33:16.211080+00:00'}),
ModeStatTimeSummary({'nUsers': 1, 'ON_FOOT': 67935.68624558138, 'local_dt': LocalDate({'hour': 1, 'month': 12, 'second': 19, 'weekday': 6, 'year': 2015, 'timezone': 'UTC', 'day': 20, 'minute': 1}), 'ts': 1450573279, 'fmt_time': '2015-12-20T01:01:19.222216+00:00'}),
ModeStatTimeSummary({'nUsers': 1, 'ON_FOOT': 94123.80714983154, 'local_dt': LocalDate({'hour': 3, 'month': 12, 'second': 41, 'weekday': 5, 'year': 2015, 'timezone': 'UTC', 'day': 26, 'minute': 25}), 'ts': 1451100341, 'fmt_time': '2015-12-26T03:25:41.761620+00:00'}),
ModeStatTimeSummary({'nUsers': 1, 'ON_FOOT': 10611.549820243947, 'local_dt': LocalDate({'hour': 23, 'month': 4, 'second': 52, 'weekday': 4, 'year': 2016, 'timezone': 'UTC', 'day': 8, 'minute': 21}), 'ts': 1460157712, 'fmt_time': '2016-04-08T23:21:52.997999+00:00'}),
ModeStatTimeSummary({'nUsers': 1, 'ON_FOOT': 23612.511889582733, 'local_dt': LocalDate({'hour': 16, 'month': 4, 'second': 44, 'weekday': 3, 'year': 2016, 'timezone': 'UTC', 'day': 14, 'minute': 9}), 'ts': 1460650184, 'fmt_time': '2016-04-14T16:09:44.752386+00:00'}),
ModeStatTimeSummary({'nUsers': 1, 'ON_FOOT': 57683.09450387482, 'local_dt': LocalDate({'hour': 0, 'month': 7, 'second': 53, 'weekday': 5, 'year': 2016, 'timezone': 'UTC', 'day': 23, 'minute': 11}), 'ts': 1469232713, 'fmt_time': '2016-07-23T00:11:53.653433+00:00'}),
Looking at the points locally, we see suspicious looking sections for:
Why are we getting metrics for the other sections? What's wrong with these sections? May add a simple set of rules for walk and bike to avoid such outliers at this point.
(Screenshots for Phone 1, Phone 2, and Phone 3 omitted.)
So the metrics are legit - the dates are confusing.
Here are the bad trips for the other days. The 23rd is missing on my computer.
(Screenshots of the bad trips for Dec 20, Dec 26, and Apr 8 omitted.)
These are too varied to try and fix now. For now, I will just say that walk sections that are > 10 km and with speed > 10 km/hr should be marked as UNKNOWN. And bike sections that are > 100 km and with speed > 40 km/hr should be marked as UNKNOWN.
- Beginners: 13-15
- Decent riders: 15-17
- Better: 18-19
- Pretty good: 19-20
- Lower cat 3-4 racers: 20-22
- Upper cat 1-2 racers: 22-25
- Winners: 25+
That should work for everybody except race winners. And UNKNOWN isn't the end of the world as long as we allow people to edit it.
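A sketch of that rule with the thresholds above; the mode names match ecwm.MotionTypes, everything else is assumed:

```python
def sanity_check_mode(sensed_mode, distance_m, duration_s):
    # Implausibly long and fast walk/bike sections get demoted to UNKNOWN;
    # the user can still correct the mode manually.
    dist_km = distance_m / 1000.0
    speed_kmph = (dist_km / (duration_s / 3600.0)) if duration_s > 0 else float("inf")
    if sensed_mode == "ON_FOOT" and dist_km > 10 and speed_kmph > 10:
        return "UNKNOWN"
    if sensed_mode == "BICYCLING" and dist_km > 100 and speed_kmph > 40:
        return "UNKNOWN"
    return sensed_mode
```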
So technically, we should be able to nuke only the CLEAN_RESAMPLING stage and rerun.
wrt https://github.com/e-mission/e-mission-server/issues/407#issuecomment-248514593, after grepping the prior logs, we see
2016-09-19 16:16:02,784:DEBUG:139656908773184:Inserting entry Entry({'data': Section({' sensed_mode': 4,
'start_loc': {"coordinates": [-46.47646789515715, -23.42356338461582], "type": "Point"},
'end_ts': 1471814458.878665,
'start_ts': 1471813523.073008,
'start_fmt_time': u'2016-08-21T18:05:23.073008-03:00',
'end_loc': {"coordinates": [-46.47651544614448, -23.42357263594097], "type": "Point"},
'source': 'SmoothedHighConfidenceMotion',
'end_fmt_time': u'2016-08-21T18:20:58.878665-03:00',
'duration': 935.8056569099426,
'trip_id': ObjectId('57e00f3288f66307854b2541'),
'_id': ObjectId('57e00f4288f66307854b25f6'),
'user_id': UUID('d9667aad-8afc-3619-9fce-7f70ffe1695c'),
'key': 'segmentation/raw_section'})}) into timeseries
2016-09-19 16:16:27,447:INFO:139656908773184:^^^^^^^^^^^^^^^^^^^^Smoothing section 57e00f4288f66307854b25f6 for user d9667aad-8afc-3619-9fce-7f70ffe1695c^^^^^^^^^^^^^^^^^^^^
2016-09-19 16:16:27,530:DEBUG:139656908773184:Found 5 results
2016-09-19 16:16:27,534:DEBUG:139656908773184:Found iOS section, filling in gaps with fake data
2016-09-19 16:16:27,535:DEBUG:139656908773184:Found large gap ending at 1, filling it
2016-09-19 16:16:27,536:DEBUG:139656908773184:start = 0, end = 1, generating entries between 1471813523.07 and 1471813857.45
2016-09-19 16:16:27,536:DEBUG:139656908773184:Fill lengths are: dist 5, angle 5, lat 5, lng 5
fmt_time ts latitude longitude \
0 2016-08-21T18:05:23.073008-03:00 1.471814e+09 -23.423563 -46.476468
1 NaN 1.471814e+09 -23.424014 -46.474633
2 NaN 1.471814e+09 -23.423764 -46.474931
3 NaN 1.471814e+09 -23.423649 -46.474971
4 NaN 1.471814e+09 -23.423604 -46.475048
fmt_time ts latitude longitude \
5 NaN 1.471814e+09 -23.423635 -46.475114
6 2016-08-21T18:10:57.452023-03:00 1.471814e+09 -23.423583 -46.476499
7 2016-08-21T18:15:58.169679-03:00 1.471814e+09 -23.423394 -46.476613
8 2016-08-21T18:16:07.169679-03:00 1.471814e+09 -23.423394 -46.476613
9 2016-08-21T18:20:58.878665-03:00 1.471814e+09 -23.423573 -46.476515
2016-09-19 16:16:27,613:DEBUG:139656908773184:For cluster 0 - 1, distance = 0.0, is_cluster = True
2016-09-19 16:16:27,615:DEBUG:139656908773184:For cluster 1 - 6, distance = 64.6947694402, is_cluster = True
2016-09-19 16:16:27,616:DEBUG:139656908773184:For cluster 6 - 10, distance = 2.04404326234, is_cluster = True
2016-09-19 16:16:27,619:DEBUG:139656908773184:bad_segments = [Segment(0, 1, 0.0), Segment(6, 10, 2.04404326234)]
2016-09-19 16:16:27,620:DEBUG:139656908773184:after setting values, outlier_mask = [0 6 7 8 9]
So it is related to the iOS fix. We add a bunch of fake points, we strip out all the real points, and then we drop the fake points with dropna(), so we have no points left at all. Returning the start point at the enter time seems like a good compromise.
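A sketch of that compromise; the column names are assumed from the dataframes above:

```python
import pandas as pd

def fallback_to_start_point(filtered_loc_df, start_loc, enter_ts):
    # If smoothing + dropna() stripped every point from the section, build a
    # single row from the section start location at the enter timestamp so
    # that downstream .iloc[0] accesses still work.
    if len(filtered_loc_df) > 0:
        return filtered_loc_df
    return pd.DataFrame([{
        "ts": enter_ts,
        "longitude": start_loc["coordinates"][0],
        "latitude": start_loc["coordinates"][1],
        "loc": start_loc,
    }])
```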
2016-09-21 00:36:33,230:DEBUG:140735159468032:Considering trip 57e216e1f6858f4ba0fb8fa3: 2015-12-19T16:08:30.017000-08:00 -> 2015-12-19T16:08:59.991000-08:00
2016-09-21 00:36:33,231:DEBUG:140735159468032:Considering trip 57e216e1f6858f4ba0fb8fa5: 2015-12-19T17:30:50.123000-08:00 -> 2015-12-19T17:32:19.534000-08:00
2016-09-21 00:36:33,231:DEBUG:140735159468032:Considering trip 57e216e1f6858f4ba0fb8fa9: 2015-12-19T17:32:19.534000-08:00 -> 2015-12-20T18:14:09.001000-08:00
Looks like the section was actually fairly small and we extrapolated.
This is a problem with the squishing. We basically didn't do anything special to the sections because they were initially small, and then we squished them together and changed the distance and time.
small section
2016-09-20 22:25:25,864:DEBUG:140735159468032:Found first section, need to extrapolate start point
2016-09-20 22:25:25,865:DEBUG:140735159468032:get_entry_at_ts query = {'metadata.key': 'background/filtered_location', 'user_id': UUID('c1562232-a371-4272-bb4a-6f0f7b1fee18'), 'data.ts': 1450570139.991}
2016-09-20 22:25:25,866:DEBUG:140735159468032:Raw place is 57e216e1f6858f4ba0fb8fa4 and corresponding location is 56776eb95771abda98a06ce1
2016-09-20 22:25:25,878:DEBUG:140735159468032:Adding distance 198.578722673 to original 13.7514191908 to extend section start from [-119.5721166, 37.7375974] to [-119.574155, 37.738366]
2016-09-20 22:25:25,880:DEBUG:140735159468032:After subtracting time 1740.90078339 from original 89.4110000134 to cover additional distance 198.578722673 at speed 0.114066651338, new_start_ts = 1450573309.22
Look at the difference in the squishing mismatch.
2016-09-20 22:36:46,776:DEBUG:140735159468032:squishing mismatch: resetting trip start_loc [-119.574155, 37.738366] to cleaned_start_place.location [-120.2336796, 38.0547286]
For now, in the absence of time to do a better job, I'm just going to abandon the squishing if the gap is too much. We will end up with a big visual gap, but that is significantly less bad than having giant fake trips.
If there is a huge gap, we should have an untracked section. Apparently we did, but we also had a bunch of squished trips. Need to investigate further later.
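A self-contained sketch of "abandon the squishing if the gap is too much"; the distance threshold and helper here are assumptions, not the actual fix:

```python
import math

MAX_SQUISH_DISTANCE_M = 1000  # assumed threshold

def distance_m(loc1, loc2):
    # haversine distance between two [lon, lat] points, in meters
    lon1, lat1 = loc1
    lon2, lat2 = loc2
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371000 * math.asin(math.sqrt(a))

def should_squish(trip_start_loc, cleaned_place_loc):
    # e.g. the [-119.57, 37.74] -> [-120.23, 38.05] mismatch above is tens of
    # km, so we leave the gap instead of resetting the trip start.
    return distance_m(trip_start_loc, cleaned_place_loc) <= MAX_SQUISH_DISTANCE_M
```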
That fixed the 19th/20th, but not the 26th, because in that case it was actually detected as a trip in the first place.
2016-09-21 01:57:30,624:DEBUG:140735159468032:Considering place 57e216e2f6858f4ba0fb8fe4: 2015-12-25T18:45:43.458000-08:00 -> 2015-12-25T19:26:17.173000-08:00
2016-09-21 01:57:30,624:DEBUG:140735159468032:Considering place 57e216e2f6858f4ba0fb8fe6: 2015-12-26T20:35:31.522000-08:00 -> 2015-12-26T20:42:31.351000-08:00
Already a trip
2016-09-21 01:57:31,888:DEBUG:140735159468032:Considering trip 57e216e2f6858f4ba0fb8fe5: 2015-12-25T19:26:17.173000-08:00 -> 2015-12-26T20:35:31.522000-08:00
2016-09-20 20:36:46,143:DEBUG:140735159468032:------------------------------2015-12-25T19:27:11.072000-08:00------------------------------
2016-09-20 20:36:46,146:DEBUG:140735159468032:last5MinsDistances = [ 62.62879746 20.62203581] with length 2
2016-09-20 20:36:46,148:DEBUG:140735159468032:last10PointsDistances = [ 62.62879746 20.62203581 0. ] with length 3, shape (3,)
2016-09-20 20:36:46,150:DEBUG:140735159468032:len(last10PointsDistances) = 3, len(last5MinsDistances) = 2
2016-09-20 20:36:46,150:DEBUG:140735159468032:last5MinsTimes.max() = 53.8989999294, time_threshold = 300
2016-09-20 20:36:46,151:DEBUG:140735159468032:prev_point.ts = 1451100407.07, curr_point.ts = 1451100431.07, time gap = 23.9990000725 (vs 300), distance_gap = 20.6220358141 (vs 100), speed_gap = 0.859287293297 (vs 0.333333333333) continuing trip
2016-09-20 20:36:46,151:DEBUG:140735159468032:Too few points to make a decision, continuing
2016-09-20 20:36:46,152:DEBUG:140735159468032:------------------------------2015-12-26T20:27:14.622000-08:00------------------------------
2016-09-20 20:36:46,154:DEBUG:140735159468032:last5MinsDistances = [] with length 0
2016-09-20 20:36:46,156:DEBUG:140735159468032:last10PointsDistances = [ 93131.80518862 93173.47398066 93192.32726122 0. ] with length 4, shape (4,)
2016-09-20 20:36:46,157:DEBUG:140735159468032:len(last10PointsDistances) = 4, len(last5MinsDistances) = 0
2016-09-20 20:36:46,158:DEBUG:140735159468032:last5MinsTimes.max() = nan, time_threshold = 300
2016-09-20 20:36:46,158:DEBUG:140735159468032:prev_point.ts = 1451100431.07, curr_point.ts = 1451190434.62, time gap = 90003.55 (vs 300), distance_gap = 93192.3272612 (vs 100), speed_gap = 1.03542946096 (vs 0.333333333333) continuing trip
2016-09-20 20:36:46,158:DEBUG:140735159468032:Too few points to make a decision, continuing
We can either fix this by splitting at the TRIP_SEGMENTATION level (if the gap is more than an hour, split), or at the CLEANING level.
For now, going to resolve this at the cleaning level. What if you did have a really long trip with a large gap in the middle - e.g. if you were underground or with bad reception for a long time? Although what really happened here was that we really were untracked for a long time, so maybe we should reflect that reality?
Going with reflecting that by splitting on gaps of more than 12 hours.
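In other words, something along these lines at the cleaning level; the exact integration point is an assumption:

```python
TWELVE_HOURS = 12 * 60 * 60

def should_split_as_untracked(prev_point_ts, curr_point_ts):
    # A gap of more than 12 hours between consecutive points is treated as an
    # untracked period rather than one giant trip (the Dec 25 -> Dec 26 case
    # above had a ~90000 second gap).
    return (curr_point_ts - prev_point_ts) > TWELVE_HOURS
```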
Everything seems fixed now.
>>> list(edb.get_analysis_timeseries_db().find({"metadata.key": "analysis/cleaned_section", "data.distance": {"$gt": 10*1000}, "data.sensed_mode": ecwm.MotionTypes.ON_FOOT.value}))
[]
>>> list(edb.get_analysis_timeseries_db().find({"metadata.key": "analysis/cleaned_section", "data.distance": {"$gt": 50*1000}, "data.sensed_mode": ecwm.MotionTypes.BICYCLING.value}))
[]
>>> for s in list(edb.get_analysis_timeseries_db().find({"metadata.key": "analysis/cleaned_section", "data.distance": {"$gt": 3*1000}, "data.sensed_mode": ecwm.MotionTypes.ON_FOOT.value})):
...     print s["data"]["distance"], s["data"]["duration"], s["data"]["start_fmt_time"], pd.Series(s["data"]["speeds"]).median()
9499.65904538 9529.65904538 2015-07-15T08:53:17.792799-07:00 1.32702602016
4133.75881166 2136.11605334 2015-08-07T17:47:06.215947-07:00 1.68215469347
3663.24108658 3693.24108658 2015-12-08T12:18:26.893563-08:00 1.3832152966
5110.4193003 5140.4193003 2016-02-02T08:22:41.047555-08:00 2.63330721524
6068.1867019 6098.1867019 2016-02-15T10:46:20.997166-08:00 1.77030962961
3535.00596661 2107.50368905 2016-02-27T15:34:12.235311-08:00 1.2763648927
6872.03554323 4401.55729842 2016-02-06T14:48:22.262702-08:00 0.962844162818
3347.75565082 3377.75565082 2016-04-13T12:06:58.509962-07:00 1.42158653283
3307.74112384 3337.74112384 2016-02-09T10:24:14.288667-08:00 2.08387453137
3272.37948726 1246.07399988 2016-08-12T12:27:08.926000-07:00 2.06690369761
3130.47158556 3160.47158556 2015-10-20T15:10:49.006391-07:00 3.05647895952
4053.63350334 746 2015-09-24T19:48:23-07:00 5.1067850809 (MISCLASSIFIED FROM BIKE)
4430.30044196 1110.0508492 2015-08-06T15:41:29.344151-07:00 1.52645984651
3073.42221492 3103.42221492 2016-05-29T17:43:28.958045-07:00 1.02381084003
3513.83611733 1017.12506318 2015-08-08T09:11:03.109937-07:00 3.6925780055
3100.01001716 600.63499999 2015-07-13T15:30:31.576000-07:00 3.40898536236
>>> for s in list(edb.get_analysis_timeseries_db().find({"metadata.key": "analysis/cleaned_section", "data.distance": {"$gt": 10*1000}, "data.sensed_mode": ecwm.MotionTypes.BICYCLING.value})):
...     print s["data"]["distance"], s["data"]["duration"], s["data"]["start_fmt_time"], pd.Series(s["data"]["speeds"]).median()
<nothing>
Re-deploying. Overriding direct changes:
$ git checkout emission/pipeline/intake_stage.py
$ git diff emission/storage/^C
$ git checkout emission/storage/timeseries/builtin_timeseries.py
$ ./e-mission-py.bash bin/reset_analysis_database.py
{u'ok': 1, u'n': 973561}
{u'ok': 1, u'n': 0}
{u'ok': 1, u'n': 0}
{u'ok': 1, u'n': 1347}
Relaunched collection. Did not monitor it. Was done by around 7pm when I checked. So I can re-run the entire pipeline for all current users in ~ 6 hours? Still need to check for errors.
Three fairly innocuous errors show up fairly consistently.
2016-09-21 22:04:46,720:ERROR:140290633410368:Backtrace time
Traceback (most recent call last):
File "/mnt/e-mission/e-mission-server/emission/net/usercache/builtin_usercache_handler.py", line 71, in moveToLongTerm
unified_entry = enuf.convert_to_common_format(entry)
File "/mnt/e-mission/e-mission-server/emission/net/usercache/formatters/formatter.py", line 6, in convert_to_common_format
return format_fn(entry)
File "/mnt/e-mission/e-mission-server/emission/net/usercache/formatters/ios/transition.py", line 55, in format
data.transition = transition_map[entry.data.transition].value
KeyError: None
2016-09-21 21:58:47,974:ERROR:140290633410368:Distance between points AttrDict({u'local_dt_hour': 22, u'fmt_time': u'2016-04-26T22:37:46.447609-07:00', u'local_dt_timezone': u'America/Los_Angeles', u'local_dt_second': 46, u'local_dt_year': 2016, u'ts': 1461735466.447609, u'local_dt_minute': 37, u'local_dt_weekday': 1, u'loc': {u'type': u'Point', u'coordinates': [-122.253448856831, 37.86768408032536]}, 'user_id': UUID('43f9361e-1cb6-4026-99ba-458be357d245'), u'floor': 0, 'metadata_write_ts': 1461735468.235691, u'altitude': 93.71350860595703, u'vaccuracy': 10.0, u'local_dt_month': 4, u'longitude': -122.253448856831, u'filter': u'distance', u'local_dt_day': 26, u'latitude': 37.86768408032536, '_id': ObjectId('5720522ed282b8f30deba376'), u'heading': -1.0, u'sensed_speed': -1.0, u'accuracy': 121.0280224618877}), AttrDict({u'local_dt_hour': 22, u'fmt_time': u'2016-04-26T22:37:46.447609-07:00', u'local_dt_timezone': u'America/Los_Angeles', u'local_dt_second': 46, u'local_dt_year': 2016, u'ts': 1461735466.447609, u'local_dt_minute': 37, u'local_dt_weekday': 1, u'loc': {u'type': u'Point', u'coordinates': [-122.2531523366209, 37.86722646225704]}, 'user_id': UUID('43f9361e-1cb6-4026-99ba-458be357d245'), u'floor': 0, 'metadata_write_ts': 1461735468.175485, u'altitude': 93.75988006591797, u'vaccuracy': 10.0, u'local_dt_month': 4, u'longitude': -122.2531523366209, u'filter': u'distance', u'local_dt_day': 26, u'latitude': 37.86722646225704, '_id': ObjectId('5720522ed282b8f30deba374'), u'heading': -1.0, u'sensed_speed': -1.0, u'accuracy': 121.0280224618877}) is 57.1556044127, although the time delta = 0
2016-09-21 21:09:24,806:ERROR:140290633410368:Unable to pre-fill reverse geocoded information, client has to do it
Traceback (most recent call last):
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 220, in get_filtered_place
filtered_place_data.display_name = format_result(reverse_geocoded_json)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 1068, in format_result
name = "%s, %s" % (get_fine(rev_geo_result), get_coarse(rev_geo_result))
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 1066, in <lambda>
get_fine = lambda rgr: get_with_fallbacks(rgr["address"], ["road", "neighbourhood"])
KeyError: 'address'
All these three errors are already handled, so maybe they should be downgraded to warnings instead of errors.
Process | None transition | Distance between points | No geocoded info |
---|---|---|---|
0 | 5 | 5 | 134 |
1 | 1 | 8 | 42 |
2 | 0 | 12 | 91 |
These cover most of the messages in the error logs, except for process 1, which had some real errors left.
$ bzcat intake_0-errors.log.gz | grep ERROR | egrep -v "Distance between points|Unable to pre-fill|Backtrace time" | wc -l
0
$ bzcat intake_1-errors.log.gz | grep ERROR | egrep -v "Distance between points|Unable to pre-fill|Backtrace time" | wc -l
6
$ bzcat intake_2-errors.log.gz | grep ERROR | egrep -v "Distance between points|Unable to pre-fill|Backtrace time" | wc -l
0
Let's take a look at the real errors on process 1. We need to pull the data for these users and re-run locally to experiment.
Traceback (most recent call last):
File "/mnt/e-mission/e-mission-server/emission/analysis/plotting/geojson/geojson_feature_converter.py", line 221, in get_geojson_for_timeline
trip_geojson = trip_to_geojson(trip, tl)
File "/mnt/e-mission/e-mission-server/emission/analysis/plotting/geojson/geojson_feature_converter.py", line 184, in trip_to_geojson
section_gj = section_to_geojson(section, tl)
File "/mnt/e-mission/e-mission-server/emission/analysis/plotting/geojson/geojson_feature_converter.py", line 117, in section_to_geojson
assert(False)
AssertionError
2016-09-21 20:45:27,127:ERROR:140290633410368:Storing views to cache failed for user 60fb5b80-e6c7-4a2b-ba69-ad3147712df3
Traceback (most recent call last):
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 101, in save_cleaned_segments_for_timeline
filtered_trip = get_filtered_trip(ts, trip)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 166, in get_filtered_trip
section_map[section.get_id()] = get_filtered_section(filtered_trip_entry, section)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 256, in get_filtered_section
with_speeds_df = get_filtered_points(section, filtered_section_data)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 356, in get_filtered_points
(filtered_section_data, filtered_loc_df.iloc[0], filtered_loc_df.iloc[-1]))
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1286, in __getitem__
return self._getitem_axis(key, axis=0)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1573, in _getitem_axis
self._is_valid_integer(key, axis)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1487, in _is_valid_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
2016-09-21 21:26:47,967:ERROR:140290633410368:Cleaning and resampling failed for user 95cf2b3e-f3c0-4da5-b8d0-0e460fad424d
Traceback (most recent call last):
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 101, in save_cleaned_segments_for_timeline
filtered_trip = get_filtered_trip(ts, trip)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 166, in get_filtered_trip
section_map[section.get_id()] = get_filtered_section(filtered_trip_entry, section)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 256, in get_filtered_section
with_speeds_df = get_filtered_points(section, filtered_section_data)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 368, in get_filtered_points
resampled_loc_df = resample(filtered_loc_df, interval=30)
File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_res
ample.py", line 713, in resample
tz_new = [_get_timezone(ts, tz_ranges_df) for ts in ts_new] File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_res
ample.py", line 731, in _get_timezone return sel_entry.timezone.iloc[0]
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1286, in __getitem__
return self._getitem_axis(key, axis=0)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1573, in _getitem_axis
self._is_valid_integer(key, axis)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", lin
e 1487, in _is_valid_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
2016-09-21 22:16:58,767:ERROR:140290633410368:Cleaning and resampling failed for user 48123fc6-a631-41fb-a68c-ed3d518bcc0d
2016-09-22 12:48:40,708:INFO:140735159468032:section_location_array[-1].data.loc {"coordinates": [12.5923698, 55.6602518], "type": "Point"} != section.data.end_loc {"coordinates": [12.59236978, 55.66025177], "type": "Point"} even after df.ts fix, filling gap
Easy fix!
This was a doozy caused by out of order points.
2016-09-22 14:40:09,193:DEBUG:140735159468032:with_speeds_df = fmt_time ts latitude longitude \
0 2016-06-22T21:43:57-07:00 1.466657e+09 37.867698 -122.259759
1 2016-06-22T20:09:17-07:00 1.466651e+09 37.868355 -122.252893
2 2016-06-22T21:31:42.502000-07:00 1.466656e+09 37.868315 -122.254355
3 2016-06-22T21:31:49-07:00 1.466656e+09 37.868279 -122.253637
4 2016-06-22T21:32:12.061000-07:00 1.466656e+09 37.868558 -122.254532
speed distance
0 0.000000 0.000000
1 -0.106885 607.108403
2 0.025971 128.441850
3 9.725902 63.198908
4 3.666813 84.560384
Ooh! The place has a negative duration. enter is 21:40 but exit is 20:09. WTF!
2016-09-22 14:37:45,812:DEBUG:140735159468032:Saving entry Entry({'data': Rawplace({'enter_fmt_time': u'2016-06-22T21:40:57-07:00',
'ending_trip': ObjectId('57e44f29f6858f741c41e1c8'),
'starting_trip': ObjectId('57e44f29f6858f741c41e1ca'),
'duration': -5500,
'exit_fmt_time': u'2016-06-22T20:09:17-07:00',
'source': 'DwellSegmentationTimeFilter',
'location': {"coordinates": [-122.2597589, 37.8676984], "type": "Point"},
'enter_ts': 1466656857,
'exit_ts': 1466651357}),
'_id': ObjectId('57e44f29f6858f741c41e1c9'),
'user_id': UUID('96993ce8-e550-4535-aa28-836804596d92'),
'key': 'segmentation/raw_place'})}) into timeseries
Aha! Stupid phone!!! Sent us points out of order. We sort by metadata.write_ts, but it looks like the data timestamps were out of order.
2016-09-22 14:33:47,790:DEBUG:140735159468032:------------------------------2016-06-22T21:43:10-07:00------------------------------
2016-09-22 14:33:47,794:DEBUG:140735159468032:------------------------------2016-06-22T20:09:17-07:00------------------------------
2016-09-22 14:33:47,801:DEBUG:140735159468032:------------------------------2016-06-22T21:43:37.678000-07:00------------------------------
Note that the metadata write ts is in order
'metadata_write_ts': 1466656989.717,
'metadata_write_ts': 1466657016.201,
While the timestamp is not
u'ts': 1466656990.0,
u'ts': 1466651357.0,
and the elapsedRealtimeNanos is also in order.
u'elapsedRealtimeNanos': 610100100257967,
u'elapsedRealtimeNanos': 610126618689207,
I wonder if this is caused by a rebooted phone? https://developer.android.com/reference/android/location/Location.html#getTime()
How should we fix this? Should we automatically try to fix the location? Probably safest is to just drop the point. Too hard to figure out how to fix it right now. For the record, the points are
[-122.2598261, 37.8676886]
[-122.25289285182953, 37.86835491657257]
So they don't look very close to each other. It is probably not worth it to try and fix. This is incredibly rare - once in multiple days of data over multiple users. So let's just drop and move on.
But how do I drop? Just ignoring it means that I will need to ignore it all the time, which is a bad, fragile solution. I will just remove it from filtered_location in this rare case of deleting from the timeseries. It is still in location, so no real harm is done.
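Concretely, dropping it might look like this; the query shape is an assumption, and the raw background/location entry stays:

```python
import emission.core.get_database as edb

def drop_out_of_order_point(user_id, bad_ts):
    # Remove only the filtered_location copy of the out-of-order point; the
    # original background/location entry is untouched, so no data is lost.
    return edb.get_timeseries_db().remove({
        "user_id": user_id,
        "metadata.key": "background/filtered_location",
        "data.ts": bad_ts})
```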
This seems to be intermittent - did not reproduce when I tried it locally. Will retry and run again.
Tried two more times - did not recur.
Error is from here:
    # Can move this up to get_filtered_section if we want to ensure that we
    # don't touch filtered_section_data in here
    logging.debug("Setting section start and end values of %s to first point %s"
                  "and last point %s" %
                  (filtered_section_data, filtered_loc_df.iloc[0], filtered_loc_df.iloc[-1]))
    _set_extrapolated_vals_for_section(filtered_section_data,
                                       filtered_loc_df.iloc[0],
                                       filtered_loc_df.iloc[-1])
I guess if there are no filtered points, and this is not the first or last section, we can run into this. In that case, let's insert one entry for the first unfiltered location. We should assume that remove_outliers should return a loc_df with at least one entry.
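A sketch of that invariant (names assumed), wrapping whatever the smoothing step returns:

```python
def ensure_at_least_one_point(loc_df, filtered_loc_df):
    # remove_outliers should hand back at least one row; if the outlier mask
    # rejected everything, fall back to the first unfiltered location so that
    # _set_extrapolated_vals_for_section's .iloc[0]/.iloc[-1] stay valid.
    if len(filtered_loc_df) == 0:
        return loc_df.head(1)
    return filtered_loc_df
```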
This last one took a bit of time because it is not reproducible, so I had to actually craft a test case for it. But I think that it works reasonably well now.
https://github.com/shankari/e-mission-server/commit/ea80978cdef7a4f84e92fccbb51ba5581713571f has introduced a regression - Sunil's trip now looks like this.
In spite of the fact that the points show a clear zigzag.
(Screenshots of the trip and the not-cleaned view omitted.)
All pending fixes merged. Time for another... deployment! We will have to reset everything anyway because of the trip segmentation and smoothing fixes.
$ git pull
remote: Counting objects: 95, done.
remote: Compressing objects: 100% (95/95), done.
remote: Total 95 (delta 45), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (95/95), done.
From https://github.com/amplab/e-mission-server
52a81d3..4808ebf master -> origin/master
Updating 52a81d3..4808ebf
Fast-forward
Delete existing analysis
$ ./e-mission-py.bash bin/reset_analysis_database.py
{u'ok': 1, u'n': 189490}
{u'ok': 1, u'n': 0}
{u'ok': 1, u'n': 0}
{u'ok': 1, u'n': 147}
Relaunch! Should be done by tomorrow morning.
$ ./e-mission-py.bash bin/intake_multiprocess.py 3 2>&1 | tee /mnt/logs/emission/intake.stdinout
Woo! No errors!
ubuntu@ip-10-203-173-119:/mnt/logs/emission$ ls -alh intake*error*
-rw-rw-r-- 1 ubuntu ubuntu 0 Sep 23 09:46 intake_0-errors.log.gz
-rw-rw-r-- 1 ubuntu ubuntu 0 Sep 23 09:46 intake_1-errors.log.gz
-rw-rw-r-- 1 ubuntu ubuntu 0 Sep 23 09:46 intake_2-errors.log.gz
-rw-rw-r-- 1 ubuntu ubuntu 0 Sep 23 09:46 intake_launcher-errors.log.gz
Conflicts in a bunch of files.
Checked out
Copied over
Created habitica config and copied values from keys.
Webserver starts up fine.
Time to restart the pipeline now.