e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License
15 stars 34 forks source link

Track the first running of the pipeline reset to handle the badly formatted android motion_activity #260

Open shankari opened 7 years ago

shankari commented 7 years ago

It's just useful to document the steps for the record

shankari commented 7 years ago

First, determine the format of the new motion activity objects, and change the server code to account for them.

shankari commented 7 years ago

Here's an example

In [515]: edb.get_timeseries_error_db().find_one({"metadata.key": "background/motion_activity"})
Out[515]:
{u'_id': ObjectId('594c9334cb17471ac033ee7e'),
 u'data': {u'zzbjA': 4, u'zzbjB': 75},
 u'metadata': {u'key': u'background/motion_activity',
  u'platform': u'android',
  u'read_ts': 0,
  u'time_zone': u'America/Los_Angeles',
  u'type': u'sensor-data',
  u'write_ts': 1498187291.604},
 u'user_id': UUID('08b31565-f990-4d15-a4a7-89b3ba6b1340')}

The write_ts is

In [120]: arrow.get(1498187291)
Out[120]: <Arrow [2017-06-23T03:08:11+00:00]>

Last release was on Jun 9, 2017. So this is definitely the new format.

Let's do some more sorting...

In [517]: list(edb.get_timeseries_error_db().find({"metadata.key": "background/motion_activity"}).sort("metadata.write_ts", 1).limit(3))
Out[517]:
[{u'_id': ObjectId('5923ce3acb17471ac015a27b'),
  u'data': {u'zzbjA': 3, u'zzbjB': 77},
  u'metadata': {u'key': u'background/motion_activity',
   u'platform': u'android',
   u'read_ts': 0,
   u'time_zone': u'America/Los_Angeles',
   u'type': u'sensor-data',
   u'write_ts': 1495343353.589},
  u'user_id': UUID('fd7b4c2e-2c8b-3bfa-94f0-d1e3ecbd5fb7')},
 {u'_id': ObjectId('5923ce3acb17471ac015a27c'),
  u'data': {u'zzbjA': 5, u'zzbjB': 100},
  u'metadata': {u'key': u'background/motion_activity',
   u'platform': u'android',
   u'read_ts': 0,
   u'time_zone': u'America/Los_Angeles',
   u'type': u'sensor-data',
   u'write_ts': 1495343353.614},
  u'user_id': UUID('fd7b4c2e-2c8b-3bfa-94f0-d1e3ecbd5fb7')},
 {u'_id': ObjectId('5923ce3acb17471ac015a28d'),
  u'data': {u'zzbjA': 3, u'zzbjB': 69},
  u'metadata': {u'key': u'background/motion_activity',
   u'platform': u'android',
   u'read_ts': 0,
   u'time_zone': u'America/Los_Angeles',
   u'type': u'sensor-data',
   u'write_ts': 1495343913.022},
  u'user_id': UUID('fd7b4c2e-2c8b-3bfa-94f0-d1e3ecbd5fb7')}]

And

[{u'_id': ObjectId('595b2bf0cb17471ac03ee644'),
  u'data': {u'zzbjA': 4, u'zzbjB': 35},
  u'metadata': {u'key': u'background/motion_activity',
   u'platform': u'android',
   u'read_ts': 0,
   u'time_zone': u'America/Phoenix',
   u'type': u'sensor-data',
   u'write_ts': 1499145456.502},
  u'user_id': UUID('7c8bfcff-2a95-4da3-b0bb-027de2ad06eb')},
 {u'_id': ObjectId('595b2bf0cb17471ac03ee641'),
  u'data': {u'zzbjA': 3, u'zzbjB': 46},
  u'metadata': {u'key': u'background/motion_activity',
   u'platform': u'android',
   u'read_ts': 0,
   u'time_zone': u'America/Phoenix',
   u'type': u'sensor-data',
   u'write_ts': 1499145426.221},
  u'user_id': UUID('7c8bfcff-2a95-4da3-b0bb-027de2ad06eb')},
 {u'_id': ObjectId('595b2bf0cb17471ac03ee63e'),
  u'data': {u'zzbjA': 3, u'zzbjB': 92},
  u'metadata': {u'key': u'background/motion_activity',
   u'platform': u'android',
   u'read_ts': 0,
   u'time_zone': u'America/Phoenix',
   u'type': u'sensor-data',
   u'write_ts': 1499145396.142},
  u'user_id': UUID('7c8bfcff-2a95-4da3-b0bb-027de2ad06eb')}]

Seems pretty consistent.

shankari commented 7 years ago

Now, update the code on the server.

$ git checkout webapp/www/templates/home.html
$ git pull
Updating 3a97d04..b810170
Fast-forward
 bin/reset_pipeline.py                                   | 110 ++++-----
 emission/analysis/classification/inference/mode.py      |   3 +-
 .../net/usercache/formatters/android/motion_activity.py |   4 +
 emission/pipeline/reset.py                              | 208 +++++++++++++++++
 emission/storage/decorations/place_queries.py           |  66 ++++++
 .../analysisTests/intakeTests/TestPipelineRealData.py   |   2 +-
 emission/tests/pipelineTests/TestPipelineReset.py       | 284 ++++++++++++++++++++++++
 webapp/www/templates/home.html                          |   4 +-
 8 files changed, 627 insertions(+), 54 deletions(-)
 create mode 100644 emission/pipeline/reset.py
 create mode 100644 emission/tests/pipelineTests/TestPipelineReset.py
shankari commented 7 years ago

Next, run the script that copies the error elements to the usercache and then to long-term.

$ ./e-mission-py.bash bin/debug/fix_usercache_processing.py 2>&1 | tee /tmp/fix_usercache.log

Errors all appear to be due to the None transition

$ grep error /tmp/fix_usercache.log
...
DEBUG:root:Inserting entry {u'data': {u'transition': None, u'currState': u'STATE_START'}, u'_id': ObjectId('57ae4b90f311e474bfdaffdf'), u'user_id': UUID('fbd32183-889c-4ba1-92a1-4c2ef2af6d2b'), u'metadata': {u'plugin': u'none', u'write_ts': 1471012495.27217, u'time_zone': u'America/New_York', u'platform': u'ios', u'key': u'statemachine/transition', u'read_ts': 1499199721.016257, u'type': u'message'}} into error timeseries

$ grep error /tmp/fix_usercache.log | egrep -v "'transition': None" | grep -v insert_error
INFO:root:Found 64898 errors in this round
DEBUG:root:module_name = emission.net.usercache.formatters.android.client_error
DEBUG:root:module_name = emission.net.usercache.formatters.android.client_error
DEBUG:root:module_name = emission.net.usercache.formatters.android.client_error
DEBUG:root:module_name = emission.net.usercache.formatters.android.client_error
DEBUG:root:module_name = emission.net.usercache.formatters.android.client_error
DEBUG:root:module_name = emission.net.usercache.formatters.android.client_error
DEBUG:root:module_name = emission.net.usercache.formatters.android.client_error
DEBUG:root:module_name = emission.net.usercache.formatters.android.client_error
DEBUG:root:module_name = emission.net.usercache.formatters.android.client_error
DEBUG:root:module_name = emission.net.usercache.formatters.android.client_error
shankari commented 7 years ago

Now, let's reset all the android pipelines to Jun 9th. Actually, I don't want to reset entries that are before the 9th to the 9th. Fix that with https://github.com/e-mission/e-mission-server/pull/499 Then pull again.

shankari commented 7 years ago

Now, reset to Jun 7th, to make it a bit safer. There were some issues with the launch script, fix those. https://github.com/e-mission/e-mission-server/pull/500

shankari commented 7 years ago

Now, reset actually reset to Jun 7th. Let's do a dry-run before doing something irreversible.

$ ./e-mission-py.bash bin/reset_pipeline.py -p android -d 2017-06-07 -n
Namespace(all=False, date='2017-06-07', dry_run=True, email_list=None, platform='android', user_list=None)
DEBUG:root:day_dt is 2017-06-07T00:00:00+00:00
DEBUG:root:day_ts is 1496793600

Waited for a long time. Looks like distinct is slow. Added some more logging, but probably just have to wait.... https://github.com/e-mission/e-mission-server/pull/501

shankari commented 7 years ago

Failed because the person uninstalled before the reset date, so we couldn't find any places or trips straddling it.

 {u'_id': ObjectId('58461a0888f6637521fe5a67'),
 u'data': {u'enter_fmt_time': u'2016-12-05T12:43:30.403000-08:00',
 u'exit_fmt_time': u'2016-12-05T17:38:08.072380-08:00'}},
 {u'_id': ObjectId('58461a0888f6637521fe5a66'),
 u'data': {u'enter_fmt_time': u'2016-12-05T10:54:36.502000-08:00',
 u'exit_fmt_time': u'2016-12-05T12:30:15.657499-08:00'}},
 {u'_id': ObjectId('58461a0888f6637521fe5a65'),
 u'data': {u'enter_fmt_time': u'2016-12-05T10:44:57.670000-08:00',
 u'exit_fmt_time': u'2016-12-05T10:44:57.670000-08:00'}},
 {u'_id': ObjectId('58424d6388f66329354efd2c'),
 u'data': {u'enter_fmt_time': u'2016-12-02T17:08:36.578000-08:00',
 u'exit_fmt_time': u'2016-12-02T17:08:36.578000-08:00'}}]
DEBUG:root:last place doc = None
Traceback (most recent call last):
  File "bin/reset_pipeline.py", line 83, in <module>
    epr.reset_user_to_ts(user_id, day_ts, args.dry_run)
  File "/mnt/e-mission/e-mission-server/emission/pipeline/reset.py", line 42, in reset_user_to_ts
    last_cleaned_place = esdp.get_last_place_before(esda.CLEANED_PLACE_KEY, ts, user_id)
  File "/mnt/e-mission/e-mission-server/emission/storage/decorations/place_queries.py", line 75, in get_last_place_before
    raise ValueError("No trip or place straddling time %s" % reset_ts)
ValueError: No trip or place straddling time 1496793600
shankari commented 7 years ago

Failed in spite of the fix, adding additional logging and redirecting output to a file.

DEBUG:root:last place doc = None
DEBUG:root:last place doc = {u'user_id': UUID('5a6a2711-c574-42f0-9940-ea1fd0cc2f09'),
 u'_id': ObjectId('5951256188f66373218bcd06'),
 u'data': {u'enter_fmt_time': u'2017-06-22T15:35:03.034000-07:00',
 u'display_name': u', ',
 u'ending_trip': ObjectId('5951256088f66373218bccf0'),
 u'source': u'DwellSegmentationTimeFilter',
 u'location': {u'type': u'Point', u'coordinates': [-121.3834927, 40.4237075]},
 u'enter_ts': 1498170903.034,
 u'raw_places': [ObjectId('5951254a88f66373218bca86'),
 ObjectId('5951254a88f66373218bca86'),
 ObjectId('5951254a88f66373218bca8c'),
 ObjectId('5951254a88f66373218bca86'),
 ObjectId('5951254a88f66373218bca8c'),
 ObjectId('5951254a88f66373218bca86'),
 ObjectId('5951254a88f66373218bca8c'),
 ObjectId('5951254a88f66373218bca86'),
 ObjectId('5951254a88f66373218bca8c'),
 ObjectId('5951254a88f66373218bca86'),
 ObjectId('5951254a88f66373218bca8c'),
 ObjectId('5951254a88f66373218bca86'),
 ObjectId('5951254a88f66373218bca8c'),
 ObjectId('5951254a88f66373218bca86'),
 ObjectId('5951254a88f66373218bca8c'),
 ObjectId('5951254a88f66373218bca86'),
 ObjectId('5951254a88f66373218bca8c'),
 ObjectId('5951254a88f66373218bca86'),
 ObjectId('5951254a88f66373218bca8c')]},
 u'metadata': {u'write_fmt_time': u'2017-06-26T08:16:49.685577-07:00',
 u'write_ts': 1498490209.685577,
 u'time_zone': u'America/Los_Angeles',
 u'platform': u'server',
 u'key': u'analysis/cleaned_place'}}
Traceback (most recent call last):
  File "bin/reset_pipeline.py", line 83, in <module>
    epr.reset_user_to_ts(user_id, day_ts, args.dry_run)
  File "/mnt/e-mission/e-mission-server/emission/pipeline/reset.py", line 42, in reset_user_to_ts
    last_cleaned_place = esdp.get_last_place_before(esda.CLEANED_PLACE_KEY, ts, user_id)
  File "/mnt/e-mission/e-mission-server/emission/storage/decorations/place_queries.py", line 82, in get_last_place_before
    raise ValueError("No trip or place straddling time %s" % reset_ts)
ValueError: No trip or place straddling time 1496793600
shankari commented 7 years ago

With the additional logging, ran into issues when user_id = None Fixed by adding a special check for user_id is None and doing an early return

shankari commented 7 years ago

Then ran into an issue where we have entries for a user id, but no analysis results, and no entry in the uuid_db.

INFO:root:resetting user 95725e34-2e58-3ebd-9f1a-41f9e0131079 to ts 1496793600
DEBUG:root:Looking for last place before 1496793600
DEBUG:root:all places for this user = []
DEBUG:root:last place doc for user 95725e34-2e58-3ebd-9f1a-41f9e0131079 = None
DEBUG:root:last trip doc for user 95725e34-2e58-3ebd-9f1a-41f9e0131079 = None
DEBUG:root:last place doc = None
Namespace(all=False, date='2017-06-07', dry_run=True, email_list=None, platform='android', user_list=None)
Traceback (most recent call last):
  File "bin/reset_pipeline.py", line 83, in <module>
    epr.reset_user_to_ts(user_id, day_ts, args.dry_run)
  File "/mnt/e-mission/e-mission-server/emission/pipeline/reset.py", line 46, in reset_user_to_ts
    last_cleaned_place = esdp.get_last_place_before(esda.CLEANED_PLACE_KEY, ts, user_id)
  File "/mnt/e-mission/e-mission-server/emission/storage/decorations/place_queries.py", line 79, in get_last_place_before
    (last_place.data.enter_ts, reset_ts))
AttributeError: 'NoneType' object has no attribute 'data'
In [529]: edb.get_uuid_db().find_one({"uuid":  UUID('95725e34-2e58-3ebd-9f1a-41f9e0131079')})
In [530]: edb.get_timeseries_db().find({"user_id": UUID("95725e34-2e58-3ebd-9f1a-41f9e0131079")}).count()
Out[530]: 4319
In [531]: edb.get_analysis_timeseries_db().find({"user_id": UUID("95725e34-2e58-3ebd-9f1a-41f9e0131079")}).count()
Out[531]: 0
In [532]: edb.get_timeseries_db().find({"user_id": UUID("95725e34-2e58-3ebd-9f1a-41f9e0131079")}).distinct("metadata.key")
Out[532]:
[u'background/battery',
 u'statemachine/transition',
 u'background/location',
 u'background/motion_activity',
 u'background/filtered_location']
shankari commented 7 years ago

Ran into the straddling issue again

Namespace(all=False, date='2017-06-07', dry_run=True, email_list=None, platform='android', user_list=None)
Traceback (most recent call last):
  File "bin/reset_pipeline.py", line 83, in <module>
    epr.reset_user_to_ts(user_id, day_ts, args.dry_run)
  File "/mnt/e-mission/e-mission-server/emission/pipeline/reset.py", line 46, in reset_user_to_ts
    last_cleaned_place = esdp.get_last_place_before(esda.CLEANED_PLACE_KEY, ts, user_id)
  File "/mnt/e-mission/e-mission-server/emission/storage/decorations/place_queries.py", line 86, in get_last_place_before
    (reset_ts, user_id))
ValueError: No trip or place straddling time 1496793600 for user 5a6a2711-c574-42f0-9940-ea1fd0cc2f09

And it has a valid UUID entry

In [547]: edb.get_uuid_db().find_one({"uuid": UUID("5a6a2711-c574-42f0-9940-ea1fd0cc2f09")})
Out[547]:
{u'_id': ObjectId('5939a0a5cb17471ac0273514'),
 u'update_ts': datetime.datetime(2017, 6, 8, 19, 8, 21, 584000),
 u'user_email': <redacted>,
 u'uuid': UUID('5a6a2711-c574-42f0-9940-ea1fd0cc2f09')}

Analysis entries

In [540]: edb.get_analysis_timeseries_db().find({"user_id": UUID("5a6a2711-c574-42f0-9940-ea1fd0cc2f09")}).count()
Out[540]: 2083

Including a bunch of cleaned places

In [541]: edb.get_analysis_timeseries_db().find({"user_id": UUID("5a6a2711-c574-42f0-9940-ea1fd0cc2f09")}).distinct("metadata.key")
Out[541]:
[u'segmentation/raw_trip',
 u'segmentation/raw_place',
 u'segmentation/raw_section',
 u'segmentation/raw_stop',
 u'analysis/smoothing',
 u'analysis/recreated_location',
 u'analysis/cleaned_section',
 u'analysis/cleaned_stop',
 u'analysis/cleaned_place',
 u'analysis/cleaned_trip',
 u'segmentation/raw_untracked',
 u'analysis/cleaned_untracked']

Hm. The first one is around the time of the reset. Let's convert our reset time to the correct timezone and see what we find.

In [545]: list(edb.get_analysis_timeseries_db().find({"user_id": UUID("5a6a2711-c574-42f0-9940-ea1fd0cc2f09"), "metadata.key": "analysis/cleaned_place"}, {"data.enter_fmt_time": 1, "data.exit_fmt_time": 1}).sort("data.enter_ts", 1))
Out[545]:
[{u'_id': ObjectId('593a322788f66336cce330cf'),
  u'data': {u'exit_fmt_time': u'2017-06-08T18:07:32.986396-07:00'}},
 {u'_id': ObjectId('593a322788f66336cce330d0'),
  u'data': {u'enter_fmt_time': u'2017-06-08T18:27:27-07:00',
   u'exit_fmt_time': u'2017-06-08T19:51:47.571543-07:00'}},
 {u'_id': ObjectId('593a322888f66336cce330d2'),
  u'data': {u'enter_fmt_time': u'2017-06-08T20:03:47-07:00',
   u'exit_fmt_time': u'2017-06-08T20:05:52-07:00'}},
 {u'_id': ObjectId('593c4e7b88f66321358e9477'),
  u'data': {u'enter_fmt_time': u'2017-06-08T20:10:42-07:00',
   u'exit_fmt_time': u'2017-06-08T20:10:42-07:00'}},
...

The first full place entry is 2017-06-08T18:27:27-07:00 - 2017-06-08T19:51:47.571543-07:00. We are resetting to 2017-06-07 UTC = 2017-06-06T17:00:00-07:00 Since all places are after the reset_ts, none of them will straddle it, and the last place will also be after the reset_ts.

DEBUG:root:last place doc for user 5a6a2711-c574-42f0-9940-ea1fd0cc2f09 = None
DEBUG:root:last trip doc for user 5a6a2711-c574-42f0-9940-ea1fd0cc2f09 = None
DEBUG:root:last_place = Entry({u'data': 
{u'enter_fmt_time': u'2017-06-22T15:35:03.034000-07:00',
 u'display_name': u',
 ',
 u'enter_ts': 1498170903.034,
 u'_id': ObjectId('5951256188f66373218bcd06'),
 u'user_id': UUID('5a6a2711-c574-42f0-9940-ea1fd0cc2f09'),
 u'metadata': {u'write_fmt_time': u'2017-06-26T08:16:49.685577-07:00',
 u'key': u'analysis/cleaned_place'}}),
 reset_ts = 1496793600

In this case, we want to fall back to resetting to the start.

shankari commented 7 years ago

Not sure that this is actually resolved!

shankari commented 7 years ago

Made a minor change for performance. Looking at all the entries and getting the distinct user_ids from them takes a long time. For all recent entries, we should have curr_platform set. Since the reset_ts is fairly recent, I can make this change.

 def _find_platform_users(platform):
-   return edb.get_timeseries_db().find({'metadata.platform': platform}).distinct(
-       'user_id')
+#   return edb.get_timeseries_db().find({'metadata.platform': platform}).distinct(
+#       'user_id')
+   return edb.get_profile_db().find({"curr_platform": platform}).distinct("user_id")

With this, the initial determination of the uuids is fast, and the dry-run completed without any failures. Woo!

shankari commented 7 years ago

Looking through the dry-run logs, I note several uuids with no place entries. Who are all these people, why are they signing up for e-mission, but not actually using it in any way?

INFO:root:resetting user 662d93fa-e150-43ff-b11d-e1d2facaec1b to ts 1496793600
INFO:root:resetting user d98ee40f-9c39-4388-ae66-757251d14eba to ts 1496793600
INFO:root:resetting user 3f1960fd-bfaa-4032-9562-19f5ca244f3f to ts 1496793600
INFO:root:resetting user d6717ba4-352d-4901-a503-f73c51811091 to ts 1496793600
INFO:root:resetting user cd710286-ff58-4a7d-93bd-31e305d14c73 to ts 1496793600
INFO:root:resetting user 71991ef1-fc9d-4fbe-83fb-1ed0434f324a to ts 1496793600
INFO:root:resetting user 8bd673ab-318b-4cfa-add8-fc71390e3500 to ts 1496793600
INFO:root:resetting user b566100c-b519-4267-be98-8f4bb045261c to ts 1496793600
INFO:root:resetting user f928898f-0450-40c1-b98c-ed6901135b55 to ts 1496793600
INFO:root:resetting user 176f0a89-d927-42ae-9a14-8079818eb45d to ts 1496793600
INFO:root:resetting user 4d663395-ddc6-4ca7-8d69-aa039d7d886c to ts 1496793600
INFO:root:resetting user 9e75449b-d105-40d5-9b12-9cff475252f1 to ts 1496793600
INFO:root:resetting user 6d265af9-6185-43f0-80d1-0d5ce2173cda to ts 1496793600
INFO:root:resetting user 4a66843c-985e-4a9e-a690-87830bcb007c to ts 1496793600

There doesn't appear to be a big rush of installs, but they are trickling in, a few a month.

[datetime.datetime(2017, 3, 21, 10, 51, 9, 971000),
 datetime.datetime(2017, 4, 6, 5, 34, 28, 165000),
 datetime.datetime(2017, 4, 12, 22, 7, 24, 955000),
 datetime.datetime(2017, 4, 14, 20, 54, 39, 612000),
 datetime.datetime(2017, 4, 18, 5, 20, 20, 262000),
 datetime.datetime(2017, 4, 18, 5, 30, 30, 44000),
 datetime.datetime(2017, 4, 24, 6, 30, 27, 691000),
 datetime.datetime(2017, 5, 1, 11, 3, 6, 781000),
 datetime.datetime(2017, 5, 1, 22, 34, 25, 997000),
 datetime.datetime(2017, 5, 13, 2, 19, 28, 333000),
 datetime.datetime(2017, 5, 20, 13, 2, 19, 154000),
 datetime.datetime(2017, 5, 22, 23, 48, 14, 516000),
 datetime.datetime(2017, 6, 6, 2, 35, 40, 555000),
 datetime.datetime(2017, 6, 18, 5, 7, 11, 212000)]
shankari commented 7 years ago

I should probably follow up with these people and see if they are legit and the app didn't work for them or whether they are not legit.

shankari commented 7 years ago

Ok, running without dry-run

$ ./e-mission-py.bash bin/reset_pipeline.py -p android -d 2017-06-07 2>&1 | tee /tmp/reset_pipeline_real.log
....
INFO:root:About to delete 7 pipeline states
INFO:root:this is not a dry run, result of removing analysis objects = {u'ok': 1, u'n': 4048}
INFO:root:this is not a dry run, result of removing pipeline states = {u'ok': 1, u'n': 7}
Namespace(all=False, date='2017-06-07', dry_run=False, email_list=None, platform='android', user_list=None)

Finished with no errors.

shankari commented 7 years ago

Next, I need to reset the pipeline for myself to December 10th to apply the fix to https://github.com/e-mission/e-mission-server/issues/457

First run with dry_run

$ ./e-mission-py.bash bin/reset_pipeline.py -u <redacted> -d 2016-12-10 -n
DEBUG:root:trip_seg_update_pipeline_query = {'$set': {'last_processed_ts': 1481313996.457}}
INFO:root:resetting 1 trip_seg_pipeline states for user <redacted> to 1481313996.46
DEBUG:root:reset_pipeline_query = {'last_processed_ts': {'$gt': 1481313696.457}, 'pipeline_stage': {'$ne': 1}, 'user_id': UUID('<redacted>')}
DEBUG:root:update_pipeline_query = {'$set': {'last_processed_ts': 1481313696.457}}
INFO:root:out of 7 total, resetting 4 pipeline states for user <redacted>
INFO:root:this is a dry run, returning from reset_pipeline_state without modifying anything

Next run without dry_run

$ ./e-mission-py.bash bin/reset_pipeline.py -u <redacted> -d 2016-12-10
INFO:root:out of 7 total, resetting 4 pipeline states for user ea59084e-11d4-4076-9252-3b9a29ce35e0 to 1481313696.46
DEBUG:root:this is not a dry run, result of updating trip_segmentation stage in reset_pipeline_state = {'updatedExisting': True, u'nModified': 1, u'ok': 1, u'n': 1}
DEBUG:root:this is not a dry run, result of updating all other stages in reset_pipeline_state = {'updatedExisting': True, u'nModified': 4, u'ok': 1, u'n': 4}
shankari commented 7 years ago

Now, restart the pipeline manually

$ PYTHONPATH=. /home/ubuntu/anaconda/bin/python bin/intake_multiprocess.py 3 >> /mnt/logs/emission/intake.stdinout 2>&1
shankari commented 7 years ago

No errors so far, will check log and phones tomorrow morning. Maybe this is finally done!

shankari commented 7 years ago

This is not too serious because it is only in the stage to generate outputs, but I get the following error for several

Traceback (most recent call last):
  File "/mnt/e-mission/e-mission-server/emission/net/usercache/builtin_usercache_handler.py", line 101, in storeViewsToCache
    self.storeTimelineToCache(time_query)
  File "/mnt/e-mission/e-mission-server/emission/net/usercache/builtin_usercache_handler.py", line 168, in storeTimelineToCache
    (start_ts, pydt.datetime.utcfromtimestamp(start_ts).isoformat()))
TypeError: a float is required

The relevant code is

        start_ts = esp.get_complete_ts(self.user_id)
        logging.debug("start ts from pipeline = %s, %s" %
           (start_ts, pydt.datetime.utcfromtimestamp(start_ts).isoformat()))
        trip_gj_list = self.get_trip_list_for_seven_days(start_ts)

from which we can infer that the start_ts is None.

We get the complete_ts from

def get_complete_ts(user_id):
    return get_current_state(user_id, ps.PipelineStages.CLEAN_RESAMPLING).last_processed_ts

So this implies that the CLEAN_RESAMPLING state has a last_processed_ts = None.

Let us see if this is true. Yup, it is!

In [577]: list(edb.get_pipeline_state_db().find({"user_id": UUID("7c6d50fd-efbc-4438-9f36-f73411ac3a3c")}))
Out[577]:
[{u'_id': ObjectId('589a848d88f66351eb985cd7'),
  u'curr_run_ts': None,
  u'last_processed_ts': None,
  u'last_ts_run': 1499268837.42303,
  u'pipeline_stage': 6, (ACCURACY_FILTERING)
  u'user_id': UUID('7c6d50fd-efbc-4438-9f36-f73411ac3a3c')}, 
 {u'_id': ObjectId('589a848d88f66351eb985cd9'),
  u'curr_run_ts': None,
  u'last_processed_ts': None,
  u'last_ts_run': 1499268837.450148,
  u'pipeline_stage': 1, (TRIP_SEGMENTATION)
  u'user_id': UUID('7c6d50fd-efbc-4438-9f36-f73411ac3a3c')},
 {u'_id': ObjectId('589a848d88f66351eb985cdb'),
  u'curr_run_ts': None,
  u'last_processed_ts': None,
  u'last_ts_run': 1499268837.648338,
  u'pipeline_stage': 2, (SECTION_SEGMENTATION)
  u'user_id': UUID('7c6d50fd-efbc-4438-9f36-f73411ac3a3c')},
 {u'_id': ObjectId('589a848e88f66351eb985cdd'),
  u'curr_run_ts': None,
  u'last_processed_ts': None,
  u'last_ts_run': 1499268837.697028,
  u'pipeline_stage': 3, (JUMP_SMOOTHING)
  u'user_id': UUID('7c6d50fd-efbc-4438-9f36-f73411ac3a3c')},
 {u'_id': ObjectId('589a848e88f66351eb985cdf'),
  u'curr_run_ts': None,
  u'last_processed_ts': None,
  u'last_ts_run': 1499268837.727734,
  u'pipeline_stage': 11, (CLEAN_RESAMPLING)
  u'user_id': UUID('7c6d50fd-efbc-4438-9f36-f73411ac3a3c')},
 {u'_id': ObjectId('589a848e88f66351eb985ce2'),
  u'curr_run_ts': None,
  u'last_processed_ts': None,
  u'last_ts_run': 1489931695.867638,
  u'pipeline_stage': 9, (OUTPUT_GEN)
  u'user_id': UUID('7c6d50fd-efbc-4438-9f36-f73411ac3a3c')}]

So basically, none of the stages have run successfully.

Let's see what data we have. Answer: None. All entries are stats, so we skip this user.

In [578]: edb.get_timeseries_db().find({"user_id": UUID('7c6d50fd-efbc-4438-9f36-f73411ac3a3c')}).count()
Out[578]: 1524

In [579]: edb.get_timeseries_db().find({"user_id": UUID('7c6d50fd-efbc-4438-9f36-f73411ac3a3c')}).distinct("metadata.key")
Out[579]: [u'stats/server_api_time', u'stats/pipeline_time']

And there's nothing in the usercache either. Basically, we have no data for this user.

In [580]: edb.get_usercache_db().find({"user_id": UUID('7c6d50fd-efbc-4438-9f36-f73411ac3a3c')}).count()
Out[580]: 1

In [581]: edb.get_usercache_db().find({"user_id": UUID('7c6d50fd-efbc-4438-9f36-f73411ac3a3c')}).distinct("metadata.key")
Out[581]: [u'common-trips']

In this case, there's no need to generate output either. Let's just check the start_ts and skip instead of generating an error.

shankari commented 7 years ago

Got another error. Before this, we have encountered this case when a user logged into two phones with the same ID.

[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] != [0.0, 0.195279647629855, 5.091991301009006, 6.376307295933423, 1.8666625417892948, 9.356544761901484, 0.8267522773714795, 0.8267531290795987, 4.74761677183518, 12.414261490574326]
Cleaning and resampling failed for user 68d16b1a-1521-4eaf-922a-20ba037cdc79
Traceback (most recent call last):
  File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 72, in clean_and_resample
    last_raw_place = save_cleaned_segments_for_ts(user_id, time_query.startTs, time_query.endTs)
  File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 98, in save_cleaned_segments_for_ts
    return save_cleaned_segments_for_timeline(user_id, tl)
  File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 124, in save_cleaned_segments_for_timeline
    (last_cleaned_place, filtered_tl) = create_and_link_timeline(tl, user_id, trip_map)
  File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 886, in create_and_link_timeline
    link_trip_start(ts, curr_cleaned_trip, curr_cleaned_start_place, raw_start_place)
  File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 940, in link_trip_start
    ts, cleaned_trip_data, cleaned_start_place_data)
  File "/mnt/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 1063, in _fix_squished_place_mismatch
    assert False
AssertionError
shankari commented 7 years ago

No further issues. Once I investigate the AssertionError, this issue can be closed

shankari commented 7 years ago

Also, note that this didn't really give new habitica points. In particular, the autocheck task would be marked as completed upto whatever date. We would need to reset that as well. Should I add that to the pipeline reset code? Or have a separate script?

Since the habitica points are not idempotent, we don't want to always reset the pipeline - only when we haven't been giving points all along. So let's make it a separate script for now.

shankari commented 7 years ago

Also, I checked the test phone, and Manonmani's and Tom's phones, and none of them seem to have data from the past week properly recorded. But there were no errors!

Debugging Tom's phone now...

shankari commented 7 years ago

Here's the run.

Couple of things to note:

And there is no data after 16th, so there is no processing. Why is there no data after the 16th? Was tracking turned off? Need phone logs.

2017-07-04 05:04:27,458:INFO:140503730042688:**********UUID 96af3842-d5fb-4f13-aea0-726efaeba6ea: moving to long term**********
2017-07-04 05:04:27,482:DEBUG:140503730042688:result = {'updatedExisting': True, u'nModified': 1, u'ok': 1, u'n': 1} after updating read timestamp
2017-07-04 05:04:27,699:DEBUG:140503730042688:Found 234 messages in response to query {'$or': [{'metadata.type': 'message'}, {'metadata.type': 'sensor-data'}, {'metadata.type': 'rw-document'}], 'user_id': UUID('96af3842-d5fb-4f13-aea0-726efaeba6ea')}
2017-07-04 05:04:27,702:INFO:140503730042688:For stage PipelineStages.USERCACHE, start_ts = 2017-06-16T20:18:24.227000
2017-07-04 05:04:27,705:DEBUG:140503730042688:result = {'updatedExisting': False, u'nModified': 0, u'ok': 1, u'n': 0} after updating read timestamp
2017-07-04 05:04:27,732:DEBUG:140503730042688:Found 0 messages in response to query {'$or': [{'metadata.type': 'message'}, {'metadata.type': 'sensor-data'}, {'metadata.type': 'rw-document'}], 'metadata.write_ts': {'$lte': 1499144662.702402, '$gte': 1497644304.227}, 'user_id': UUID('96af3842-d5fb-4f13-aea0-726efaeba6ea')}
2017-07-04 05:04:27,733:INFO:140503730042688:In moveToLongTerm, no entries to save
2017-07-04 05:04:27,733:DEBUG:140503730042688:Deleting all entries for query <emission.storage.timeseries.timequery.TimeQuery object at 0x7fc973fe2950>
2017-07-04 05:04:27,733:DEBUG:140503730042688:About to delete messages matching query {'$or': [{'metadata.type': 'message'}, {'metadata.type': 'sensor-data'}, {'metadata.type': 'rw-document'}], 'metadata.write_ts': {'$lte': 1499144662.702402, '$gte': 1497644304.227}, 'user_id': UUID('96af3842-d5fb-4f13-aea0-726efaeba6ea')}
2017-07-04 05:04:27,735:DEBUG:140503730042688:Delete result = {u'ok': 1, u'n': 0}
2017-07-04 05:04:27,737:INFO:140503730042688:For stage PipelineStages.USERCACHE, last_ts_processed is unchanged
2017-07-04 05:04:27,741:DEBUG:140503730042688:insert called
2017-07-04 05:04:27,741:DEBUG:140503730042688:entry was fine, no need to fix it
2017-07-04 05:04:27,742:DEBUG:140503730042688:Inserting entry Entry({'data': {'reading': 0.2814149856567383, 'name': 'USERCACHE', 'ts': 1499144667.73971}, '_id': ObjectId('595b21db88f66372747f37aa'), 'user_id': UUID('96af3842-d5fb-4f13-aea0-726efaeba6ea'), 'metadata': Metadata({'write_fmt_time': '2017-07-03T22:04:27.739878-07:00', 'write_ts': 1499144667.739878, 'time_zone': 'America/Los_Angeles', 'platform': 'server', 'write_local_dt': LocalDate({'hour': 22, 'month': 7, 'second': 27, 'weekday': 0, 'year': 2017, 'timezone': 'America/Los_Angeles', 'day': 3, 'minute': 4}), 'key': 'stats/pipeline_time'})}) into timeseries
2017-07-04 05:05:10,583:INFO:140503730042688:**********UUID 96af3842-d5fb-4f13-aea0-726efaeba6ea: filter accuracy if needed**********
2017-07-04 05:05:10,586:INFO:140503730042688:For stage PipelineStages.ACCURACY_FILTERING, start_ts is None
2017-07-04 05:05:10,589:DEBUG:140503730042688:Not a public phone, must already have filtered data, early return
2017-07-04 05:05:10,591:INFO:140503730042688:For stage PipelineStages.ACCURACY_FILTERING, last_ts_processed is unchanged
2017-07-04 05:05:10,594:DEBUG:140503730042688:insert called
2017-07-04 05:05:10,594:DEBUG:140503730042688:entry was fine, no need to fix it
2017-07-04 05:05:10,595:DEBUG:140503730042688:Inserting entry Entry({'data': {'reading': 0.009174108505249023, 'name': 'ACCURACY_FILTERING', 'ts': 1499144710.592516}, '_id': ObjectId('595b220688f66372747f37ab'), 'user_id': UUID('96af3842-d5fb-4f13-aea0-726efaeba6ea'), 'metadata': Metadata({'write_fmt_time': '2017-07-03T22:05:10.592669-07:00', 'write_ts': 1499144710.592669, 'time_zone': 'America/Los_Angeles', 'platform': 'server', 'write_local_dt': LocalDate({'hour': 22, 'month': 7, 'second': 10, 'weekday': 0, 'year': 2017, 'timezone': 'America/Los_Angeles', 'day': 3, 'minute': 5}), 'key': 'stats/pipeline_time'})}) into timeseries
2017-07-04 05:05:10,597:INFO:140503730042688:**********UUID 96af3842-d5fb-4f13-aea0-726efaeba6ea: segmenting into trips**********
2017-07-04 05:05:10,599:INFO:140503730042688:For stage PipelineStages.TRIP_SEGMENTATION, start_ts = 2017-06-16T20:18:23.917000
2017-07-04 05:05:10,601:DEBUG:140503730042688:curr_query = {'$or': [{'metadata.key': 'background/filtered_location'}], 'metadata.write_ts': {'$lte': 1499144705.599898, '$gte': 1497644303.917}, 'user_id': UUID('96af3842-d5fb-4f13-aea0-726efaeba6ea')}, sort_key = metadata.write_ts
2017-07-04 05:05:10,601:DEBUG:140503730042688:orig_ts_db_keys = ['background/filtered_location'], analysis_ts_db_keys = []
2017-07-04 05:05:10,601:DEBUG:140503730042688:finished querying values for ['background/filtered_location']
2017-07-04 05:05:10,602:DEBUG:140503730042688:finished querying values for []
2017-07-04 05:05:14,378:DEBUG:140503730042688:Found 0 results
2017-07-04 05:05:14,378:DEBUG:140503730042688:len(loc_df) == 0, early return
2017-07-04 05:05:14,381:INFO:140503730042688:For stage PipelineStages.TRIP_SEGMENTATION, last_ts_processed is unchanged
2017-07-04 05:05:14,384:DEBUG:140503730042688:insert called
2017-07-04 05:05:14,384:DEBUG:140503730042688:entry was fine, no need to fix it
2017-07-04 05:05:14,385:DEBUG:140503730042688:Inserting entry Entry({'data': {'reading': 3.785299062728882, 'name': 'TRIP_SEGMENTATION', 'ts': 1499144714.3827}, '_id': ObjectId('595b220a88f66372747f37ac'), 'user_id': UUID('96af3842-d5fb-4f13-aea0-726efaeba6ea'), 'metadata': Metadata({'write_fmt_time': '2017-07-03T22:05:14.382881-07:00', 'write_ts': 1499144714.382881, 'time_zone': 'America/Los_Angeles', 'platform': 'server', 'write_local_dt': LocalDate({'hour': 22, 'month': 7, 'second': 14, 'weekday': 0, 'year': 2017, 'timezone': 'America/Los_Angeles', 'day': 3, 'minute': 5}), 'key': 'stats/pipeline_time'})}) into timeseries

There are other files that have this UUID - let's see if they have the answers.

shankari commented 7 years ago

Did moveToLongTerm from the fix_usercache_processing.py break something?

shankari commented 7 years ago

There are other files that have this UUID - let's see if they have the answers.

Other files are from older runs. The run tracked by this reset is the one above.

shankari commented 7 years ago

Let's look at the reset code. It should have been to 6th June.

INFO:root:resetting user 96af3842-d5fb-4f13-aea0-726efaeba6ea to ts 1496793600
DEBUG:root:Looking for last place before 1496793600
DEBUG:root:last place doc for user 96af3842-d5fb-4f13-aea0-726efaeba6ea = {u'user_id': UUID('96af3842-d5fb-4f13-aea0-726efaeba6ea'),
 u'_id': ObjectId('5937b1be88f6633693cb7464'),
 u'data': {u'enter_fmt_time': u'2017-06-06T12:57:56.402000-07:00',
 u'display_name': u'South Shoreline Boulevard, Mountain View',
 u'exit_fmt_time': u'2017-06-06T18:27:21.074421-07:00',
 u'ending_trip': ObjectId('5937b19388f6633693cb7394'),
 u'starting_trip': ObjectId('5937b19488f6633693cb73a4'),
 u'enter_ts': 1496779076.402,
 u'duration': 19764.672421216965,
 u'raw_places': [ObjectId('5937b18d88f6633693cb733c'),
 ObjectId('5937b18d88f6633693cb733c')],
 u'exit_ts': 1496798841.0744212},
 u'metadata': {u'write_fmt_time': u'2017-06-07T00:56:46.733161-07:00',
 u'key': u'analysis/cleaned_place'}}
DEBUG:root:last trip doc for user 96af3842-d5fb-4f13-aea0-726efaeba6ea = None
DEBUG:root:first_raw_place = Entry({u'data': {u'enter_fmt_time': u'2017-06-06T12:57:56.402000-07:00',
 u'exit_fmt_time': u'2017-06-06T18:38:36.388000-07:00',
 u'ending_trip': ObjectId('5937b18d88f6633693cb733b'),
 u'starting_trip': ObjectId('5937b18d88f6633693cb733d'),
 u'enter_ts': 1496779076.402,
 u'duration': 20439.986000061035,
 u'exit_ts': 1496799516.388},
 u'_id': ObjectId('5937b18d88f6633693cb733c'),
 u'user_id': UUID('96af3842-d5fb-4f13-aea0-726efaeba6ea'),
 u'metadata': {u'write_fmt_time': u'2017-06-07T00:55:57.613423-07:00',
 u'key': u'segmentation/raw_place'}})
DEBUG:root:last_place_enter_ts = 1496779076.4 (2017-06-06T19:57:56.400000+00:00)
DEBUG:root:reset_ts = 1496779076.4 (2017-06-06T19:57:56.400000+00:00)
shankari commented 7 years ago

OK so this is the real reset. The other one seems to be the run on the 4th. Started in June as expected.

2017-07-05 16:33:22,372:INFO:140522484148032:**********UUID 96af3842-d5fb-4f13-aea0-726efaeba6ea: moving to long term**********
2017-07-05 16:33:22,399:DEBUG:140522484148032:result = {'updatedExisting': True, u'nModified': 1, u'ok': 1, u'n': 1} after updating read timestamp
2017-07-05 16:33:22,943:DEBUG:140522484148032:Found 2108 messages in response to query {'$or': [{'metadata.type': 'message'}, {'metadata.type': 'sensor-data'}, {'metadata.type': 'rw-document'}], 'user_id': UUID('96af3842-d5fb-4f13-aea0-726efaeba6ea')}
2017-07-05 16:33:22,945:INFO:140522484148032:For stage PipelineStages.USERCACHE, start_ts = 2017-06-06T19:57:56.402000

Copied a bunch of motion_activity entries (as expected).

2017-07-05 16:33:22,943:DEBUG:140522484148032:Found 2108 messages in response to query {'$or': [{'metadata.type': 'message'}, {'metadata.type': 'sensor-data'}, {'metadata.type': 'rw-document'}], 'user_id': UUID('96af3842-d5fb-4f13-aea0-726efaeba6ea')}
2017-07-05 16:33:23,216:DEBUG:140522484148032:Found 1881 messages in response to query {'$or': [{'metadata.type': 'message'}, {'metadata.type': 'sensor-data'}, {'metadata.type': 'rw-document'}], 'metadata.write_ts': {'$lte': 1499272397.945277, '$gte': 1496779076.402}, 'user_id': UUID('96af3842-d5fb-4f13-aea0-726efaeba6ea')}
2017-07-05 16:33:23,216:DEBUG:140522484148032:module_name = emission.net.usercache.formatters.android.motion_activity
2017-07-05 16:33:23,218:DEBUG:140522484148032:write_ts = 1497115303.38
2017-07-05 16:33:23,219:DEBUG:140522484148032:module_name = emission.net.usercache.formatters.android.motion_activity
2017-07-05 16:33:23,220:DEBUG:140522484148032:write_ts = 1497115303.46
2017-07-05 16:33:23,221:DEBUG:140522484148032:module_name = emission.net.usercache.formatters.android.motion_activity
2017-07-05 16:33:23,222:DEBUG:140522484148032:write_ts = 1497115333.28

Ran through the rest of the pipeline, and ended up with trips upto the 16th again. It looks like we didn't get any data uploaded after the 16th? I'm going to look at the test phone as well, and then start looking at phone logs.

2017-07-05 16:34:04,743:INFO:140522484148032:**********UUID 96af3842-d5fb-4f13-aea0-726efaeba6ea: segmenting into trips**********
2017-07-05 16:34:04,745:INFO:140522484148032:For stage PipelineStages.TRIP_SEGMENTATION, start_ts = 2017-06-06T20:02:56.402000
...
2017-07-05 16:43:26,963:DEBUG:140522484148032:Considering trip 595d16d788f6634e616b6edc: 2017-06-16T07:56:31-07:00 -> 2017-06-16T07:59:27-07:00 
2017-07-05 16:43:26,964:DEBUG:140522484148032:Considering trip 595d16d788f6634e616b6ee5: 2017-06-16T08:13:35.705292-07:00 -> 2017-06-16T10:35:00-07:00 
2017-07-05 16:43:26,964:DEBUG:140522484148032:Considering trip 595d16db88f6634e616b700b: 2017-06-16T10:44:10.215876-07:00 -> 2017-06-16T13:14:03-07:00 
shankari commented 7 years ago

For Tom and Manonmani, this is because tracking was turned off, although the tracking slider showed that it was on. I have documented the bug with the tracking slider in https://github.com/e-mission/e-mission-phone/issues/238

The only one left now is the second test phone. I'm going to fix the habitica timestamp before doing any further investigation.

shankari commented 7 years ago

wrt https://github.com/e-mission/e-mission-server/issues/497#issuecomment-313427524, another reason for making a separate script is that not all tasks will use timestamps as part of their state.

shankari commented 7 years ago

Habitica timestamps fixed in #507

shankari commented 7 years ago

Next, run the new habitica timestamp reset pipeline

./e-mission-py.bash bin/ext_service/reset_habitica_timestamps.py -p android -- 2017-06-07

and separately for me

./e-mission-py.bash bin/ext_service/reset_habitica_timestamps.py -u ea59084e-11d4-4076-9252-3b9a29ce35e0 -- 2016-12-10

Doesn't appear to have any errors!