e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

no metrics on the dashboard #343

Closed PatGendre closed 5 years ago

PatGendre commented 5 years ago

Hi, yesterday we hacked around (but did not properly fix) the "é" encoding issue identified in issue #333 so that we could transfer the data from the phone to the server. This worked: we could run the pipeline and got analysis data (such as cleaned and inferred sections), visible in the diary and in Mongo. On the app, however, no metrics are shown for the newly imported days of location data.

Here is the log:

2019-03-19 11:05:18,880:DEBUG:140572423100160:START POST /result/metrics/timestamp
2019-03-19 11:05:18,881:DEBUG:140572423100160:methodName = skip, returning <class 'emission.net.auth.skip.SkipMethod'>
2019-03-19 11:05:18,881:DEBUG:140572423100160:Using the skip method to verify id token patgendre of length 9
2019-03-19 11:05:18,882:DEBUG:140572423100160:retUUID = 91bc554a-4228-4afd-b2ca-be8eb7e33760
2019-03-19 11:05:18,883:DEBUG:140572423100160:metric_list = ['duration', 'median_speed', 'count', 'distance']
2019-03-19 11:05:18,883:DEBUG:140572423100160:['duration -> <function get_duration at 0x7fd991e040d0>', 'median_speed -> <function get_median_speed at 0x7fd991e04158>', 'count -> <function get_count at 0x7fd991e7cf28>', 'distance -> <function get_distance at 0x7fd991e04048>']
2019-03-19 11:05:18,883:DEBUG:140572423100160:for user 91bc554a-4228-4afd-b2ca-be8eb7e33760, returning timeseries <emission.storage.timeseries.builtin_timeseries.BuiltinTimeSeries object at 0x7fd99050c470>
2019-03-19 11:05:18,886:DEBUG:140572423100160:curr_query = {'invalid': {'$exists': False}, 'user_id': UUID('91bc554a-4228-4afd-b2ca-be8eb7e33760'), '$or': [{'metadata.key': 'analysis/inferred_section'}], 'data.start_ts': {'$lte': 1551999600, '$gte': 1551654000}}, sort_key = data.start_ts
2019-03-19 11:05:18,887:DEBUG:140572115326720:START POST /result/metrics/timestamp
2019-03-19 11:05:18,888:DEBUG:140572115326720:metric_list = ['duration', 'median_speed', 'count', 'distance']
2019-03-19 11:05:18,889:DEBUG:140572115326720:['duration -> <function get_duration at 0x7fd991e040d0>', 'median_speed -> <function get_median_speed at 0x7fd991e04158>', 'count -> <function get_count at 0x7fd991e7cf28>', 'distance -> <function get_distance at 0x7fd991e04048>']
2019-03-19 11:05:18,889:DEBUG:140572115326720:for user None, returning timeseries <emission.storage.timeseries.aggregate_timeseries.AggregateTimeSeries object at 0x7fd9902c4e80>
2019-03-19 11:05:18,889:DEBUG:140572115326720:curr_query = {'invalid': {'$exists': False}, '$or': [{'metadata.key': 'analysis/inferred_section'}], 'data.start_ts': {'$lte': 1551999600, '$gte': 1551654000}}, sort_key = None
2019-03-19 11:05:18,889:DEBUG:140572115326720:orig_ts_db_keys = [], analysis_ts_db_keys = ['analysis/inferred_section']
2019-03-19 11:05:18,888:DEBUG:140572423100160:orig_ts_db_keys = [], analysis_ts_db_keys = ['analysis/inferred_section']
2019-03-19 11:05:19,112:DEBUG:140572115326720:finished querying values for [], count = 0
2019-03-19 11:05:19,114:DEBUG:140572115326720:finished querying values for ['analysis/inferred_section'], count = 19
2019-03-19 11:05:19,119:DEBUG:140572423100160:finished querying values for [], count = 0
2019-03-19 11:05:19,125:DEBUG:140572423100160:finished querying values for ['analysis/inferred_section'], count = 0
2019-03-19 11:05:19,306:DEBUG:140572423100160:orig_ts_db_matches = 0, analysis_ts_db_matches = 0
2019-03-19 11:05:19,315:DEBUG:140572115326720:orig_ts_db_matches = 0, analysis_ts_db_matches = 19
2019-03-19 11:05:19,469:DEBUG:140572423100160:Found 0 results
2019-03-19 11:05:19,469:DEBUG:140572423100160:Returning entry with length 0 result
2019-03-19 11:05:19,470:INFO:140572423100160:Found no entries for user 91bc554a-4228-4afd-b2ca-be8eb7e33760, time_query <emission.storage.timeseries.timequery.TimeQuery object at 0x7fd9a163fe10>
2019-03-19 11:05:19,470:DEBUG:140572423100160:END POST /result/metrics/timestamp 91bc554a-4228-4afd-b2ca-be8eb7e33760 0.5899531841278076 
2019-03-19 11:05:19,471:DEBUG:140572423100160:insert called with entry of type <class 'emission.core.wrapper.entry.Entry'>
2019-03-19 11:05:19,471:DEBUG:140572423100160:entry was fine, no need to fix it
2019-03-19 11:05:19,471:DEBUG:140572423100160:Inserting entry Entry({'_id': ObjectId('5c90bedf4e276dc4d28b291c'), 'user_id': UUID('91bc554a-4228-4afd-b2ca-be8eb7e33760'), 'metadata': Metadata({'key': 'stats/server_api_time', 'platform': 'server', 'write_ts': 1552989919.4707994, 'time_zone': 'America/Los_Angeles', 'write_local_dt': LocalDate({'year': 2019, 'month': 3, 'day': 19, 'hour': 3, 'minute': 5, 'second': 19, 'weekday': 1, 'timezone': 'America/Los_Angeles'}), 'write_fmt_time': '2019-03-19T03:05:19.470799-07:00'}), 'data': {'name': 'POST_/result/metrics/timestamp', 'ts': 1552989919.470431, 'reading': 0.5899531841278076}}) into timeseries
2019-03-19 11:05:19,477:DEBUG:140572115326720:Found 19 results
2019-03-19 11:05:19,479:DEBUG:140572423100160:insert called with entry of type <class 'emission.core.wrapper.entry.Entry'>
2019-03-19 11:05:19,479:DEBUG:140572423100160:entry was fine, no need to fix it
2019-03-19 11:05:19,480:DEBUG:140572423100160:Inserting entry Entry({'_id': ObjectId('5c90bedf4e276dc4d28b291d'), 'user_id': UUID('91bc554a-4228-4afd-b2ca-be8eb7e33760'), 'metadata': Metadata({'key': 'stats/server_api_time', 'platform': 'server', 'write_ts': 1552989919.4791946, 'time_zone': 'America/Los_Angeles', 'write_local_dt': LocalDate({'year': 2019, 'month': 3, 'day': 19, 'hour': 3, 'minute': 5, 'second': 19, 'weekday': 1, 'timezone': 'America/Los_Angeles'}), 'write_fmt_time': '2019-03-19T03:05:19.479195-07:00'}), 'data': {'name': 'POST_/result/metrics/timestamp_cputime', 'ts': 1552989919.470431, 'reading': 0.5897792791947722}}) into timeseries
2019-03-19 11:05:19,486:DEBUG:140572115326720:After de-duping, converted 19 points to 19 
2019-03-19 11:05:19,487:DEBUG:140572115326720:Returning entry with length 19 result
2019-03-19 11:05:19,491:DEBUG:140572115326720:first row is _id                                                 5c7e2c3f4e276d0fe265a5b3
cleaned_section                                     5c7e2c184e276d0fe265a49a
distance                                                             1259.09
distances                  [0.0, 38.73577858168578, 38.735870635001724, 3...
duration                                                             764.106
end_fmt_time                                2019-03-04T09:03:29.200000+01:00
end_loc                    {'type': 'Point', 'coordinates': [2.3025079, 4...
end_local_dt_day                                                           4
end_local_dt_hour                                                          9
end_local_dt_minute                                                        3
end_local_dt_month                                                         3
end_local_dt_second                                                       29
end_local_dt_timezone                                           Europe/Paris
end_local_dt_weekday                                                       0
end_local_dt_year                                                       2019
end_stop                                            5c7e2c184e276d0fe265a4c7
end_ts                                                           1.55169e+09
metadata_write_ts                                                1.55177e+09
sensed_mode                                                                1
source                                          SmoothedHighConfidenceMotion
speeds                     [0.0, 1.291192619389526, 1.2911956878333908, 1...
start_fmt_time                              2019-03-04T08:50:45.094058+01:00
start_loc                  {'type': 'Point', 'coordinates': [2.3089632, 4...
start_local_dt_day                                                         4
start_local_dt_hour                                                        8
start_local_dt_minute                                                     50
start_local_dt_month                                                       3
start_local_dt_second                                                     45
start_local_dt_timezone                                         Europe/Paris
start_local_dt_weekday                                                     0
start_local_dt_year                                                     2019
start_stop                                                               NaN
start_ts                                                         1.55169e+09
trip_id                                             5c7e2c174e276d0fe265a498
user_id                                 4b890037-7198-4d08-9baf-4661a2432533
Name: 0, dtype: object
2019-03-19 11:05:19,558:DEBUG:140572115326720:END POST /result/metrics/timestamp  0.6710357666015625 
shankari commented 5 years ago

Hi, yesterday we hacked around (but did not properly fix) the "é" encoding issue identified in issue #333 so that we could transfer the data from the phone to the server.

@PatGendre did you do this by writing a script to parse the data from the usercacheDB and save it to the database? If so, would you consider contributing it to the server code? It is a nice workaround/first step for issue #326; people could collect data for a few days, email their usercacheDB to themselves and load it on a server running on their own laptop and experiment with it.

I know that at least one project (the cci project) wanted to download user data onto researcher laptops (they were recruiting in person). It turns out that you can't actually export the database to an externally accessible file on iOS, so we dropped the idea. A secondary reason was that we also didn't have the ability to import directly from the usercacheDB; if that had existed at the time, they might have settled on some approach that involved users emailing their data to them.
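For reference, a rough sketch of what such an import script could look like, assuming the exported usercacheDB is a SQLite file with a userCache table carrying a JSON data column, and a server using the default Stage_database / Stage_usercache names; the real phone schema and field names may well differ:

import json
import sqlite3
import uuid

import pymongo

def import_usercache(sqlite_path, user_uuid, mongo_url="mongodb://localhost/Stage_database"):
    # Hypothetical phone-side schema: userCache(write_ts, type, key, plugin, data)
    conn = sqlite3.connect(sqlite_path)
    rows = conn.execute("SELECT write_ts, type, key, plugin, data FROM userCache")
    db = pymongo.MongoClient(mongo_url).get_default_database()
    entries = []
    for write_ts, entry_type, key, plugin, data in rows:
        entries.append({
            "user_id": uuid.UUID(user_uuid),
            "metadata": {"write_ts": write_ts, "type": entry_type,
                         "key": key, "plugin": plugin},
            "data": json.loads(data),  # this is where the 'é' decoding error would surface
        })
    if entries:
        db["Stage_usercache"].insert_many(entries)
    conn.close()

Once the entries land in the usercache collection, running the regular intake pipeline should move them into the timeseries, just as if the phone had synced them.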

shankari commented 5 years ago

Regarding your actual issue, note that there are two calls to /result/metrics/timestamp. The first call, serviced in thread 140572423100160, was the one for your own (user-specific) metrics; it returned no data.

2019-03-19 11:05:18,880:DEBUG:140572423100160:START POST /result/metrics/timestamp
2019-03-19 11:05:18,881:DEBUG:140572423100160:methodName = skip, returning <class 'emission.net.auth.skip.SkipMethod'>
2019-03-19 11:05:18,881:DEBUG:140572423100160:Using the skip method to verify id token patgendre of length 9
2019-03-19 11:05:18,882:DEBUG:140572423100160:retUUID = 91bc554a-4228-4afd-b2ca-be8eb7e33760
2019-03-19 11:05:18,883:DEBUG:140572423100160:metric_list = ['duration', 'median_speed', 'count', 'distance']
2019-03-19 11:05:18,883:DEBUG:140572423100160:['duration -> <function get_duration at 0x7fd991e040d0>', 'median_speed -> <function get_median_speed at 0x7fd991e04158>', 'count -> <function get_count at 0x7fd991e7cf28>', 'distance -> <function get_distance at 0x7fd991e04048>']
2019-03-19 11:05:18,883:DEBUG:140572423100160:for user 91bc554a-4228-4afd-b2ca-be8eb7e33760, returning timeseries <emission.storage.timeseries.builtin_timeseries.BuiltinTimeSeries object at 0x7fd99050c470>
2019-03-19 11:05:18,886:DEBUG:140572423100160:curr_query = {'invalid': {'$exists': False}, 'user_id': UUID('91bc554a-4228-4afd-b2ca-be8eb7e33760'), '$or': [{'metadata.key': 'analysis/inferred_section'}], 'data.start_ts': {'$lte': 1551999600, '$gte': 1551654000}}, sort_key = data.start_ts
2019-03-19 11:05:18,888:DEBUG:140572423100160:orig_ts_db_keys = [], analysis_ts_db_keys = ['analysis/inferred_section']
2019-03-19 11:05:19,119:DEBUG:140572423100160:finished querying values for [], count = 0
2019-03-19 11:05:19,125:DEBUG:140572423100160:finished querying values for ['analysis/inferred_section'], count = 0
2019-03-19 11:05:19,306:DEBUG:140572423100160:orig_ts_db_matches = 0, analysis_ts_db_matches = 0
2019-03-19 11:05:19,469:DEBUG:140572423100160:Found 0 results
2019-03-19 11:05:19,469:DEBUG:140572423100160:Returning entry with length 0 result
2019-03-19 11:05:19,470:INFO:140572423100160:Found no entries for user 91bc554a-4228-4afd-b2ca-be8eb7e33760, time_query <emission.storage.timeseries.timequery.TimeQuery object at 0x7fd9a163fe10>
2019-03-19 11:05:19,470:DEBUG:140572423100160:END POST /result/metrics/timestamp 91bc554a-4228-4afd-b2ca-be8eb7e33760 0.5899531841278076 

Are you sure the timestamps are correct? I believe they are in UTC, although since you are in France, they should be pretty close to your actual time.

shankari commented 5 years ago

According to the query, the range that you were searching for was

In [2]: arrow.get(1551999600).to("Europe/Paris")
Out[2]: <Arrow [2019-03-08T00:00:00+01:00]>

In [3]: arrow.get(1551654000).to("Europe/Paris")
Out[3]: <Arrow [2019-03-04T00:00:00+01:00]>

Can you look through your data and confirm that you have analysis/inferred_section objects in that range? The log has the concrete query that was run.

2019-03-19 11:05:18,886:DEBUG:140572423100160:curr_query = {'invalid': {'$exists': False}, 'user_id': UUID('91bc554a-4228-4afd-b2ca-be8eb7e33760'), '$or': [{'metadata.key': 'analysis/inferred_section'}], 'data.start_ts': {'$lte': 1551999600, '$gte': 1551654000}}, sort_key = data.start_ts
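A quick way to confirm this is to run the same query with pymongo, for example against the default Stage_database / Stage_analysis_timeseries names used by e-mission-server (adjust if your setup differs):

import uuid
import pymongo

db = pymongo.MongoClient("localhost")["Stage_database"]
query = {
    "invalid": {"$exists": False},
    "user_id": uuid.UUID("91bc554a-4228-4afd-b2ca-be8eb7e33760"),
    "metadata.key": "analysis/inferred_section",
    # 2019-03-04T00:00 .. 2019-03-08T00:00 Europe/Paris, as in the query above
    "data.start_ts": {"$gte": 1551654000, "$lte": 1551999600},
}
print(db["Stage_analysis_timeseries"].count_documents(query))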
PatGendre commented 5 years ago

Hi, yesterday we hacked around (but did not properly fix) the "é" encoding issue identified in issue #333 so that we could transfer the data from the phone to the server.

@PatGendre did you do this by writing a script to parse the data from the usercacheDB and save it to the database? If so, would you consider contributing it to the server code? It is a nice workaround/first step for issue #326; people could collect data for a few days, email their usercacheDB to themselves and load it on a server running on their own laptop and experiment with it.

Sorry, it was really a dirty hack: adding b = b.replace(b'\xe9', b'e') in bottle.py (just before return json_loads(b), line 1294) so as to avoid the JSON exception...
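For context, a standalone illustration (not the actual bottle code) of what that byte-level substitution does before JSON parsing:

import json

raw = b'{"label": "caf\xe9"}'           # Latin-1 'é' byte, invalid as UTF-8 JSON
try:
    json.loads(raw)
except ValueError as e:                  # UnicodeDecodeError is a subclass of ValueError
    print("parse failed:", e)

patched = raw.replace(b'\xe9', b'e')     # the hack: degrade 'é' to a plain 'e'
print(json.loads(patched))               # {'label': 'cafe'}

The proper fix, of course, is to make sure the body reaches the server as valid UTF-8; the replacement just drops the accent so that json_loads stops failing.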

PatGendre commented 5 years ago

Hi, I have no access to the app right now because I use the UNSW version, but I checked in Mongo: there are indeed no inferred sections from March 4 to March 8, and more generally no data in the analysis DB until March 18. There is data in the timeseries DB in early March, including locations, but the pipeline apparently did not turn these data into analysis data.

PatGendre commented 5 years ago

So when we re-run the pipeline, I'll check again in the logs why the location data for the first half of March is not turned into analysis data...

shankari commented 5 years ago

@PatGendre yes. As you saw from the pipeline documentation, the pipeline state keeps track of how far the input data has been processed. If, for some reason, your last_processed_ts was mid-March, the earlier data would not be processed; you would have to reset the pipeline for it to be picked up.

And of course, if that is indeed the case, figure out why your last_processed_ts was mid-March...
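A minimal sketch of how the per-stage state could be inspected, assuming the default Stage_database / Stage_pipeline_state names used by e-mission-server and the field names mentioned in this thread (adjust if your schema differs):

import uuid

import arrow
import pymongo

db = pymongo.MongoClient("localhost")["Stage_database"]
user_id = uuid.UUID("91bc554a-4228-4afd-b2ca-be8eb7e33760")

def fmt(ts):
    # pipeline timestamps are epoch seconds; show them in local time for readability
    return arrow.get(ts).to("Europe/Paris") if ts is not None else None

for state in db["Stage_pipeline_state"].find({"user_id": user_id}).sort("pipeline_stage"):
    print(state.get("pipeline_stage"),
          "last_ts_run =", fmt(state.get("last_ts_run")),
          "last_processed_ts =", fmt(state.get("last_processed_ts")))

If the state really is ahead of the imported data, check whether your e-mission-server checkout includes bin/reset_pipeline.py, which is meant to rewind it.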

PatGendre commented 5 years ago

OK, in the database the last_ts_run is March 26 for all 8 stages, and the last_processed_ts is March 21, except for stages 6 (ACCURACY_FILTERING) and 9 (OUTPUT_GEN), for which it is null.

shankari commented 5 years ago

@PatGendre were you able to re-run the pipeline? Or if you can't reproduce, maybe close this issue?

PatGendre commented 5 years ago

@shankari I am closing the issue; it is difficult to re-run the pipeline to reproduce the problem, as the data were accidentally erased from MongoDB on the 27th (Russian attack) :-( I'll try to check in the coming days whether the metrics look OK.