e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

Improve iOS smoothing of activities #166

Open shankari opened 8 years ago

shankari commented 8 years ago

In particular, we currently create a lot of very small segments. This leads to spurious heatmaps, confused users, and problems with cleaning. For example, some of these very small segments contain zero or one points. If a segment has zero points, we skip the raw_section creation, but we still create the section if it has exactly one point.

It is unclear what a one-point section gives us, but it turns out that we cannot interpolate it:

  File "/Users/shankari/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 83, in save_cleaned_segments_for_timeline
    filtered_trip = get_filtered_trip(ts, trip)
  File "/Users/shankari/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 131, in get_filtered_trip
    section_map[section.get_id()] = get_filtered_section(filtered_trip_entry, section)
  File "/Users/shankari/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 198, in get_filtered_section
    with_speeds_df = get_filtered_points(section, filtered_section_data)
  File "/Users/shankari/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 267, in get_filtered_points
    resampled_loc_df = resample(filtered_loc_list, interval=30)
  File "/Users/shankari/e-mission/e-mission-server/emission/analysis/intake/cleaning/clean_and_resample.py", line 320, in resample
    fill_value='extrapolate')
  File "/Users/shankari/OSS/anaconda/lib/python2.7/site-packages/scipy/interpolate/interpolate.py", line 481, in __init__
    "least %d entries" % minval)
ValueError: x and y arrays must have at least 2 entries
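
The failure above comes down to linear interpolation needing at least two samples. A minimal sketch of one possible guard is shown below, in pure Python rather than SciPy so it runs standalone. The names `resample_linear`, `resample_or_passthrough`, and `MIN_POINTS` are hypothetical and are not part of `clean_and_resample.py`; whether a short section should be passed through, merged, or dropped is a design decision this sketch does not settle.

```python
from bisect import bisect_right

# Linear interpolation needs at least two samples to define a segment.
MIN_POINTS = 2


def resample_linear(ts, values, interval=30):
    """Linearly interpolate `values` every `interval` seconds.

    Assumes `ts` is strictly increasing. Raises ValueError for fewer
    than MIN_POINTS samples, mirroring the error in the traceback.
    """
    if len(ts) < MIN_POINTS:
        raise ValueError("need at least %d points, got %d" % (MIN_POINTS, len(ts)))
    out = []
    t = ts[0]
    while t <= ts[-1]:
        # Index of the segment [ts[i], ts[i + 1]] that contains t.
        i = min(bisect_right(ts, t) - 1, len(ts) - 2)
        frac = (t - ts[i]) / float(ts[i + 1] - ts[i])
        out.append((t, values[i] + frac * (values[i + 1] - values[i])))
        t += interval
    return out


def resample_or_passthrough(ts, values, interval=30):
    """Hypothetical guard: return short sections unchanged instead of failing."""
    if len(ts) < MIN_POINTS:
        return list(zip(ts, values))
    return resample_linear(ts, values, interval)
```

With this guard, a one-point section is returned as-is and a zero-point section yields an empty list, so the caller can decide what to do with it instead of crashing mid-pipeline.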

shankari commented 8 years ago

We can also get zero-point sections. These had points after segmentation, but the points were deleted later as part of smoothing (wait, what?!)

2016-04-26 01:12:07,695:DEBUG:Getting filtered points for section Entry({u'user_id': UUID
('e471711e-bd14-3dbe-80b6-9c7d92ecc296'), u'_id': ObjectId('57104c6f88f6632e8c4d6f93'), u
'data': {u'user_id': UUID('e471711e-bd14-3dbe-80b6-9c7d92ecc296'), u'sensed_mode': 2, u's
tart_loc': {u'type': u'Point', u'coordinates': [-122.2585649, 37.8755499]}, u'end_ts': 14
60680775, u'start_ts': 1460680760.115, u'start_fmt_time': u'2016-04-14T17:39:20.115000-07
:00', u'end_loc': {u'type': u'Point', u'coordinates': [-122.2591605, 37.875379]}, u'sourc
e': u'SmoothedHighConfidenceMotion', u'end_fmt_time': u'2016-04-14T17:39:35-07:00', u'end
_local_dt': {u'hour': 17, u'month': 4, u'second': 35, u'weekday': 3, u'year': 2016, u'tim
ezone': u'America/Los_Angeles', u'day': 14, u'minute': 39}, u'duration': 14.8849999904632
57, u'_id': ObjectId('57104c6f88f6632e8c4d6f93'), u'trip_id': ObjectId('57104c6f88f6632e8
c4d6f91'), u'start_local_dt': {u'hour': 17, u'month': 4, u'second': 20, u'weekday': 3, u'
year': 2016, u'timezone': u'America/Los_Angeles', u'day': 14, u'minute': 39}}, u'metadata
': {u'write_fmt_time': u'2016-04-25T06:35:14.355957-07:00', u'write_ts': 1461591314.35595
7, u'time_zone': u'America/Los_Angeles', u'platform': u'server', u'write_local_dt': {u'ho
ur': 6, u'month': 4, u'second': 14, u'weekday': 0, u'year': 2016, u'timezone': u'America/
Los_Angeles', u'day': 25, u'minute': 35}, u'key': u'segmentation/raw_section'}})
2016-04-26 01:12:07,721:DEBUG:curr_query = {'$or': [{'metadata.key': 'background/filtered
_location'}], 'user_id': UUID('e471711e-bd14-3dbe-80b6-9c7d92ecc296'), 'data.ts': {'$lte'
: 1460680775, '$gte': 1460680760.115}}, sort_key = data.ts
2016-04-26 01:12:07,721:DEBUG:orig_ts_db_keys = ['background/filtered_location'], analysi
s_ts_db_keys = []
2016-04-26 01:12:07,726:DEBUG:deleting 5 points from section points
2016-04-26 01:12:07,727:DEBUG:Resampling entry list Empty DataFrame
Columns: []
Index: [] of size 0
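
The log shows a section whose points were all removed ("deleting 5 points") before resampling, leaving an empty DataFrame. A minimal sketch of checking for this after point removal is shown below. The helper names and the point timestamps are made up for illustration; only the 5-point count comes from the log, and the actual smoothing code may identify points differently.

```python
def remove_flagged_points(points, flagged_ts):
    """Return the points whose 'ts' is not in `flagged_ts` (hypothetical helper)."""
    flagged = set(flagged_ts)
    return [p for p in points if p["ts"] not in flagged]


def classify_section(points, min_points=2):
    """Label a section by how many points survived smoothing."""
    if len(points) == 0:
        return "empty"
    if len(points) < min_points:
        return "too_short"
    return "ok"


# Illustrative timestamps only: five points, all flagged by smoothing.
section_points = [{"ts": 1460680760 + 3 * i} for i in range(5)]
flagged = [p["ts"] for p in section_points]

remaining = remove_flagged_points(section_points, flagged)
status = classify_section(remaining)  # "empty": nothing left to resample
```

Running the check right after deletion would let the pipeline skip or merge such sections before they reach the resampling step.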