e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

Trip daily tab becomes unresponsive when there are many trips #461

Closed asiripanich closed 5 years ago

asiripanich commented 5 years ago

Hi @shankari

One of our users has reported that his trip daily tab, for a particular day on which he made a lot of trips, is so unresponsive that he couldn't scroll down on the tab at all. He uses a Samsung Galaxy 9+. Additionally, the UI elements in his profile tab are disorganised. Have any of the groups experienced something similar?

shankari commented 5 years ago

@PatGendre had a user with lots of trips (https://github.com/e-mission/e-mission-docs/issues/446#issuecomment-531122355). @PatGendre did the user say that the UX became unresponsive after you disabled storage to the usercache?

@asiripanich can the user email me their logs so I can see if there are any errors there that could correspond to the UI elements getting messed up?

shankari commented 5 years ago

@asiripanich any updates on this? If the user can't give us their logs, if we get a sense of how many is "a lot", we can use the fake trip generator to generate that many trips and then see how long it takes to load them.

Since @PatGendre's user had MB of data for the "lots of trips" case, I wonder if the issue is that your outgoing message is being truncated in some way.

asiripanich commented 5 years ago

Hi @shankari, I will try to get the user's log by next Tuesday. It's a long weekend here in Australia at the moment. :)

But I think synthetic trips should work too.

shankari commented 5 years ago

@asiripanich the synthetic trip generator is here https://github.com/e-mission/e-mission-server/blob/master/Data%20Generator%20-%20Demo.ipynb

Couple of caveats:

This was working as of the end of Spring 2019; if you run into any unexpected issues, report them here and I'll respond ASAP.

Thanks to @alvinalexander for contributing the trip generation code!

asiripanich commented 5 years ago

@shankari thanks again for the test notebook.

@atton16 could you try this and let us know.

PatGendre commented 5 years ago

@PatGendre did the user say that the UX became unresponsive after you disabled storage to the usercache?

No, the slow UX response is a separate issue; it was not related to the 2 days with too many trips, which produced a document too big to be stored in the mongodb user cache. @shankari (sorry for the late reply, I did not see your question)

shankari commented 5 years ago

@asiripanich @atton16 actually I just remembered that the trip generator notebook assumes that the authentication on the server is set to skip, since we didn't want to deal with google authentication in python.

The long-term solution is to support google auth via python, which will also make it easier for users to export their own data.

Unfortunately, I don't have time to implement that integration this week. It looks like the standard google-auth library (https://google-auth.readthedocs.io/en/latest/) does not support obtaining user credentials.

This library provides no support for obtaining user credentials, but does provide limited support for using user credentials. You can use libraries such as oauthlib to obtain the access token.

A short-term solution is to set up a temporary staging server with the same configuration as the production server but with authentication set to "skip". You would need to connect to this server using the devapp since the emTripLog only connects to your production server.

Sorry this is so complicated; authentication is always tricky, especially in python as part of OpenID Connect, which is designed for a web-based workflow. That's why I have put off the work so far :(

atton16 commented 5 years ago

I have tried the data generation demo.

It appears that the code does push data to the server, but the data won't display on the devapp. I double-checked the database: I see user data and trip data, but none of the trips are displayed in the app. The only thing that works is signing in with the username fake_user_129.

Here are the trips displayed with the data generator code after pipeline ran.

  raw_trips['start_coord'] = raw_trips["start_loc"].apply(lambda x : dict(x)['coordinates'])
=== Trip: {"coordinates": [-122.4002, 37.77302], "type": "Point"} -> {"coordinates": [-122.27381, 37.87119], "type": "Point"}
  --- Section: {"coordinates": [-122.4002, 37.77302], "type": "Point"} -> {"coordinates": [-122.27381, 37.87119], "type": "Point"}  on  MotionTypes.IN_VEHICLE
=== Trip: {"coordinates": [-122.4002, 37.77302], "type": "Point"} -> {"coordinates": [-122.14091, 37.42872], "type": "Point"}
  --- Section: {"coordinates": [-122.4002, 37.77302], "type": "Point"} -> {"coordinates": [-122.18635, 37.45734], "type": "Point"}  on  MotionTypes.IN_VEHICLE
  --- Section: {"coordinates": [-122.16473, 37.44352], "type": "Point"} -> {"coordinates": [-122.18234, 37.45482], "type": "Point"}  on  MotionTypes.AIR_OR_HSR
  --- Section: {"coordinates": [-122.16434, 37.44331], "type": "Point"} -> {"coordinates": [-122.16456, 37.44311], "type": "Point"}  on  MotionTypes.AIR_OR_HSR
  --- Section: {"coordinates": [-122.16456, 37.44311], "type": "Point"} -> {"coordinates": [-122.14023, 37.42804], "type": "Point"}  on  MotionTypes.BICYCLING
  --- Section: {"coordinates": [-122.14096, 37.42849], "type": "Point"} -> {"coordinates": [-122.14097, 37.4286], "type": "Point"}  on  MotionTypes.WALKING
  --- Section: {"coordinates": [-122.14097, 37.4286], "type": "Point"} -> {"coordinates": [-122.1414, 37.42903], "type": "Point"}  on  MotionTypes.BICYCLING
  --- Section: {"coordinates": [-122.1414, 37.42903], "type": "Point"} -> {"coordinates": [-122.14091, 37.42872], "type": "Point"}  on  MotionTypes.WALKING
shankari commented 5 years ago

@atton16 which server are you using? Please see https://github.com/e-mission/e-mission-docs/issues/461#issuecomment-539540629

atton16 commented 5 years ago

@atton16 which server are you using? Please see #461 (comment)

A staging server, with the devapp connecting to it using dummy-auth.

atton16 commented 5 years ago

Please note that I can still log in and use the test data from test_july_22 just fine.

shankari commented 5 years ago

so if I understand correctly:

Is that correct? I suspect it is due to the pipeline. Note that the pipeline is run using a shell command !./e-mission-py.bash bin/debug/intake_single_user.py -e 'fake_user_129' which works fine if you test this against localhost, as the default does, but it won't work against a remote server.

you need to run the pipeline on the staging server

Note that the default printout of the trips also reads data from the local database

ts = esta.TimeSeries.get_time_series(fake_user_id)
entry_it = ts.find_entries(["analysis/cleaned_trip"], time_query=None)
shankari commented 5 years ago

you can also print the trips generated by fake trip generator before they are posted, around here

    temp = fake_user.take_trip()
    print('# of location measurements:', len(temp))
    measurements.append(temp)

to know what to expect wrt dates

atton16 commented 5 years ago

I can see that after I ran the pipeline, the pipeline state objects come up like so.

Screen Shot 2019-10-09 at 02 01 52
shankari commented 5 years ago

so is the staging server running on your localhost? I thought it would be running in the cloud somewhere else :)

atton16 commented 5 years ago

so is the staging server running on your localhost?

I tunneled the mongo connection to localhost so I can connect robomongo for inspection. (just to get through the firewall)

shankari commented 5 years ago

ok so then I guess running the pipeline locally would work too :) Check the dates on those trips that you see, IIRC the data generator generates trips a year ago or something

atton16 commented 5 years ago

I think it uses today as the date. Regardless of the time difference, trips are still not displayed when I pick Oct 7, 8 or 9.

Screen Shot 2019-10-09 at 02 07 53
shankari commented 5 years ago

@atton16, that entry is for the server API stats (stats/server_api_time). You need to find the location, cleaned trip, or cleaned section entries. Please see https://github.com/e-mission/e-mission-server/blob/master/Timeseries_Sample.ipynb for an example of how to access other streams, or look at the last two cells in the DataGenerator to see how it retrieves the trips and prints out trip details.
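To check which streams a user actually has data for, one option is to tally entries by their metadata key. This is a minimal, self-contained sketch rather than the server's own code: the sample documents are hypothetical, and of the key names only `stats/server_api_time` and `analysis/cleaned_trip` appear in this thread (`background/location` is an assumption).

```python
from collections import Counter


def count_entries_by_key(entries):
    """Return a Counter of how many entries exist for each metadata key."""
    return Counter(entry["metadata"]["key"] for entry in entries)


# Hypothetical documents shaped like timeseries entries (metadata + data).
sample_entries = [
    {"metadata": {"key": "stats/server_api_time"}, "data": {}},
    {"metadata": {"key": "analysis/cleaned_trip"}, "data": {}},
    {"metadata": {"key": "analysis/cleaned_trip"}, "data": {}},
    {"metadata": {"key": "background/location"}, "data": {}},  # assumed key name
]

counts = count_entries_by_key(sample_entries)
for key, n in sorted(counts.items()):
    print(f"{key}: {n}")
```

If the tally shows only stats entries and no cleaned trips, the pipeline has probably not processed that user's data yet.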

atton16 commented 5 years ago

I think I found 2 trips in the database.

Screen Shot 2019-10-09 at 02 18 46
shankari commented 5 years ago

so look at their dates. What you have expanded here is the metadata, not the data
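The difference between the two can be sketched as follows: the metadata records when the entry was written, while the data records when the trip happened. This is only an illustration; the field names `write_ts` and `start_ts` and the sample values are assumptions, not copied from the database.

```python
from datetime import datetime, timezone


def entry_times(entry):
    """Return (time the entry was written, time the trip started) in UTC."""
    write_time = datetime.fromtimestamp(entry["metadata"]["write_ts"], tz=timezone.utc)
    start_time = datetime.fromtimestamp(entry["data"]["start_ts"], tz=timezone.utc)
    return write_time, start_time


# Hypothetical cleaned trip: written in October 2019, but the trip itself
# is dated about a year earlier.
sample_trip = {
    "metadata": {
        "key": "analysis/cleaned_trip",
        "write_ts": datetime(2019, 10, 8, 15, 1, tzinfo=timezone.utc).timestamp(),
    },
    "data": {
        "start_ts": datetime(2018, 11, 6, 13, 46, tzinfo=timezone.utc).timestamp(),
    },
}

write_time, start_time = entry_times(sample_trip)
print("written at:     ", write_time.isoformat())
print("trip started at:", start_time.isoformat())
```

The app's date picker would need to be set to the trip's start date, not the write date, for the trip to show up.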

atton16 commented 5 years ago

Oh I think it's future time, in my timezone.

Screen Shot 2019-10-09 at 02 21 24 Screen Shot 2019-10-09 at 02 23 19

(It's 2 AM here.)

shankari commented 5 years ago

That is 2018. As I said

the data generator generates trips a year ago or something

Here's the relevant code https://github.com/e-mission/e-mission-server/blob/ba8f1bb0653d3b43d50fe05e4f4b64ef9c2bdc95/emission/simulation/fake_user.py#L22

atton16 commented 5 years ago

Ugh, my bad. I am sorry...

atton16 commented 5 years ago

Ok, so I got 2 trips. Now I need to modify the generator to generate a lot of trips. I tried rerunning everything many times, but in the end I still got 2 cleaned trips.

shankari commented 5 years ago

No worries. It is still bad that the timezone is set to Los_Angeles. I think this is because the fake generator doesn't set the timezone and it defaults to Los_Angeles in the formatter. You should be able to change that default by editing the code - LMK if that is important and I will send you the link.
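To see why the default timezone matters, here is a small sketch that formats one timestamp in two zones. It does not use the server's formatter; `format_in_zone` is a hypothetical helper, and the Sydney zone is only an assumption about where the users are.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; needs the system tz database


def format_in_zone(ts, tz_name="America/Los_Angeles"):
    """Format a UTC epoch timestamp as ISO 8601 in the named timezone."""
    return datetime.fromtimestamp(ts, tz=ZoneInfo(tz_name)).isoformat()


# One of the leg start times from the generator output, in UTC.
ts = datetime(2018, 11, 7, 1, 28, tzinfo=timezone.utc).timestamp()

print("Los Angeles:", format_in_zone(ts))
print("Sydney:     ", format_in_zone(ts, "Australia/Sydney"))
```

The same instant lands on different calendar days in the two zones, so a trip can appear under a different date than expected in the daily tab.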

atton16 commented 5 years ago

I think I can leave that part for now. What is important is generating a lot of trips to see the scrolling problem.

shankari commented 5 years ago

I think that modifying the generator should be as easy as changing the range here

for _ in range(4):
    """

    """
    temp = fake_user.take_trip()
    print('# of location measurements:', len(temp))
    measurements.append(temp)
    #try:
        #respons = requests.post('server_address', payload={'data':measurements})
    #except:

    #sleep(2)

print('Path:',fake_user._path)
#Run pipeline

Note that the data is still for the same day one year ago, so if you have already run the pipeline once, the newly added data won't be processed. I would just delete the data (staging DB after all), bump up the number of trips and re-run.

atton16 commented 5 years ago

I changed 4 to 120 but the trips are generated on different dates.

Partial logs:

>> Traveling from home to family | Mode of transportation: CAR
*** Leg Start Time: 2018-11-06 13:46:00+00:00
# of location measurements: 243
>> Traveling from family to work | Mode of transportation: CAR
*** Leg Start Time: 2018-11-07 01:28:00+00:00
# of location measurements: 951
>> Traveling from work to family | Mode of transportation: CAR
*** Leg Start Time: 2018-11-07 13:48:00+00:00
# of location measurements: 870
>> Staying at family
# of location measurements: 0
>> Traveling from family to work | Mode of transportation: CAR
*** Leg Start Time: 2018-11-08 02:13:00+00:00
# of location measurements: 951
>> Traveling from work to family | Mode of transportation: CAR
*** Leg Start Time: 2018-11-08 14:33:00+00:00
# of location measurements: 870
>> Staying at family
# of location measurements: 0
>> Traveling from family to work | Mode of transportation: CAR
*** Leg Start Time: 2018-11-09 02:58:00+00:00
# of location measurements: 951
>> Staying at work
# of location measurements: 0
>> Traveling from work to family | Mode of transportation: CAR
*** Leg Start Time: 2018-11-09 15:18:00+00:00
# of location measurements: 870
>> Staying at family
# of location measurements: 0
>> Traveling from family to work | Mode of transportation: CAR
*** Leg Start Time: 2018-11-10 03:43:00+00:00
# of location measurements: 951
>> Traveling from work to home | Mode of transportation: BICYCLE,TRANSIT
*** Leg Start Time: 2018-11-10 17:05:11+00:00
*** Leg Start Time: 2018-11-10 17:07:00+00:00
*** Leg Start Time: 2018-11-10 18:22:01+00:00
# of location measurements: 352
>> Staying at home
# of location measurements: 0
>> Traveling from home to family | Mode of transportation: CAR
*** Leg Start Time: 2018-11-11 05:41:00+00:00
# of location measurements: 243
>> Traveling from family to work | Mode of transportation: CAR
*** Leg Start Time: 2018-11-11 17:23:00+00:00
# of location measurements: 951
>> Staying at work
# of location measurements: 0
>> Traveling from work to family | Mode of transportation: CAR
*** Leg Start Time: 2018-11-12 05:43:00+00:00
# of location measurements: 870
>> Staying at family
# of location measurements: 0
>> Staying at family
# of location measurements: 0
Path: ['home', 'family', 'work', 'home', 'family', 'work', 'family', 'work', 'home', 'family', 'home', 'family', 'work', 'home', 'work', 'home', 'work', 'family', 'work', 'home', 'family', 'work', 'family', 'work', 'home', 'work', 'family', 'work', 'family', 'work', 'family', 'work', 'family', 'work', 'family', 'work', 'family', 'work', 'family', 'work', 'family', 'work', 'home', 'family', 'work', 'family', 'work', 'family', 'work', 'family', 'work', 'family', 'work', 'family', 'work', 'home', 'work', 'home', 'family', 'work', 'family', 'work', 'family', 'work', 'family', 'work', 'home', 'family', 'work', 'family']
49766 entries were sucessfully synced to the server
shankari commented 5 years ago

hm let me look at the code a bit more.

shankari commented 5 years ago

I think this is because the trip generator assumes 3 hours between trips. I do see that the trips that you have are actually more than 3 hours apart, but that is the only setting I can see that controls how far apart the trips are spaced. Try changing it to something like 15-20 mins (so that each one will still be detected as a trip) and see what happens.

https://github.com/e-mission/e-mission-server/blob/ba8f1bb0653d3b43d50fe05e4f4b64ef9c2bdc95/emission/simulation/fake_user.py#L22
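The effect of the gap on how many trips land on a single day can be estimated with a quick simulation. This is a rough sketch that does not call the fake user code; the 30-minute trip duration and the start date are assumptions chosen only for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def trips_per_day(n_trips, gap, trip_duration, start):
    """Count how many simulated trips start on each calendar day."""
    counts = Counter()
    current = start
    for _ in range(n_trips):
        counts[current.date()] += 1
        # The next trip starts after this one ends plus the waiting gap.
        current += trip_duration + gap
    return counts


start = datetime(2018, 11, 6, tzinfo=timezone.utc)  # assumed start date
trip_duration = timedelta(minutes=30)               # assumed trip length

for gap in (timedelta(hours=3), timedelta(minutes=20)):
    counts = trips_per_day(120, gap, trip_duration, start)
    print(f"gap={gap}: {len(counts)} days, "
          f"at most {max(counts.values())} trips on one day")
```

With the longer gap, 120 trips spread over many more days, so no single day gets enough trips to reproduce the scrolling problem.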

shankari commented 5 years ago

You could also add some logs around temp = fake_user.take_trip() to get the start and end timestamp for the newly added trip, just to get a better sense of what is going on.
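A logging helper along these lines might look like the sketch below. It assumes each measurement is a dict with the timestamp at `data.ts`, which is a guess at the generator's output format; `summarize_trip` is a hypothetical name, and the sample trip is synthetic.

```python
from datetime import datetime, timezone


def to_utc(ts):
    """Convert an epoch timestamp in seconds to an aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def summarize_trip(measurements):
    """Return (count, start, end) for a list of location measurements."""
    if not measurements:
        # "Staying at ..." steps produce no measurements.
        return 0, None, None
    timestamps = [m["data"]["ts"] for m in measurements]
    return len(timestamps), to_utc(min(timestamps)), to_utc(max(timestamps))


# Synthetic stand-in for the result of fake_user.take_trip():
# 243 points, 30 seconds apart.
base = datetime(2018, 11, 6, 13, 46, tzinfo=timezone.utc).timestamp()
fake_trip = [{"data": {"ts": base + 30 * i}} for i in range(243)]

count, start, end = summarize_trip(fake_trip)
print(f"# of location measurements: {count}, start: {start}, end: {end}")
```

Printing the start and end for each generated trip makes it easy to see which calendar day every trip falls on before anything is posted to the server.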

shankari commented 5 years ago

@asiripanich got the logs. Which day should I be looking at? @atton16 any updates on using the trip generator?

atton16 commented 5 years ago

I ended up dumping data from the production server to the staging one.

I tested with the problematic user account with the reported date but cannot find any problem.

shankari commented 5 years ago

@asiripanich here's what I see from the logs

For a few days, all was fine and they were successfully able to retrieve 10 trips for 1568988000000 = 2019-09-21T00:00:00+10:00

/tmp/loggerDB.dms.withdate.log:187,1568942688.2729998,2019-09-19T18:24:48.273000-07:00,"js : while reading data for 1568901600000 from server, got nTrips = 0"
...
/tmp/loggerDB.dms.withdate.log:53674,1569126621.644,2019-09-21T21:30:21.644000-07:00,"js : while reading data for 1569074400000 from server, got nTrips = 1"
/tmp/loggerDB.dms.withdate.log:53852,1569126632.8639998,2019-09-21T21:30:32.864000-07:00,"js : while reading data for 1569074400000 from server, got nTrips = 1"
/tmp/loggerDB.dms.withdate.log:54209,1569126755.93,2019-09-21T21:32:35.930000-07:00,"js : while reading data for 1568988000000 from server, got nTrips = 10"
/tmp/loggerDB.dms.withdate.log:54510,1569126769.921,2019-09-21T21:32:49.921000-07:00,"js : while reading data for 1568988000000 from server, got nTrips = 10"
/tmp/loggerDB.dms.withdate.log:54811,1569126783.0410001,2019-09-21T21:33:03.041000-07:00,"js : while reading data for 1568988000000 from server, got nTrips = 10"
/tmp/loggerDB.dms.withdate.log:55115,1569126795.4889998,2019-09-21T21:33:15.489000-07:00,"js : while reading data for 1568988000000 from server, got nTrips = 10"
/tmp/loggerDB.dms.withdate.log:55416,1569126807.742,2019-09-21T21:33:27.742000-07:00,"js : while reading data for 1568988000000 from server, got nTrips = 10"

then, something went wrong with the connection to the server

/tmp/loggerDB.dms.withdate.log:74902,1569895041.861,2019-09-30T18:57:21.861000-07:00,"js : while reading data from server for 1569852000000 error = ""While pushing/getting from server HTTP/1.1 521 Origin Down"""
...
/tmp/loggerDB.dms.withdate.log:75820,1569895252.168,2019-09-30T19:00:52.168000-07:00,"js : while reading data from server for 1568901600000 error = ""While pushing/getting from server HTTP/1.1 521 Origin Down"""

and then it was fine again

/tmp/loggerDB.dms.withdate.log:75912,1569899248.25,2019-09-30T20:07:28.250000-07:00,"js : while reading data for 1568988000000 from server, got nTrips = 10"
...
/tmp/loggerDB.dms.withdate.log:77321,1569899558.115,2019-09-30T20:12:38.115000-07:00,"js : while reading data for 1568988000000 from server, got nTrips = 10"
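A summary like this can be pulled out of the log lines with a short script. The sketch below is not the tool used above; the regular expressions are written against the two message formats shown in these excerpts, and the +10:00 offset is taken from the conversion earlier in this comment.

```python
import re
from datetime import datetime, timedelta, timezone

USER_TZ = timezone(timedelta(hours=10))  # offset used in the conversion above

OK_RE = re.compile(r"while reading data for (\d+) from server, got nTrips = (\d+)")
ERR_RE = re.compile(r'while reading data from server for (\d+) error = "*([^"]+)')


def day_of(ms_key):
    """Convert a day key in epoch milliseconds to a local ISO date."""
    return datetime.fromtimestamp(int(ms_key) / 1000, tz=USER_TZ).date().isoformat()


def summarize(lines):
    """Return (requested day, outcome) pairs for the matching log lines."""
    results = []
    for line in lines:
        ok = OK_RE.search(line)
        if ok:
            results.append((day_of(ok.group(1)), f"nTrips={ok.group(2)}"))
            continue
        err = ERR_RE.search(line)
        if err:
            results.append((day_of(err.group(1)), f"error: {err.group(2)}"))
    return results


sample_lines = [
    r'/tmp/loggerDB.dms.withdate.log:54209,1569126755.93,2019-09-21T21:32:35.930000-07:00,"js : while reading data for 1568988000000 from server, got nTrips = 10"',
    r'/tmp/loggerDB.dms.withdate.log:74902,1569895041.861,2019-09-30T18:57:21.861000-07:00,"js : while reading data from server for 1569852000000 error = ""While pushing/getting from server HTTP/1.1 521 Origin Down"""',
]

summary = summarize(sample_lines)
for day, outcome in summary:
    print(day, outcome)
```

Grouping the outcomes by requested day shows quickly whether a given day ever returned trips or only errors.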
atton16 commented 5 years ago

I think we are looking at September 21, 2019.

shankari commented 5 years ago

Given @atton16's report in https://github.com/e-mission/e-mission-docs/issues/461#issuecomment-540043439 maybe the issue was the 521 error from the server?

The 521 error appears to be related to the server being down or unreachable...

521 Web Server Is Down The origin server has refused the connection from Cloudflare.

shankari commented 5 years ago

Is the user still seeing the problem?

shankari commented 5 years ago

@asiripanich if the user is not seeing the problem now, can we close this issue?

asiripanich commented 5 years ago

I will have to follow up with the user but we can close this for now.