e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License
15 stars 34 forks source link

Loading the trip diary is slow #226

Closed shankari closed 3 years ago

shankari commented 7 years ago

It is very slow to load the trip diary. Even after the issue with the undefined, if I pull to refresh, it takes a while, even when loading data locally. Can we do some quick profiling and figure out what's going on?

shankari commented 7 years ago

So it looks like it is about 4 secs from state change to remote call invocation, and 4 secs for the loaded dashboard. Even this is slow, but bearable.

10-20 14:47:05.380    7720-7720/? I/chromium﹕ [INFO:CONSOLE(29)] "Finished changing state from {"url":"/splash","templateUrl":"templates/splash/splash.html","controller":"SplashCtrl","name":"splash"} to {"url":"/metrics","views":{"main-metrics":{"templateUrl":"templates/main-metrics.html","controller":"MetricsCtrl"}},"name":"root.main.metrics"}", source: file:///android_asset/www/js/controllers.js (29)

..... ton of garbage collection .....

10-20 14:47:10.466    7720-7720/? I/chromium﹕ [INFO:CONSOLE(356)] "Sending data {"freq":"D","start_time":1475712000,"end_time":1476835200,"metric":""}", source: file:///android_asset/www/js/metrics.js (356)
10-20 14:47:10.505    7720-7897/? W/PluginManager﹕ THREAD WARNING: exec() call to ServerComm.pushGetJSON blocked the main thread for 24ms. Plugin should use CordovaInterface.getThreadPool().
10-20 14:47:10.505    7720-7899/? I/System.out﹕ Posting data to https://e-mission.eecs.berkeley.edu/result/metrics/timestamp
10-20 14:47:10.505    7720-7899/? I/System.out﹕ Creating a new instance of the user profile
10-20 14:47:14.083    7720-7899/? I/CommunicationHelper﹕ Got response org.apache.http.message.BasicHttpResponse@42061de0 with status HTTP/1.1 200 OK
10-20 14:47:14.426    7720-7720/? I/chromium﹕ [INFO:CONSOLE(690)] "Running calculation with 44.04314481044127 and 60.95775699192045", source: file:///android_asset/www/js/metrics.js (690)

Now we start loading the diary. Looks like we had the data in 3 secs. And then the common trips reading took 15 secs, although it actually had no data!!

10-20 14:47:20.505    7720-7720/? I/chromium﹕ [INFO:CONSOLE(29)] "Finished changing state from {"url":"/metrics","views":{"main-metrics":{"templateUrl":"templates/main-metrics.html","controller":"MetricsCtrl"}},"name":"root.main.metrics"} to {"url":"/diary","views":{"main-diary":{"templateUrl":"templates/diary/list.html","controller":"DiaryListCtrl"}},"name":"root.main.diary"}", source: file:///android_asset/www/js/controllers.js (29)
10-20 14:47:20.637    7720-7720/? I/chromium﹕ [INFO:CONSOLE(13)] "controller DiaryListCtrl called", source: file:///android_asset/www/js/diary/list.js (13)
10-20 14:47:20.645    7720-7720/? I/chromium﹕ [INFO:CONSOLE(372)] "About to show 'Reading from cache'", source: file:///android_asset/www/js/diary/services.js (372)

... some GC ...

10-20 14:47:23.364    7720-7897/? I/System.out﹕ data = [{"id":"5809136988f6631c7589c701"...
10-20 14:47:23.887    7720-7897/? W/PluginManager﹕ THREAD WARNING: exec() call to UserCache.getDocument blocked the main thread for 3240ms. Plugin should use CordovaInterface.getThreadPool().

... ...

10-20 14:47:39.145    7720-7897/? I/System.out﹕ data = {"common_trips":[],"user_id":{"$uuid":"ea59084e11d4407692523b9a29ce35e0"},"common_places":[]}
10-20 14:47:39.145    7720-7897/? W/PluginManager﹕ THREAD WARNING: exec() call to UserCache.getDocument blocked the main thread for 15232ms. Plugin should use CordovaInterface.getThreadPool().

After that, it took less than a second to hit the undefined error.

10-20 14:47:42.083    7720-7720/? I/chromium﹕ [INFO:CONSOLE(380)] "About to hide 'Reading from cache'", source: file:///android_asset/www/js/diary/services.js (380)
10-20 14:47:42.106    7720-7720/? I/chromium﹕ [INFO:CONSOLE(156)] "Got event with args {"from":"broadcast","status":"success"}", source: file:///android_asset/www/js/diary/list.js (156)

Let's look at the timeline on reload. It's the same thing - we spend 3 secs to read the current day and 15 secs to read the common trips and then less than a second to process the trips.

10-20 14:55:26.114    7720-7720/? I/chromium﹕ [INFO:CONSOLE(372)] "About to show 'Reading from cache'", source: file:///android_asset/www/js/diary/services.js (372)
10-20 14:55:29.380    7720-7897/? I/System.out﹕ data = [{"id":"5809136988f6631c7589c701"
10-20 14:55:29.801    7720-7897/? W/PluginManager﹕ THREAD WARNING: exec() call to UserCache.getDocument blocked the main thread for 3671ms. Plugin should use CordovaInterface.getThreadPool().

10-20 14:55:45.036    7720-7897/? I/System.out﹕ data = {"common_trips":[],"user_id":{"$uuid":"ea59084e11d4407692523b9a29ce35e0"},"common_places":[]}
10-20 14:55:45.051    7720-7897/? W/PluginManager﹕ THREAD WARNING: exec() call to UserCache.getDocument blocked the main thread for 15227ms. Plugin should use CordovaInterface.getThreadPool().

10-20 14:55:45.091    7720-7720/? I/chromium﹕ [INFO:CONSOLE(380)] "About to hide 'Reading from cache'", source: file:///android_asset/www/js/diary/services.js (380)
10-20 14:55:46.192    7720-7720/? I/chromium﹕ [INFO:CONSOLE(519)] "About to hide 'Processing trips'", source: file:///android_asset/www/js/diary/services.js (519)
shankari commented 7 years ago

This is pretty messed up. Our trip diary load time goes from ~ 4 secs to 15 secs because (apparently) we are waiting for the common trips to load. When they are not a huge component of our functionality. We should switch to something more similar to the dashboard - promises, separate loads, and filling in the common trip information after the load is complete instead of waiting for it.

That would also give us an easy fix for #111 - if there is any error in filling in the common information, it won't mess up the main diary load.

PatGendre commented 5 years ago

@shankari I confirm a loading time between 5 and 8 seconds, and a few users complained about it. I attach here an example with both webserver.log webserver console log extract for one trip. Plus other webserver logs, if useful. When I display the same day twice, I observe a display time of 1 second less, due to the usercache I suppose, but it would nice to get a still faster display time.

Even without diminishing the diary trip loading time, the diary UI could be improved, also, by showing in the datepicker only the dates for which the user has data. We envisaged this with Loic but he did not have time to complete it.

2019-09-12gettrips-extract.log.gz DISPLAY_DIARY-SAMPLE.2019-08-08.gz

shankari commented 4 years ago

starting with the extract first, there are four related calls

The get_complete_ts call executes pretty quickly

2019-09-12 14:20:58,437:DEBUG:140331429369600:START POST /pipeline/get_complete_ts
2019-09-12 14:20:58,718:DEBUG:140331429369600:retUUID = bd8aeff5-ec66-43b0-8962-cef1ca23b86b
2019-09-12 14:20:58,720:DEBUG:140331429369600:Returning complete_ts = 1568201632.0
2019-09-12 14:20:58,721:DEBUG:140331429369600:END POST /pipeline/get_complete_ts bd8aeff5-ec66-43b0-8962-cef1ca23b86b 0.2834904193878174

The /timeline/getTrips/ call takes ~ 2 seconds to complete on the server, most of which is database query time.

2019-09-12 14:20:58,337:DEBUG:140331637413632:START POST /timeline/getTrips/2019-09-06
2019-09-12 14:21:00,917:DEBUG:140331637413632:trip count = 3, geojson count = 2
2019-09-12 14:21:00,917:DEBUG:140331637413632:type(ret_geojson) = <class 'list'>
2019-09-12 14:21:00,917:DEBUG:140331637413632:type(ret_dict) = <class 'dict'>
2019-09-12 14:21:00,926:DEBUG:140331637413632:END POST /timeline/getTrips/2019-09-06 bd8aeff5-ec66-43b0-8962-cef1ca23b86b 2.5892791748046875

The mode_confirm and purpose_confirm also take ~ 2 seconds each.


2019-09-12 14:21:01,295:DEBUG:140331479725824:START POST /datastreams/find_entries/timestamp
2019-09-12 14:21:03,078:DEBUG:140331479725824:Found 1 messages in response to query {'user_id': UUID('bd8aeff5-ec66-43b0-8962-cef1ca23b86b'), '$or': [{'metadata.key': 'manual/mode_confirm'}], 'metadata.write_ts': {'$lte': 1568325599, '$gte': 0}}
2019-09-12 14:21:03,301:DEBUG:140331479725824:END POST /datastreams/find_entries/timestamp bd8aeff5-ec66-43b0-8962-cef1ca23b86b 1.99631667137146 
2019-09-12 14:21:01,337:DEBUG:140331629020928:START POST /datastreams/find_entries/timestamp
2019-09-12 14:21:03,101:DEBUG:140331629020928:Found 1 messages in response to query {'user_id': UUID('bd8aeff5-ec66-43b0-8962-cef1ca23b86b'), '$or': [{'metadata.key': 'manual/purpose_confirm'}], 'metadata.write_ts': {'$lte': 1568325599, '$gte': 0}}
2019-09-12 14:21:03,310:DEBUG:140331629020928:END POST /datastreams/find_entries/timestamp bd8aeff5-ec66-43b0-8962-cef1ca23b86b 1.973224401473999 

However, they start about 3 seconds after the start of this call chain 14:20:58 -> 14:21:01 so by the time they end, it is ~ 5 mins from the start of the call chain 14:20:58 -> 14:21:03.

@PatGendre around 5 secs what you saw for this call?

PatGendre commented 4 years ago

it is ~ 5 mins from the start of the call chain 14:20:58 -> 14:21:03. @PatGendre around 5 secs what you saw for this call?

Thanks again 5 seconds (not 5 minutes ;-), this is the time I get on average for seeing the trips. I tested the display today, I took ca. 8 seconds (with wifi on). This may depend on whether the trip is cached.

Also, even with the same waiting time, the display / UX could be improved. There is a message saying "reading data from server" but it is displayed say only 2 seconds not during 5 seconds, so this is a little awkward for the user, who doesn't know when the actuel day asked for will be displayed (because the previous day stays displayed until the new one is).

shankari commented 4 years ago

5 seconds (not 5 minutes ;-),

Doh!

I took ca. 8 seconds (with wifi on)

Thanks, that is good to know.

There is a message saying "reading data from server" but it is displayed say only 2 seconds not during 5 seconds,

I think all of these are symptoms of the same problem, namely that we are reading the mode_confirm and purpose_confirm objects after reading the other information for the diary, and after turning off the "reading data from server" message

shankari commented 4 years ago

Yup. https://github.com/e-mission/e-mission-phone/blob/master/www/js/diary/services.js#L1012

however, the calls to the mode_confirm and purpose_confirm don't really use the values retrieved from the server so theoretically we can make all the calls at the same time. https://github.com/e-mission/e-mission-phone/blob/master/www/js/diary/services.js#L1015

shankari commented 4 years ago

@PatGendre the change is pretty simple but would need to be tested. I am also a little busy right now 🙁 Do you have a targeted ETA?

If you would like this sooner rather than later, I can outline a solution without much testing and maybe you could apply the patch to your client and test it out?

PatGendre commented 4 years ago

@shankari It's great that you identified the cause (and solution)! I remember that the end trip confirm timing was tricky (issue 413)

There is no real target. I will see with Yann if he can apply the patch but not sure as he's on urgent tasks this week. And next week I will be out of town, so this can wait until November 4, this is not a problem.

shankari commented 4 years ago

@asiripanich @atton16 I know that you have complained about the diary loading time. @PatGendre complained about it too, and he actually sent me the logs to debug 😃 It turns out that the issue is actually the additional latency from the calls to get the mode_confirm and purpose_confirm objects.

Given that you would also like to improve this, and this is related to code that you added, would you be open to fixing and testing it? The fix is pretty simple - you just need to include the calls to getUnifiedMessagesForInterval into the first call to the server. You can either

The first option has the advantage that the Reading from server loading message will be accurate, but will require some additional unpacking while processing the result.

Based on the uploaded logs, this should approximately halve the diary access time from ~ 5 secs to ~ 3 secs.

shankari commented 4 years ago

@asiripanich @atton16 any updates on whether you can tackle this?

asiripanich commented 4 years ago

Hi @shankari I will have @atton16 look at it soon. Thanks for the info.

atton16 commented 4 years ago

@asiripanich @atton16 I know that you have complained about the diary loading time. @PatGendre complained about it too, and he actually sent me the logs to debug 😃 It turns out that the issue is actually the additional latency from the calls to get the mode_confirm and purpose_confirm objects.

Given that you would also like to improve this, and this is related to code that you added, would you be open to fixing and testing it? The fix is pretty simple - you just need to include the calls to getUnifiedMessagesForInterval into the first call to the server. You can either

  • combine them in updateToServer using Promise.all and return a single promise from it OR
  • you could just add the two getUnifiedMessagesForInterval promises to the array in the current Promise.all([tripsFromServerPromise, isProcessingCompletePromise])

The first option has the advantage that the Reading from server loading message will be accurate, but will require some additional unpacking while processing the result.

Based on the uploaded logs, this should approximately halve the diary access time from ~ 5 secs to ~ 3 secs.

As you suggested, I can see that there is a promise chain starting from www/js/diary/services.js#L995 then we got another one at www/js/diary/services.js#L1014. This is an easy fix. I can do it right away by moving the calls to the first promise.

However, it sounded like the first option has the advantage but I cannot find any updateToServer. Can you point it out so I can have a look?

atton16 commented 4 years ago

I searched for update and found updateFromServer. I traced the calls and I think you mean this one. Perhaps, I will do it here.

atton16 commented 4 years ago

Checked, I am going with option 2 since updateFromServer executes at www/js/diary/services.js#L995 anyways.

shankari commented 4 years ago

However, it sounded like the first option has the advantage but I cannot find any updateToServer. Can you point it out so I can have a look?

Yes, it was updateFromServer.

Checked, I am going with option 2 since updateFromServer executes at www/js/diary/services.js#L995 anyways.

Yes, but you could have updateFromServer return a promise chain (which is also a promise) instead of a single promise. This would ensure that the Reading from server message is consistent with actually reading data from the server. But I will accept the alternative as well.

shankari commented 3 years ago

As part of a big push to make labeling easier for the CanBikeCO project, I implemented a idea that has been in my mind for a long time. Took just 3 days to get a hacked-together version in place because I guess all the ideas were fermenting for a while. While the changes are focused around making labeling easier, I think it could serve as a replacement for the diary screen - a next-gen version if you will.

The resulting code is at: https://github.com/e-mission/e-mission-phone/tree/add_labeling_screen And https://github.com/e-mission/e-mission-server/tree/add_labeling_screen

The screenshot is below.

Unlabeled Labeled Detail
label_screen_marked_up Screenshot_1610947363 Screenshot_1610947397

The main user-visible differences are:

The specific filters in the PR are specific to the ebike project, which is why I haven't merged it to master yet. But it is hopefully a good starting point for further discussion.

The main system-level differences are:

Some back of the envelope numbers on the effect of downsampling are at: https://github.com/e-mission/e-mission-phone/commit/f33184a5bbee3cc296d6b3f422b889bc6f8484f8

The screenshots are below:

Original. Downsampled to 500 pts Downsampled to 100 points
Screenshot_1610843387 Screenshot_1610843478 Screenshot_1610849327

You can see that the downsampling to 100 points definitely smooths out some details of the trajectory. But the rough sketch is still recognizably the same trip. This may be a problem if people want to zoom in, but we could load data dynamically if/when that happens. For now I have disabled zoom.

@asiripanich @PatGendre @jf87 @ericafenyo, Any thoughts on this approach? At some point, I will probably want to deploy this to a branch that you can play with using emTripLog.

Happy to move this to a new issue if this one is too complex.

shankari commented 3 years ago

Note that the changes are spread over multiple commits. I had a hard deadline of Sunday night, so I made the changes in stages, implemented functionality first and then worked through a series of performance and usability issues later.

In particular, this could be even faster if we could skip reading the place names on the first screen or, even better, populate them on the server and store them with the object. But my daughter (who was my ad-hoc usability tester) felt that there wasn't sufficient information without the place names - it was too confusing, specially as you went back in time.

asiripanich commented 3 years ago

Hi @shankari, what a great update! I really like the idea of infinite scroll, although I think the layout should be improved to make it less overwhelming. Personally, I prefer to see maps on the daily screen, but if this is where the bottle neck is, then it should be removed.

Some suggestions for UI improvement

shankari commented 3 years ago

@asiripanich i think we should able to re-introduce the maps, but just load them lazily and with lots of downsampling. I had to implement lazy load for the place names anyway - loading them all at the same time runs up against rate limits on nominatim. So we will first display the current batch of trips, and then the maps will slowly load while the user is interacting with them.

wrt your suggestions:

PatGendre commented 3 years ago

@shankari looks really great :-) it will really be a plus to speed up display, I hope the commits will soon be merged into master.

The only possible drawback would be for a user with a few months of history, there could be a way to go directly to a particular date (maybe in the top menu). But it's really a detail.

The left "mode" button is the analysed mode, and the middle "mode" button is the user labeled mode, right?

ericafenyo commented 3 years ago

@shankari Thanks for this update. I really love the idea of striping down the location data for the list since it would help reduce the loading time. To redisplay the map for each list-item as suggested by @asiripanich, It would be great to use images instead of a real map.

For the cards, we can add some paddings to push the content away from the edges.

shankari commented 3 years ago

@PatGendre

The left "mode" button is the analysed mode, and the middle "mode" button is the user labeled mode, right?

The left "mode" button is the analysed mode and the middle "mode" button is the one that it replaced. In other words, if you didn't have the first mode available, how would you have taken the trip? This is particularly interesting for the eBike project as we calculate the energy impact of the program. It is not in master, and I don't plan to move it there.

However, I also think that it might be interesting to other groups (if they chose to include it) as part of predicting unintended effects and resistance to change. For example, let's say that a city is considering making parking more difficult. If they collect data including this field before the intervention, and a lot of replacement modes for "car, drove alone" are "same mode", they might expect that people will continue to drive for a while after the change and increase congestion. If the replacement modes are "transit", then the change is likely to be smooth because people are already open to not driving.

shankari commented 3 years ago

It would be great to use images instead of a real map.

Yeah I guess we can pre-generate small static image thumbnails on the server and make them available for download. We should confirm that these will be smaller/faster than downloading the downsampled location data, although of course, we will save on downloading map tiles and generating markers on the phone.

However, I am not going to be able to tackle the "save as image feature" until at least March. if others ware willing take it on, that would be great. You would probably want to generate the image as part of a new pipeline step, and to save it directly in the DB for retrieval. There are some examples (e.g. https://stackoverflow.com/questions/53565979/export-a-folium-map-as-a-png) of how to do this in python using folium.

shankari commented 3 years ago

Looks like folium actually has a to_png method now as well (https://github.com/python-visualization/folium/pull/1015)

PatGendre commented 3 years ago

@shankari

The left "mode" button is the analysed mode and the middle "mode" button is the one that it replaced. In other words, if you didn't have the first mode available, how would you have taken the trip? This is particularly interesting for the eBike project as we calculate the energy impact of the program.

Thanks for the explanation, it's clearer to me, and an interesting feature (although not useful currently in our projet).

It is not in master, and I don't plan to move it there.

Ok, the "replace mode" button feature shall not be in master, but maybe the optimised diary with lazy loading could be pushed to master?

shankari commented 3 years ago

Yup. I have to finish writing a report on the mini-pilot first, and I think I want to get a basic dashboard working this week. Then I can push this to master.

shankari commented 3 years ago

@PatGendre @asiripanich @ericafenyo

Server PR: https://github.com/e-mission/e-mission-server/pull/795 Phone PR (with bonus refactoring of the user input code ensure DRY): https://github.com/e-mission/e-mission-phone/pull/736

I am going to close this issue now; we can open separate issues for other improvements to the infinite scroll screen