e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

loading diary is still slow (especially when there are no data for the day!) #707

Open PatGendre opened 2 years ago

PatGendre commented 2 years ago

We still have several users in La Rochelle who find the diary display time too slow and frustrating, especially when there is no data.

I've found other issues related to this, but I'm not sure whether they were really solved, so I'm opening a new issue. https://github.com/e-mission/e-mission-docs/issues/226 https://github.com/e-mission/e-mission-docs/issues/488

1) Especially annoying is the case when there is no data for the day to be displayed: the app still takes a few seconds to query the server, just to display a blank screen. Could there be a way to "tag" the days without data so as to make the display time much shorter? (It would also be nice to display a "no data for this day" message in this case; a rough sketch of such a check is below, after point 2.)

2) When there is data, looking at the closed issues it seems that undersampling the number of points to display, so as to reduce the data transferred from server to app, would be a solution, but has it been implemented? (Not clear to me.) Also, how does the response time split between query time and data transfer time? Intuitively the data transfer time is longer than the query time, so another solution could be to send compressed data? (But I did not measure the display / response time, sorry.)
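For illustration, a cheap "does this day have any data?" check on the server could look roughly like the sketch below. This is not the actual e-mission code; the collection name and the user_id / metadata.key / enter_local_dt field names are assumptions modelled on the query that appears in the server logs later in this thread.

from pymongo import MongoClient

analysis_ts = MongoClient("localhost")["Stage_database"]["Stage_analysis_timeseries"]

def day_has_data(user_id, year, month, day):
    # count_documents(..., limit=1) lets mongod stop at the first match,
    # so an empty day returns almost immediately instead of the app
    # waiting a few seconds just to show a blank screen
    query = {
        "user_id": user_id,
        "metadata.key": "analysis/cleaned_place",   # assumed key
        "data.enter_local_dt.year": year,
        "data.enter_local_dt.month": month,
        "data.enter_local_dt.day": day,
    }
    return analysis_ts.count_documents(query, limit=1) > 0

The app could then show the "no data for this day" message immediately whenever the check returns False.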

shankari commented 2 years ago

@PatGendre I have not seen slowness on the CanBikeCO server. Can we start from there? Can you re-install the CanBikeCO app and see whether you still see slowness?

Wrt your question about "it seems that undersampling the number of points to display so as to reduce the data to transmit from server to app would be a solution, but has it been implemented": it has been implemented on the label screen, but not on the diary screen. The label screen initially pulls only trip objects, and pulls the trajectory information only when the user clicks through to the map detail.
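For context, the two-phase pattern looks roughly like this. The real phone app is JavaScript; the payload shape and the getTrajectory endpoint below are assumptions for illustration only, and the /timeline/getTrips/<day> path comes from the server logs shown later in this thread.

import requests

SERVER = "https://YOUR_EMISSION_SERVER"   # placeholder

def load_trip_list(day, user_token):
    # phase 1: lightweight trip objects for the list view, no trajectories
    resp = requests.post(f"{SERVER}/timeline/getTrips/{day}",
                         json={"user": user_token})
    resp.raise_for_status()
    return resp.json()

def load_trajectory(trip_id, user_token):
    # phase 2: full location trace, fetched only when the user opens the
    # map detail for a single trip (endpoint name is hypothetical)
    resp = requests.post(f"{SERVER}/timeline/getTrajectory/{trip_id}",
                         json={"user": user_token})
    resp.raise_for_status()
    return resp.json()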

PatGendre commented 2 years ago

@shankari thanks, I reinstalled the app and yes, the Diary display time is around 1-2 seconds, so that's OK. Maybe the longer display time could be due to the size of the database (we have possibly several months of data)?

shankari commented 2 years ago

@PatGendre Ok, so now we can compare the servers. First, when did you last update the server? Just to make sure that we are on the same baseline... CanBikeCO servers also have ~ 6 months of data.

PatGendre commented 2 years ago

The server last update was on January 27 (2022)

shankari commented 2 years ago

The CanBikeCO server also has ~ 100 users with ~ 6 months of data, so I am not sure that the amount of data is the problem. However, those users are split across multiple docker containers running on a single AWS server; maybe mongodb doesn't scale well?

>>> edb.get_uuid_db().count()
72

I've attached a log segment from the CEO staging server on a recent /timeline/getTrips/2022-01-24 call. Can you compare it against a similar segment from your server and see where the slowdown is?

single_segment_timeline_call.log.gz

PatGendre commented 2 years ago

@shankari thanks! It's a nice idea to show the user id and the data fetch delay in the END POST line :-) For today's log we have 256 requests; the median delay is 2.5s, but the maximum is 26s and more than 10% of the getTrips requests took over 5 seconds :-(

My impression is that the request delay depends on the quantity of data to send, i.e. on the network, but I'm not sure.

shankari commented 2 years ago
  1. The fetch time in the END POST should be the time spent on the server, and independent of the network.
  2. In the call that I pulled out, the delay was 0.078 secs
    2022-03-01 13:09:15,205:DEBUG:139932500743936:START POST /timeline/getTrips/2022-01-24
    2022-03-01 13:09:15,283:DEBUG:139932500743936:END POST /timeline/getTrips/2022-01-24 912ebdd6-81ae-4909-bf40-63995eac4324 0.07860851287841797

    So 2.5 secs already seems very slow.

Can you pull out the logs for a couple of the 5s+ calls? Let's see where the time is going...
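For reference, the per-request statistics above can be pulled out of the webserver log with a few lines of Python. This is only a sketch: it assumes the END POST line format shown above (user id and elapsed seconds at the end of the line), and the log file name is a placeholder.

import re
import statistics

pat = re.compile(r"END POST /timeline/getTrips/\S+ \S+ ([0-9.]+)$")

delays = []
with open("webserver.log") as f:          # placeholder path
    for line in f:
        m = pat.search(line.strip())
        if m:
            delays.append(float(m.group(1)))

if delays:
    over5 = sum(d > 5 for d in delays)
    print(f"{len(delays)} getTrips calls, median {statistics.median(delays):.2f}s, "
          f"max {max(delays):.2f}s, {over5} over 5s")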

PatGendre commented 2 years ago

@shankari thanks for your help. I extracted one 13s and one 26s log segment. I don't really see a bottleneck: the delay seems to be long simply because the log is long (241 and 398 lines), with no evident step where the server spends a lot of time; there are many server queries and also (I guess) many sqlite inserts in the app, each one taking a little time.

But it is very plausible that mongodb is the principal cause of the slowdown, as the configuration is absolutely not optimised; maybe we just need to add memory or something similar?

2diarylogs.zip

shankari commented 2 years ago

@PatGendre the database is definitely slower

*** Your database: query for cleaned_place = 85 ms

2022-03-01 09:36:31,870:DEBUG:139708365530880:finished querying values for [], count = 0
2022-03-01 09:36:31,870:DEBUG:139708365530880:In get_range_query, returning query {'data.enter_local_dt.year': {'$gte': 2022, '$lte': 2022}, 'data.enter_local_dt.month': {'$gte': 2, '$lte': 2}, 'data.enter_local_dt.day': {'$gte': 1, '$lte': 1}}
2022-03-01 09:36:31,955:DEBUG:139708365530880:finished querying values for ['analysis/cleaned_place'], count = 6

*** Our database: query for cleaned_place = 27 ms

2022-03-01 13:09:15,207:DEBUG:139932500743936:finished querying values for [], count = 0
2022-03-01 13:09:15,207:DEBUG:139932500743936:In get_range_query, returning query {'data.enter_local_dt.year': {'$gte': 2022, '$lte': 2022}, 'data.enter_local_dt.month': {'$gte': 1, '$lte': 1}, 'data.enter_local_dt.day': {'$gte': 24, '$lte': 24}}
2022-03-01 13:09:15,234:DEBUG:139932500743936:finished querying values for ['analysis/cleaned_place'], count = 0

And there are definitely lots of DB queries in the call. Need to see whether the DB queries alone can account for the slowdown.
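One way to check is to sum the gaps between consecutive log timestamps within a single START..END POST segment and look at the largest ones. A minimal sketch, assuming only the timestamp format shown above; the file name is a placeholder for one extracted segment:

from datetime import datetime

def parse_ts(line):
    # "2022-03-01 09:36:31,870:DEBUG:..." -> the first 23 chars are the timestamp
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")

with open("one_slow_call.log") as f:      # placeholder path
    lines = [l.rstrip("\n") for l in f if l.strip()]

gaps = sorted(
    ((parse_ts(cur) - parse_ts(prev)).total_seconds(), prev[:100])
    for prev, cur in zip(lines, lines[1:])
)
for secs, line in gaps[-10:]:
    # the lines after which the most time elapsed, e.g. slow DB queries
    print(f"{secs:6.3f}s after: {line}")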

shankari commented 2 years ago

"as the configuration is absolutely not optimised, maybe we just need to add memory or something similar?"

The mongodb containers on my system are running with around 2G each. So combined ~ 14 GB.

 601322 systemd+  20   0 3752308   2.0g   2688 S   0.7   6.5   4512:06 mongod
 602972 systemd+  20   0 3929548   2.3g   3852 S   0.7   7.3  16986:12 mongod
 596535 systemd+  20   0 4019264   2.3g   4628 S   0.3   7.5  19462:53 mongod
 604132 systemd+  20   0 4016748   2.3g   7640 S   0.3   7.3  80168:13 mongod
 604359 systemd+  20   0 3986592   2.3g   3256 S   0.3   7.3  20234:15 mongod
 605127 systemd+  20   0 3688736   2.0g   1420 S   0.3   6.5   2277:18 mongod
1383579 systemd+  20   0 3937280   2.2g   5812 S   0.3   7.1  41048:56 mongod

PatGendre commented 2 years ago

@shankari FYI we doubled the memory size for mongo and the response time is now better: 1.9s on average instead of 2.5s. Not spectacular, but better :-)

shankari commented 2 years ago

What is the current memory size for mongo? Just to have a sense of the sizes for others to provision...

PatGendre commented 2 years ago

I am not sure this is the exact size, but I would say 12G. The server memory has been increased to 24G, and here is what the top command gives for the mongod process:

top - 11:27:12 up 4 days,  1:14,  1 user,  load average: 0,02, 0,05, 0,00
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2,5 us,  0,3 sy,  0,0 ni, 93,9 id,  0,5 wa,  0,0 hi,  2,7 si,  0,0 st
MiB Mem :  23703,2 total,    596,4 free,  12138,8 used,  10968,0 buff/cache
MiB Swap:   8189,0 total,   7759,2 free,    429,8 used.  11131,1 avail Mem 

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                         
   506 mongodb   20   0   12,4g  10,9g  17628 S   0,7  47,1 539:19.65 mongod 

Also, in the mongo shell, I get roughly 12G as well:

 db.serverStatus().mem 
{ "bits" : 64, "resident" : 11158, "virtual" : 12683, "supported" : true }