Open PatGendre opened 2 years ago
@PatGendre I have not seen slowness on the CanBikeCO server. Can we start from there? Can you re-install the CanBikeCO app and see whether you still see slowness?
wrt your question about "it seems that undersampling the number of points to display so as to reduce the data to transmit from server to app would be a solution, but has it been implemented", it has been implemented on the label screen, but not on the diary screen. The label screen initially pulls only trip objects, and pulls the trajectory information only when the user clicks through to the map detail.
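For illustration, here is a minimal sketch of that two-phase pattern. It is not the actual e-mission code; the collection and metadata key names are assumptions, and the point is only to show trips being fetched without their trajectories, with the trajectory pulled separately on demand:

```python
# Sketch only: names below (collection, metadata keys, fields) are assumptions,
# not the real e-mission schema. Phase 1 returns trip objects without the
# per-point trajectory; phase 2 fetches the trajectory for a single trip.
import pymongo

client = pymongo.MongoClient("localhost", 27017)
ts = client.Stage_database.Stage_analysis_timeseries  # assumed collection name

def get_trips_for_day(user_id, start_ts, end_ts):
    """Phase 1: trip summaries only, so the payload stays small."""
    query = {"user_id": user_id,
             "metadata.key": "analysis/confirmed_trip",      # assumed key
             "data.start_ts": {"$gte": start_ts, "$lt": end_ts}}
    # exclude any embedded location list from the result documents
    return list(ts.find(query, {"data.locations": 0}))

def get_trajectory_for_trip(user_id, trip_start_ts, trip_end_ts):
    """Phase 2: pull trajectory points only when the user opens the map detail."""
    query = {"user_id": user_id,
             "metadata.key": "analysis/recreated_location",   # assumed key
             "data.ts": {"$gte": trip_start_ts, "$lte": trip_end_ts}}
    return list(ts.find(query).sort("data.ts", pymongo.ASCENDING))
```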
@shankari thanks, I reinstalled the app and yes, the Diary display time is around 1-2 seconds, so OK. Maybe the longer display time could be due to the size of the database (we have possibly several months of data)?
@PatGendre Ok, so now we can compare the servers. First, when did you last update the server? Just to make sure that we are on the same baseline... CanBikeCO servers also have ~ 6 months of data.
The server was last updated on January 27 (2022).
The CanBikeCO server also has ~ 100 users with ~ 6 months of data, so I am not sure that the amount of data is the problem. However, those users are split across multiple docker containers (each with its own mongodb instance) running on a single AWS server; maybe a single mongodb doesn't scale well?
>>> edb.get_uuid_db().count()
72
I've attached a log segment from the CEO staging server on a recent /timeline/getTrips/2022-01-24 call.
Can you compare it against a similar segment from your server and see where the slowdown is?
@shankari thanks! A nice idea to show the user id and data fetch delay in the END POST :-) For today's log, we have 256 requests; the median delay is 2.5s, but the maximum is 26s and more than 10% of getTrips requests are above 5 seconds :-(
My impression is that the request delay depends on the quantity of data to send, i.e. on the network, but I'm not sure.
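For reference, here is a small sketch of how such per-request statistics can be pulled out of the webserver log. It assumes the END POST line format visible in the excerpts in this thread, with the elapsed time in seconds as the last field:

```python
# Summarize /timeline/getTrips latencies from a webserver log.
# Assumes lines like:
#   ... END POST /timeline/getTrips/2022-01-24 <uuid> 0.0786...
# where the last space-separated field is the elapsed time in seconds.
import statistics
import sys

durations = []
with open(sys.argv[1]) as log:
    for line in log:
        if "END POST /timeline/getTrips" in line:
            try:
                durations.append(float(line.rsplit(" ", 1)[1]))
            except ValueError:
                pass  # END POST line without a trailing duration

if not durations:
    sys.exit("no getTrips requests found")

durations.sort()
print(f"requests:  {len(durations)}")
print(f"median:    {statistics.median(durations):.2f}s")
print(f"90th pct:  {durations[int(0.9 * len(durations))]:.2f}s")
print(f"slowest 5: {[round(d, 1) for d in durations[-5:]]}")
```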
2022-03-01 13:09:15,205:DEBUG:139932500743936:START POST /timeline/getTrips/2022-01-24
2022-03-01 13:09:15,283:DEBUG:139932500743936:END POST /timeline/getTrips/2022-01-24 912ebdd6-81ae-4909-bf40-63995eac4324 0.07860851287841797
So 2.5 secs already seems very slow.
Can you pull out the logs for a couple of the 5s+ calls? Let's see where the time is going...
@shankari thanks for your help, I extracted one 13s and one 26s log segment (2diarylogs.zip). I don't really see a bottleneck: the delay seems to be long simply because the log is long (241 and 398 lines), with no evident step where the server spent a lot of time; there are many server queries and also (I guess) many sqlite inserts in the app, each one taking a little time.
But it is very plausible that mongodb is the principal cause of the slowdown, as the configuration is absolutely not optimised; maybe we just need to add memory or something similar?
@PatGendre the database is definitely slower
*** Your database: query for cleaned_place = 85 ms
2022-03-01 09:36:31,870:DEBUG:139708365530880:finished querying values for [], count = 0
2022-03-01 09:36:31,870:DEBUG:139708365530880:In get_range_query, returning query {'data.enter_local_dt.year': {'$gte': 2022, '$lte': 2022}, 'data.enter_local_dt.month': {'$gte': 2, '$lte': 2}, 'data.enter_local_dt.day': {'$gte': 1, '$lte': 1}}
2022-03-01 09:36:31,955:DEBUG:139708365530880:finished querying values for ['analysis/cleaned_place'], count = 6
*** Our database: query for cleaned_place = 27 ms
2022-03-01 13:09:15,207:DEBUG:139932500743936:finished querying values for [], count = 0
2022-03-01 13:09:15,207:DEBUG:139932500743936:In get_range_query, returning query {'data.enter_local_dt.year': {'$gte': 2022, '$lte': 2022}, 'data.enter_local_dt.month': {'$gte': 1, '$lte': 1}, 'data.enter_local_dt.day': {'$gte': 24, '$lte': 24}}
2022-03-01 13:09:15,234:DEBUG:139932500743936:finished querying values for ['analysis/cleaned_place'], count = 0
And there are definitely lots of DB queries in the call. Need to see whether the DB queries alone can account for the slowdown.
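One thing that may be worth checking on the slower deployment (only a suggestion; I don't know whether these indexes already exist on either server) is whether the enter_local_dt fields used by get_range_query are backed by an index. A pymongo sketch, with an assumed collection name:

```python
# Sketch: check whether the cleaned_place query uses an index, and add one if not.
# The collection name is an assumption; the query mirrors the log excerpt above.
import pymongo

client = pymongo.MongoClient("localhost", 27017)
coll = client.Stage_database.Stage_analysis_timeseries  # assumed collection name

query = {"metadata.key": "analysis/cleaned_place",
         "data.enter_local_dt.year": {"$gte": 2022, "$lte": 2022},
         "data.enter_local_dt.month": {"$gte": 2, "$lte": 2},
         "data.enter_local_dt.day": {"$gte": 1, "$lte": 1}}

# explain() shows whether mongod falls back to a full collection scan (COLLSCAN)
plan = coll.find(query).explain()
print(plan["queryPlanner"]["winningPlan"])

# if it is a COLLSCAN, a compound index on the queried fields should help
coll.create_index([("metadata.key", pymongo.ASCENDING),
                   ("data.enter_local_dt.year", pymongo.ASCENDING),
                   ("data.enter_local_dt.month", pymongo.ASCENDING),
                   ("data.enter_local_dt.day", pymongo.ASCENDING)])
```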
> But it is very plausible that mongodb is the principal cause of the slowdown, as the configuration is absolutely not optimised; maybe we just need to add memory or something similar?
The mongodb containers on my system are running with around 2G each. So combined ~ 14 GB.
601322 systemd+ 20 0 3752308 2.0g 2688 S 0.7 6.5 4512:06 mongod
602972 systemd+ 20 0 3929548 2.3g 3852 S 0.7 7.3 16986:12 mongod
596535 systemd+ 20 0 4019264 2.3g 4628 S 0.3 7.5 19462:53 mongod
604132 systemd+ 20 0 4016748 2.3g 7640 S 0.3 7.3 80168:13 mongod
604359 systemd+ 20 0 3986592 2.3g 3256 S 0.3 7.3 20234:15 mongod
605127 systemd+ 20 0 3688736 2.0g 1420 S 0.3 6.5 2277:18 mongod
1383579 systemd+ 20 0 3937280 2.2g 5812 S 0.3 7.1 41048:56 mongod
@shankari FYI we doubled the memory size for mongo and the response time is now better: 1.9s instead of 2.5s on average. Not spectacular, but better :-)
What is the current memory size for mongo? Just to have a sense of the sizes for others to provision...
I am not sure this is the exact size, but I would say 12G; the server memory has been increased to 24G, and here is what the top command gives for the mongod process:
top - 11:27:12 up 4 days, 1:14, 1 user, load average: 0,02, 0,05, 0,00
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2,5 us, 0,3 sy, 0,0 ni, 93,9 id, 0,5 wa, 0,0 hi, 2,7 si, 0,0 st
MiB Mem : 23703,2 total, 596,4 free, 12138,8 used, 10968,0 buff/cache
MiB Swap: 8189,0 total, 7759,2 free, 429,8 used. 11131,1 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
506 mongodb 20 0 12,4g 10,9g 17628 S 0,7 47,1 539:19.65 mongod
Also, in the mongo shell, I obtain 12G as well:
db.serverStatus().mem
{ "bits" : 64, "resident" : 11158, "virtual" : 12683, "supported" : true }
We still have several users in La Rochelle who find the diary display time too slow and frustrating, especially when there is no data.
I've found other issues related to this, but I am not sure whether they are really solved, so I am opening a new issue. https://github.com/e-mission/e-mission-docs/issues/226 https://github.com/e-mission/e-mission-docs/issues/488
1) Especially annoying is the case when there is no data for the day to be displayed: the app still takes a few seconds to query the server, just to display a blank screen. Could there be a way to "tag" the days without data so as to make the display time much shorter? (a rough sketch of this idea is after this list) And it'd be nice to display a "no data for this day" message in this case.
2) When there are data, looking at the closed issues, it seems that undersampling the number of points to display, so as to reduce the data to transmit from server to app, would be a solution, but has it been implemented? (not clear to me) Also, how does the response time split between query time and data transfer time? Intuitively the data transfer time is longer than the query time, so another solution could be to send compressed data? (but I did not measure the display / response time, sorry)
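For point 1, here is a rough sketch of the "tag the empty days" idea. The helper collection and function names are hypothetical, not the current e-mission API; the intent is just to show that a per-user cache of days with trips lets the server answer empty days immediately, so the app can show a "no data for this day" message without waiting:

```python
# Hypothetical sketch: remember which days have trips so that requests for
# empty days return immediately. Collection and field names are assumptions.
import pymongo

client = pymongo.MongoClient("localhost", 27017)
db = client.Stage_database
ts = db.Stage_analysis_timeseries      # assumed analysis collection
day_cache = db.Stage_days_with_trips   # hypothetical per-user cache collection

def refresh_day_cache(user_id):
    """Run at the end of the analysis pipeline for this user."""
    days = set()
    trips = ts.find({"user_id": user_id, "metadata.key": "analysis/confirmed_trip"},
                    {"data.start_local_dt": 1})
    for trip in trips:
        dt = trip["data"]["start_local_dt"]
        days.add("%04d-%02d-%02d" % (dt["year"], dt["month"], dt["day"]))
    day_cache.replace_one({"user_id": user_id},
                          {"user_id": user_id, "days": sorted(days)},
                          upsert=True)

def get_trips_for_day(user_id, day_str):
    """Fast path: skip all the heavy queries when the day has no trips."""
    entry = day_cache.find_one({"user_id": user_id})
    if entry is None or day_str not in entry["days"]:
        return {"timeline": [], "no_data": True}  # app shows "no data for this day"
    return run_full_timeline_query(user_id, day_str)

def run_full_timeline_query(user_id, day_str):
    # placeholder standing in for the existing getTrips logic
    raise NotImplementedError
```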