Closed SanderDevisscher closed 8 months ago
Is this for both the private and the public app? I think there are multiple options to speed up the startup of the app:
- `minimumSeatsAvailable: 1`. This is a new feature that is available on your ECS deployment. It basically prepares a container that has all the data loaded and 'waits' for the user to take it. This should be straightforward for the public app but will need some modification for the private app. The downside is that you will permanently have one container running and thus pay for it.
- Profiling with `profvis`. Given these results, we might consider different file storage (e.g. using `arrow`), different data processing (e.g. using `data.table`), preprocessing the data and saving summarized data on S3 (e.g. for each WBE), or some other changes.
- `arrow`: I have used `arrow` before for rectangular data files, but not for spatial data or in the context of S3, so I can't predict how much time we will gain. It is certainly worth a try given https://arrow.apache.org/docs/r/articles/fs.html and https://cran.r-project.org/web/packages/sfarrow but I would first want to identify why it is slow.
OK, so you are saying that switching to the "new" hosting technology currently used by the alien species portal and setting `minimumSeatsAvailable: 1` could speed up the app. I would like to test this with the alien species portal first, which is seriously slow atm. @TheJenne18 also mentioned this when talking about ECS.
In case the implementation of ECS for reporting grofwild doesn't go as smoothly as anticipated, or gets delayed as such things tend to, I suggest walking a parallel path of code enhancement. Can you make an attempt to detect what is currently a bottleneck?
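One way to look for such a bottleneck is to profile the slow steps with `profvis`. Below is a minimal sketch, not the app's actual code: `load_data` and `summarise_data` are hypothetical stand-ins for whatever the app does at startup, and the data is randomly generated.

```r
library(profvis)

# Hypothetical stand-in for the app's data loading step
load_data <- function(n = 2e6) {
  data.frame(
    id = sample.int(1000, n, replace = TRUE),
    value = runif(n)
  )
}

# Hypothetical stand-in for the app's data processing step
summarise_data <- function(df) {
  aggregate(value ~ id, data = df, FUN = mean)
}

# Profile both steps; the result records time and memory per call
p <- profvis({
  df <- load_data()
  res <- summarise_data(df)
})

# In an interactive session, print(p) opens the flame graph viewer
```

For a Shiny app, the same idea should apply by wrapping the call that launches the app, e.g. `profvis({ shiny::runApp() })`, clicking through the slow parts and then stopping the app.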
- Can you make an attempt to detect what is currently a bottleneck?
Sure, I can.
I had a quick look using `profvis` for the public app. Originally, the time before results are displayed is 30s.
With `arrow` for eco and geo data: gain of 13s. Loading the data is not much faster; the gain comes from avoiding gc() during data processing, because the data is not held in memory. This is a serious gain, so I'll investigate how to do this for the spatial data (shapes and schade data).
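To illustrate the idea of keeping the data out of memory, here is a rough sketch with `arrow` and `dplyr`. It is not the app's code: the `eco` data frame, its column names and the local temporary file are all made up for the example, and spatial data is not covered.

```r
library(arrow)
library(dplyr)

# Hypothetical stand-in for the eco data (columns are assumptions)
eco <- data.frame(
  year = rep(2015:2020, each = 1000),
  species = rep(c("wild boar", "roe deer"), times = 3000),
  count = 1L
)

# Write once to a Parquet file (here a temp file; could also live on S3)
path <- tempfile(fileext = ".parquet")
write_parquet(eco, path)

# Open lazily: no rows are read into R memory at this point
ds <- open_dataset(path)

# Filter and summarise inside arrow; only the small result is collected
summary_tbl <- ds %>%
  filter(year >= 2018) %>%
  group_by(species) %>%
  summarise(total = sum(count)) %>%
  collect()
```

Only `summary_tbl` ends up as a regular data frame in R; the full table stays on disk, which is where the reduced garbage collection would come from.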
The lack of speed when starting the app keeps causing displeasure. Do you have any experience with the `arrow` package? Maybe it can speed up the startup?
What do you think?