This PR requires changes on the Lifewatch RStudio Server before it can work there. I want users on the RStudio Server to be able to use the most recent etn version. If this PR is merged into v2.3.1, that version will require an update of the R version on the Lifewatch server.
This PR serves as a jumping-off point for testing by my co-developers. I'd be very grateful for your input!
Fixes:
I now use Apache Arrow instead of RDS for file transfers. Arrow uses lz4 instead of gzip compression and is also chunked:
less memory usage for both client and server
faster compression/decompression
because we need less memory, we can get away with larger objects
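To make the comparison concrete, here is a minimal sketch of the two serialisation paths on a made-up table. The `detections` data frame, its column names and the chunk size are illustrative assumptions, not what the API actually returns, and the PR itself may use a different Arrow writer (for example an IPC stream rather than a Feather file). It also assumes your arrow build includes the lz4 codec.

```r
library(arrow)

set.seed(1)
# Illustrative data only: stands in for a detections table returned by the API.
detections <- data.frame(
  detection_id = seq_len(1e5),
  tag_serial_number = sample(c("A69-1", "A69-2", "A69-3"), 1e5, replace = TRUE),
  date_time = as.POSIXct("2020-01-01", tz = "UTC") + seq_len(1e5)
)

rds_path <- tempfile(fileext = ".rds")
arrow_path <- tempfile(fileext = ".feather")

# RDS: the whole object is serialised and gzip-compressed as one piece.
saveRDS(detections, rds_path, compress = "gzip")

# Arrow (Feather V2): written in chunks of `chunk_size` rows, compressed with lz4.
write_feather(detections, arrow_path, compression = "lz4", chunk_size = 10000L)

# Read both back into plain data frames.
from_rds <- readRDS(rds_path)
from_arrow <- as.data.frame(read_feather(arrow_path))

# Compare the size on disk of both files (in bytes).
file.size(rds_path)
file.size(arrow_path)
```

Which file ends up smaller depends on the data: lz4 trades some compression ratio for speed, so the gain is expected mainly in time and memory rather than in bytes transferred.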
Talking points:
Doesn't actually allow you to load detections for 2013_albertkanaal: this crashes before serialisation on the server side, so I can't fix it on the client side.
Is it actually faster for you?
You can test memory usage by looking at your resource manager, or via something like bench::mark()
Is the arrow dependency worth it? RDS will become more and more problematic with larger and larger datasets (especially multiple animal_project_codes)
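For the speed and memory questions above, a rough sketch of what a `bench::mark()` comparison could look like. The two `roundtrip_*()` helpers and the test data are hypothetical and only exercise local file round trips, not the actual API calls, so treat the numbers as indicative at best. As far as I know, `mem_alloc` only counts allocations made through R itself, so memory that Arrow allocates in its C++ layer may not show up there; the resource manager remains the more honest measure.

```r
library(bench)
library(arrow)

set.seed(1)
# Illustrative data only.
df <- data.frame(id = seq_len(1e5), value = rnorm(1e5))

# Hypothetical helper: write to RDS and read back.
roundtrip_rds <- function(x) {
  path <- tempfile(fileext = ".rds")
  on.exit(unlink(path))
  saveRDS(x, path)
  readRDS(path)
}

# Hypothetical helper: write to an lz4-compressed Feather file and read back.
roundtrip_arrow <- function(x) {
  path <- tempfile(fileext = ".feather")
  on.exit(unlink(path))
  write_feather(x, path, compression = "lz4")
  as.data.frame(read_feather(path))
}

results <- bench::mark(
  rds = roundtrip_rds(df),
  arrow = roundtrip_arrow(df),
  check = FALSE,   # the two results differ in attributes, so skip the equality check
  iterations = 5
)

# Median run time and R-side memory allocation per approach.
results[, c("expression", "median", "mem_alloc")]
```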
Alternative approach
The API result object is currently passed as a single binary stream. Instead, I could split it into multiple files hosted by OpenCPU, fetch those individually, and combine them. This is more complex and will be harder to debug if something goes wrong, but it would allow us to keep using RDS for now. I haven't tested this yet, as it would disrupt the current v2.3.0 beta. There is a dev environment on the horizon that would allow tests like this in the future.
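A minimal, untested-against-the-API sketch of the split-and-combine idea in base R. Both helpers (`write_rds_chunks()`, `read_rds_chunks()`) are hypothetical names, and the files are written to and read from a local directory here; in the real setup they would be hosted by OpenCPU and downloaded one by one. It assumes the result is a data frame with at least one row.

```r
# Hypothetical helper: write `x` as several RDS files of at most `chunk_rows` rows each.
write_rds_chunks <- function(x, dir, chunk_rows = 10000L) {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  # Assign each row to a chunk: rows 1..chunk_rows go to chunk 1, and so on.
  chunk_id <- ceiling(seq_len(nrow(x)) / chunk_rows)
  chunks <- split(x, chunk_id)
  paths <- file.path(dir, sprintf("chunk_%04d.rds", seq_along(chunks)))
  for (i in seq_along(chunks)) {
    saveRDS(chunks[[i]], paths[i])
  }
  paths
}

# Hypothetical helper: read the chunks back in order and bind them into one data frame.
read_rds_chunks <- function(paths) {
  parts <- lapply(paths, readRDS)
  combined <- do.call(rbind, parts)
  rownames(combined) <- NULL
  combined
}

# Usage on illustrative data: 25 rows in chunks of 10 gives 3 files.
x <- data.frame(id = 1:25, value = letters[rep(1:5, 5)])
paths <- write_rds_chunks(x, tempfile("chunks"), chunk_rows = 10L)
combined <- read_rds_chunks(paths)
```

Each chunk is still gzip-compressed RDS, so this limits the peak size of any single transfer but does not change the per-file compression cost.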
@sannegovaert, @peterdesmet, could you have a look at this when you have time? You'll have to try it locally, as the RStudio Server doesn't support R 4.0 or Apache Arrow at the moment.
329 serves as an anchor for an upgrade of the R version on the Lifewatch RStudio Server. This will open up many more recent R package versions to the users on the server, which will become more and more relevant as tidyverse drops support for R versions lower than 4.