rivm-syso / Analyse-Together

A tool for analysis of air quality data, measured by low cost sensors
GNU General Public License v3.0

Make downloads more efficient #340

Open jspijker opened 7 months ago

jspijker commented 7 months ago

In the current setup, multiple instances of the application might download the same data. Create some form of communication between processes, and prevent data requests from being blocked by crashed downloads.

See: https://github.com/rivm-syso/Analyse-Together/blob/79f9ff9f0a6cb07a41b1eb505622bfb324b344b9/scripts/queue_manager.R#L83C3-L84C75
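
One way to let processes coordinate without blocking forever on a crashed download is a claim with an expiry time (a lease). The sketch below is in Python rather than the project's R, and it assumes a shared SQLite file as the coordination store. The table name `claims` and the helpers `init_claims`, `try_claim` and `release` are hypothetical and not part of Analyse-Together or ATdatabase.

```python
import sqlite3
import time


def init_claims(conn):
    # Hypothetical table: one row per station/range key that is being downloaded.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS claims ("
        "key TEXT PRIMARY KEY, claimed_at REAL NOT NULL)"
    )
    conn.commit()


def try_claim(conn, key, ttl_seconds=600.0, now=None):
    """Return True if this process may download `key`, False if another holds it."""
    now = time.time() if now is None else now
    with conn:  # single transaction: expire a stale claim, then try to insert ours
        conn.execute(
            "DELETE FROM claims WHERE key = ? AND claimed_at < ?",
            (key, now - ttl_seconds),
        )
        cur = conn.execute(
            "INSERT OR IGNORE INTO claims (key, claimed_at) VALUES (?, ?)",
            (key, now),
        )
    # rowcount is 1 when the row was inserted, 0 when the insert was ignored.
    return cur.rowcount == 1


def release(conn, key):
    # Call after a successful download so the key can be claimed again.
    with conn:
        conn.execute("DELETE FROM claims WHERE key = ?", (key,))
```

A worker would call `try_claim` before downloading and `release` afterwards. If a worker crashes, its claim simply expires after `ttl_seconds`, so the request is not blocked indefinitely.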

jspijker commented 2 months ago

This is hard to prevent, since identical station/range combinations are allowed in the queue. A simple solution is to randomize the station list to lower the chance that the same station/range is downloaded twice.
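
The randomization idea can be sketched as follows, in Python rather than the project's R. The function name `shuffled_jobs` and the job representation are assumptions for illustration. Each process shuffles its own copy of the list, so two processes are less likely to reach the same station/range at the same moment, although duplicates remain possible.

```python
import random


def shuffled_jobs(jobs, rng=None):
    """Return a shuffled copy of `jobs`; the input list is left untouched."""
    if rng is None:
        rng = random.Random()  # independently seeded per process
    jobs = list(jobs)
    rng.shuffle(jobs)
    return jobs
```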

jspijker commented 2 months ago

The following error may be related to this:

Error in ATdatabase::add_doc(type = "data_req_done", ref = job_id, doc = j_done,  : 
  Error add_doc: doc exists and overwrite is FALSE

jspijker commented 2 months ago

This is only relevant for LML/KNMI stations, since they can occur multiple times in a data request. Individual sensors should be unique. This can be optimised by first getting all LML/KNMI stations within a data request, and then the sensors.

LML and KNMI stations are also downloaded individually, but this doesn't work because the dl_station function is aimed at sensors, which results in empty streams.
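
The ordering proposed in this comment (unique LML/KNMI stations first, then sensors) could look roughly like this. It is a Python sketch, not the project's R code. The tuple layout `(kind, station_id, time_range)` and the kind labels are hypothetical.

```python
REFERENCE_KINDS = {"LML", "KNMI"}  # assumed labels for reference stations


def order_downloads(requests):
    """Put de-duplicated LML/KNMI requests first, followed by sensor requests.

    `requests` is an iterable of (kind, station_id, time_range) tuples.
    """
    reference, sensors = [], []
    seen = set()
    for req in requests:
        if req[0] in REFERENCE_KINDS:
            if req not in seen:  # reference stations may repeat within a request
                seen.add(req)
                reference.append(req)
        else:
            sensors.append(req)  # sensors are expected to be unique already
    return reference + sensors
```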

jspijker commented 1 month ago

The error occurred again. It happens under heavy load.

Listening on http://0.0.0.0:3838
Error in ATdatabase::add_doc(type = "data_req_done", ref = job_id, doc = j_done, :
Error add_doc: doc exists and overwrite is FALSE
Execution halted

jspijker commented 5 days ago

Implement the solution from #392 in the dev branch, add it to the queue_manager, and make it independent of the app.