KWB-R / fhpredict

R Package for the Project Flusshygiene
https://kwb-r.github.io/fhpredict
MIT License

Request Timeouts #27

Closed: ff6347 closed this issue 4 years ago

ff6347 commented 5 years ago

This is a pretty serious problem we are facing.

I started a POST request to provide_rain_data_for_bathing_spot/json with the body

{
    "spot_id": 41,
    "user_id": 5
}

This should work fine. However, because the referenced bathing spot has 166 measurements, the request takes a long time and we get a timeout. It is possible that the function still does its job, but we will only know from the result in the DB.

This happens from Postman.
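
For reference, here is a minimal sketch of the same request made from R with httr. The base URL is a placeholder and the /ocpu/library/... path is an assumption based on the usual OpenCPU API layout; only the function name and the JSON body come from this issue.

library(httr)

# Placeholder host/port -- adjust to the actual OpenCPU server (assumption)
base_url <- "http://localhost:5656"

response <- POST(
  url = paste0(
    base_url,
    "/ocpu/library/fhpredict/R/provide_rain_data_for_bathing_spot/json"
  ),
  body = list(spot_id = 41, user_id = 5),
  encode = "json"
)

status_code(response)  # a server-side timeout typically surfaces here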

I'll run a test from the frontend as well, but I guess it will be the same or even worse. Currently I send an abort signal on unmount of the components that make the request (which is best practice).

My first search on the topic gives me this Stack Overflow result, which is not promising 😿

I'll investigate a little further and will give feedback here.

hsonne commented 5 years ago

This function still downloads rain data for all days between the first and the last measurement day. I am currently changing this so that only rain data related to the actual measurement days (and up to five days before each) are loaded:

n_days_all             166
n_days_bathing_season  107
n_days_in_5day_ranges  379
n_radolan_files_new    379
n_radolan_files_old    996

In the example, there are 166 days for which measurements are available (n_days_all). From these days, only 107 are in the bathing season (n_days_bathing_season). The model requires rain data for these days and for the 5-day time periods before each of these days. This results in a total of 379 days for which rain data are required (n_days_in_5day_ranges). With the new approach, rain data will be loaded from exactly as many Radolan files (n_radolan_files_new). With the old approach, 996 Radolan files would have been downloaded and read (n_radolan_files_old).

The number of files to be downloaded and read will be reduced by 62 %. Hope this helps!
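
A minimal sketch of how such a day set can be derived, using made-up example dates (an illustration only, not the package's actual implementation):

# Made-up example dates standing in for the real measurement days
measurement_days <- as.Date(c("2016-05-02", "2016-05-09", "2016-05-17"))

# Each measurement day plus the five days before it, duplicates removed
days_in_5day_ranges <- sort(unique(do.call(c, lapply(
  measurement_days, function(day) seq(day - 5L, day, by = "day")
))))

length(days_in_5day_ranges)  # one Radolan file is needed per day in this set

# Reduction reported above: 379 instead of 996 files
1 - 379 / 996  # ~0.62, i.e. about 62 % fewer files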

ff6347 commented 5 years ago

It's still 568.5 MB of data to transfer. We really need to be more efficient, but that is another issue.
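
For a rough sense of scale, derived purely from the numbers quoted in this thread (not measured):

568.5 / 379          # ~1.5 MB transferred per Radolan file
996 * (568.5 / 379)  # ~1.5 GB that the old approach would have transferred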

I'm working on an additional service we need to bring in that works as a broker between the frontend and the opencpu-api. I will configure it so that it waits until the calculation is ready and then sends a message to the frontend via WebSockets.

hsonne commented 5 years ago

I made the following improvements; they are on the dev branch right now:

Your example call above was working on my personal computer:

>   system.time(fhpredict::provide_rain_data_for_bathing_spot(
+     user_id = 5, spot_id = 41, sampling_time = "1050"
+   ))
Getting URLs to files between 2011-04-28 and 2011-09-27 ... ok. (1.21s) 
Getting URLs to files between 2016-04-28 and 2016-09-28 ... ok. (1.14s) 
Getting URLs to files between 2018-04-27 and 2018-09-17 ... ok. (1.17s) 
Reading and cropping from raa01-sf_10000-1104281050-dwd---bin (1/379)...
Reading and cropping from raa01-sf_10000-1104291050-dwd---bin (2/379)...
Reading and cropping from raa01-sf_10000-1104301050-dwd---bin (3/379)...
[...]
Reading and cropping from raa01-sf_10000-1809151050-dwd---bin (377/379)...
Reading and cropping from raa01-sf_10000-1809161050-dwd---bin (378/379)...
Reading and cropping from raa01-sf_10000-1809171050-dwd---bin (379/379)...
Reading rain data from database ... ok. (4.35s) 
Converting time columns from text to POSIXct ... ok. (0.02s) 
Deleting rain data point with id 1729 ... ok. (0.10s) 
Deleting rain data point with id 1730 ... ok. (0.10s) 
Deleting rain data point with id 1731 ... ok. (0.10s) 
[...]
Deleting rain data point with id 2105 ... ok. (0.10s) 
Deleting rain data point with id 2106 ... ok. (0.08s) 
Deleting rain data point with id 2107 ... ok. (0.10s) 
A rain data record with id = 2108 has been inserted.
A rain data record with id = 2109 has been inserted.
A rain data record with id = 2110 has been inserted.
[...]
A rain data record with id = 2484 has been inserted.
A rain data record with id = 2485 has been inserted.
A rain data record with id = 2486 has been inserted.
   user  system elapsed 
 333.85   10.22  519.75 

ff6347 commented 5 years ago

Your example call above was working on my personal computer:

Umm, I don't get it. When making the call directly to the R package, it should work fine. When making the call through the opencpu-api, it will fail due to the 90 s timeout of the Apache server the package is running on. I'm currently writing a middle layer that sits between the frontend and the opencpu-api and handles long-running calls for us. It should be integrated today.
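
Putting the timing shown above next to that limit (plain arithmetic on the numbers from this thread):

519.75 / 90  # the elapsed time is roughly 5-6 times the 90 s Apache timeout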

hsonne commented 5 years ago

The call was crashing even on my computer due to lack of memory.

hsonne commented 5 years ago

There is a new release v0.2.0 on the master branch.

In this release, the function provide_rain_data_for_bathing_spot() does not import all rain data at once but in "blocks" of 10 Radolan files (this can be modified using the blocksize argument). When run for the first time, the function determines the metadata required to do the Radolan import (blocks of URLs to Radolan files, polygon coordinates, existing rain data in the database) and returns this information in a control object. This object can then be passed as the only argument to further calls of the function. Each time the function is called, one block of data is imported into the database and the element remaining, which contains the number of remaining blocks to be downloaded, is decreased by one. This makes it possible to run provide_rain_data_for_bathing_spot() in a loop, as in the following:

# First call: determine the metadata, import the first block and return the
# control object
control <- fhpredict::provide_rain_data_for_bathing_spot(user_id = 5, spot_id = 41)

# Further calls: import one block per call until no blocks remain
while (control$remaining > 0) {
  control <- fhpredict::provide_rain_data_for_bathing_spot(control = control)
}

I hope that this pattern can be applied in the JavaScript code that is run on the frontend. Using the default blocksize of 10 (Radolan files to import), each call of the function finishes after about 15 seconds.
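
Under the figures from this thread (379 files, the default blocksize of 10, about 15 seconds per call), the loop above would come out at roughly:

n_files   <- 379
blocksize <- 10
n_calls   <- ceiling(n_files / blocksize)  # 38 calls
n_calls * 15 / 60                          # roughly 9-10 minutes in total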

ff6347 commented 5 years ago

I hope that this pattern can be applied in the JavaScript code that is run on the frontend. Using the default blocksize of 10 (Radolan files to import), each call of the function finishes after about 15 seconds.

That would mean I'll have to loop the call until remaining in the response is 0, right?

I'm currently writing a middle layer that sits between the frontend and the opencpu-api and handles long-running calls for us. It should be integrated today.

This is nearly done, so we can have long-running processes. We should have a call today.