nevrome / covid19germany

R package - Load, visualise and analyse daily updated data on the COVID-19 outbreak in Germany

get_RKI_timeseries() results in error (downloaded length != reported length) #40

Closed wulms closed 3 years ago

wulms commented 3 years ago

Hello again,

When I use the get_RKI_timeseries() function, the download aborts before the complete data has been downloaded. It stops at different amounts of data (e.g. 84 MB, 102 MB or 127 MB).

> covid19germany::get_RKI_timeseries()
Downloading file...
trying URL 'https://www.arcgis.com/sharing/rest/content/items/f10774f1c63e40168479a1feb6c7ca74/data'
Content type 'application/octet-stream' length 150136267 bytes (143.2 MB)
============================================
downloaded 127.6 MB

Error in utils::download.file(url, naked_download_file) : 
  download from 'https://www.arcgis.com/sharing/rest/content/items/f10774f1c63e40168479a1feb6c7ca74/data' failed
In addition: Warning messages:
1: In utils::download.file(url, naked_download_file) :
  downloaded length 133778752 != reported length 150136267
2: In utils::download.file(url, naked_download_file) :
  URL 'https://ago-item-storage.s3.us-east-1.amazonaws.com/f10774f1c63e40168479a1feb6c7ca74/RKI_COVID19.csv?X-Amz-Security-Token=IQoJb3JpZ2luX2VjEAgaCXVzLWVhc3QtMSJHMEUCIQDnGV5KK9AZZocpoUxrFCUKmbyATslUP1zImm2qyB55lAIgGiaTSEOtFw1n0PjsXoiZ8YwNAyGCAplG%2FTEshKfmYsoqvQMI0f%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FARAAGgw2MDQ3NTgxMDI2NjUiDC%2BuAksiQaJX%2FAz5ySqRA3Wf2syrSdL4409XHwCJHK%2BvTkhsL8XwZUY3BjxT82OtjpCXjEZRKqkZFhLLLNlMBoSIlURefK5su46nqdu1%2Fa0AaNAWWS7Yl3pNYGrRCwd%2BqZ2oKFPjcrUciRbIW23oRmuoZ4v%2BTP3I2yqA85enMO1CyoplzLtAWGz1JzD%2Fjd4MWjzoOyI0sJtVsuv8OW2J3%2F8aTqxoDvLcRtSE0Qlb5mGIjOTvYY%2FneK6QnZ58Fywr15DxWluIDU%2BpXOhLWWcX%2F%2B4Y3Zp3EuaijRNm%2BBZaeB1h2%2BKUQebIH%2BubXu2c0MxIVHJGc19bGLFzLF66cFk7CUfUNuvX1Qrea7AzezrUPkGmhmdw%2FVJGzet8%2FtMpYirbI%2BdZrXfCJZDrz%2FXjSvj9OJ2oaHLpRJ1KU83eV4U00U1XUrWp5VqyZbA5yB5WXrKH2W6znfyPJyvksCFKtW7FH0yRTEgL75RhgEOeqWYpumyE5PaEiQ2lAWeBJ52fJoDu0Ta1I5IMiqfnZQK68XOzq1gVZP379j6b%2FcBKp2gIfy5iMO%2FI%2Bv8FOusBXwuf8q9jWK1yTJzWq1TGqTDLLi7h4OBq2AEvy%2B5k4supOFQj7PXHh0ch5K18LwxWh [... truncated]

Is this a known error caused by limited ArcGIS server capacity?

Best, Niklas

nevrome commented 3 years ago

Yes - that seems to be becoming an issue. At the institute I can download it without any problems, but at home it fails for me as well. Maybe it's because we now use the more reliable, alternative download link by default?

Please try the download with the main url:

get_RKI_timeseries(url = "https://opendata.arcgis.com/datasets/dd4580c810204019a7b8eb3e0b329dd6_0.csv")

That gives me an identical table (at the moment!).

wulms commented 3 years ago

Thank you for the link. I can download it with my browser quite easily, but not with R (timeout after 60 seconds).

nevrome commented 3 years ago

Hm. Could you check if any combination of url = "https://www.arcgis.com/sharing/rest/content/items/f10774f1c63e40168479a1feb6c7ca74/data" or url = "https://opendata.arcgis.com/datasets/dd4580c810204019a7b8eb3e0b329dd6_0.csv" and options for the method argument in utils::download.file() works for you?

Maybe there is even a more reliable file download function for R beyond utils::download.file()?
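One way to run the check asked for above is to loop over every URL/method pair and record which ones succeed. This is only a rough sketch: `try_download_combinations()` is a hypothetical helper, not part of covid19germany, and the method names in the usage comment ("libcurl", "curl", "wget") are assumptions, since which methods are available depends on the platform and the installed tools.

```r
# Hypothetical helper (not part of covid19germany): try each URL with each
# download method and report which combinations complete without a problem.
try_download_combinations <- function(urls, methods,
                                      download_fun = utils::download.file) {
  combos <- expand.grid(url = urls, method = methods, stringsAsFactors = FALSE)
  combos$ok <- FALSE
  combos$message <- NA_character_

  for (i in seq_len(nrow(combos))) {
    dest <- tempfile(fileext = ".csv")
    result <- tryCatch(
      {
        status <- download_fun(combos$url[i], dest,
                               method = combos$method[i], quiet = TRUE)
        # utils::download.file() returns 0 on success
        if (identical(as.integer(status), 0L)) "ok"
        else paste("non-zero status:", status)
      },
      # Any warning (e.g. "downloaded length != reported length") or error
      # is treated as a failed attempt and its message is kept.
      warning = function(w) conditionMessage(w),
      error = function(e) conditionMessage(e)
    )
    combos$ok[i] <- identical(result, "ok")
    combos$message[i] <- result
    unlink(dest)
  }
  combos
}

# Usage (commented out because it downloads large files):
# try_download_combinations(
#   urls = c(
#     "https://www.arcgis.com/sharing/rest/content/items/f10774f1c63e40168479a1feb6c7ca74/data",
#     "https://opendata.arcgis.com/datasets/dd4580c810204019a7b8eb3e0b329dd6_0.csv"
#   ),
#   methods = c("libcurl", "curl", "wget")  # assumed to be available locally
# )
```

The returned data frame has one row per combination, so a failing method shows up with its warning or error message next to it.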

wulms commented 3 years ago

Hi, I solved the error by setting the timeout manually. The default value here is 60 seconds.

options(timeout=120)

Maybe you can add this line somewhere in the script and choose a higher value if needed.

Best, Niklas
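If the timeout is raised inside a function rather than at the top of a script, it may be nicer to restore the user's previous setting afterwards. A minimal sketch of that idea follows; `download_with_timeout()` and its default of 300 seconds are hypothetical choices for illustration and not necessarily how the package ended up handling it.

```r
# Hypothetical wrapper (not part of covid19germany): raise the global
# "timeout" option only for the duration of one download.
download_with_timeout <- function(url, destfile, timeout = 300,
                                  download_fun = utils::download.file, ...) {
  # Never lower a timeout the user has already set higher.
  old <- options(timeout = max(timeout, getOption("timeout")))
  # Restore the previous value even if the download fails.
  on.exit(options(old), add = TRUE)
  download_fun(url, destfile, ...)
}

# Usage (commented out because it downloads a large file):
# download_with_timeout(
#   "https://opendata.arcgis.com/datasets/dd4580c810204019a7b8eb3e0b329dd6_0.csv",
#   tempfile(fileext = ".csv"),
#   timeout = 600
# )
```

Because the option is reset via `on.exit()`, other code in the same session keeps whatever timeout it had before the call.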

nevrome commented 3 years ago

Done - thanks for looking into this!