Open behrica opened 6 years ago
As a follow-up to #1, let's discuss this concrete data set:
JRC "Gridded Agro-Meteorological Data in Europe"
The data itself is available after registration and in "pieces", to be downloaded as CSV files: http://agri4cast.jrc.ec.europa.eu/DataPortal/Index.aspx
Downloading requires registration and is asynchronous (via email notifications), which prevents accessing the data directly from an R package.
There is a limit on "records", but the whole thing could be fetched in fewer than 10 downloads.
I agree that bundling the data in an R package does not make much sense.
I checked the "reuse conditions" and I think we could re-publish the data (with proper citation, of course) on Zenodo (http://zenodo.org).
That would give it a DOI and "stable download links" for each file, of the form: https://zenodo.org/record/7531/files/C3-EURO4M-MEDARE_TX.txt
This way we could split the data into one file per year, and the R code would use those links to access the data.
The same could work for all types of big climate data (as long as the reuse conditions allow re-publishing).
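To make the per-year idea concrete, here is a minimal sketch of what the R side could look like. The record ID `0000000` and the `agri4cast_<year>.csv` file naming are placeholders I made up, since nothing has been uploaded yet; only the `https://zenodo.org/record/<id>/files/<name>` URL shape is taken from the example link above.

```r
# Hypothetical Zenodo record ID: a placeholder, no such upload exists yet.
zenodo_record <- "0000000"
base_url <- sprintf("https://zenodo.org/record/%s/files", zenodo_record)

# Build the download URL for a given year (vectorised over `year`).
# The "agri4cast_<year>.<ext>" naming scheme is an assumption.
year_url <- function(year, ext = "csv") {
  sprintf("%s/agri4cast_%d.%s", base_url, as.integer(year), ext)
}

# Download the file for a single year into a local cache directory,
# skipping the download if the file is already there.
# Returns the local file path.
fetch_year <- function(year, cache_dir = tempdir(), ext = "csv") {
  url <- year_url(year, ext)
  dest <- file.path(cache_dir, basename(url))
  if (!file.exists(dest)) {
    utils::download.file(url, dest, mode = "wb")
  }
  dest
}
```

A user would then call something like `fetch_year(2000)` and read the returned path, so only the years actually needed are downloaded.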
I just downloaded a single year: it is 503 MB as CSV and 93 MB as a compressed FST file (R package fst). So the whole data set would be roughly 42 years * 100 MB = 4.2 GB, which is not a problem to store on Zenodo.
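For reference, a rough sketch of the conversion and the size estimate. `csv_to_fst` and `estimate_total_gb` are hypothetical helper names; the field separator of the portal's CSV export is an assumption (hence the `sep` argument), and the maximum compression level is just one possible choice.

```r
# Convert one yearly CSV file to a compressed fst file.
# Needs the 'fst' package installed; `sep` must match the portal's export.
csv_to_fst <- function(csv_path,
                       fst_path = sub("\\.csv$", ".fst", csv_path),
                       sep = ",",
                       compress = 100) {
  data <- utils::read.csv(csv_path, sep = sep, stringsAsFactors = FALSE)
  fst::write_fst(data, fst_path, compress = compress)
  fst_path
}

# Back-of-the-envelope total size in GB (using 1 GB = 1000 MB).
estimate_total_gb <- function(n_years, mb_per_year) {
  n_years * mb_per_year / 1000
}

estimate_total_gb(42, 100)
```

With the numbers above (42 years, about 100 MB per year after rounding up from 93 MB), the estimate comes out at 4.2 GB.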