Closed tpoisot closed 3 years ago
WorldClim, since that's the name of the source of the data. Do they have some kind of web API you tap into? I tend to prefer the Chelsa climatologies to the kriging-based WorldClim ones - you can download a raster, and then I have a simple raster package (https://github.com/mkborregaard/VerySimpleRasters.jl) to extract values at coordinates.
They don't have an API, but it's easy to get the address to download it. I agree that Chelsa would probably be better (after having read a bit on how they reconstructed it). I also like the rasters package! I had something in progress, but it was not as full-featured.
I'm not sure we need to be hugely opinionated here... the official releases of Chelsa are available from DataDryad as far as I'm aware, so they could be accessed via DataDeps.jl, generating the script to download them using DataDepsGenerators.jl - as far as I understand those packages - and I'm sure a similar thing could be done for WorldClim. Then we could have a really useful package that could automate the process of accessing this kind of data (maybe using VerySimpleRasters.jl), called something like ClimateData.jl perhaps? I haven't used DataDeps.jl myself, but I saw the talk at JuliaCon last year and it looked very cool...
PS I'm not saying all of this needs to be done now for @tpoisot's immediate release, but if we thought this kind of thing was useful, it could have a more generic name, and the functionality could be slowly extended...
Sure - but I'm just trying to understand why that's smarter than just having a package that downloads (and possibly loads) the rasters? That could be very simple.
I like the idea of a function to just download the tiles, in a package called BioClimaticData.jl -- we could have methods like data[x], and data[x, n], which would give an array of values, or the nth variable, and x can be all sorts of things (a GBIFRecord, an EcoBase object, an AbstractPosition, ...)
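To make the proposed indexing concrete, here is a minimal sketch of how data[x] and data[x, n] could behave for a single kind of x. Everything in it is an assumption made for illustration: the types LatLon and ClimateLayers, the helper cell, and the global regular grid are hypothetical and do not come from any existing package.

```julia
# Hypothetical types for illustration only; not taken from an existing package.
abstract type AbstractPosition end

struct LatLon <: AbstractPosition
    lat::Float64
    lon::Float64
end

latitude(p::LatLon) = p.lat
longitude(p::LatLon) = p.lon

# A stack of layers on a regular global grid:
# rows run north to south, columns west to east, third dimension is the variable.
struct ClimateLayers
    layers::Array{Float64,3}
end

# Map a position to (row, col), assuming the grid spans 90..-90 and -180..180.
function cell(d::ClimateLayers, p::AbstractPosition)
    nrow, ncol, _ = size(d.layers)
    row = clamp(floor(Int, (90 - latitude(p)) / 180 * nrow) + 1, 1, nrow)
    col = clamp(floor(Int, (longitude(p) + 180) / 360 * ncol) + 1, 1, ncol)
    return row, col
end

# data[x] gives every variable at x.
function Base.getindex(d::ClimateLayers, p::AbstractPosition)
    row, col = cell(d, p)
    return d.layers[row, col, :]
end

# data[x, n] gives only the nth variable at x.
function Base.getindex(d::ClimateLayers, p::AbstractPosition, n::Integer)
    row, col = cell(d, p)
    return d.layers[row, col, n]
end
```

Supporting another kind of x (say a GBIF record) would then only need latitude and longitude methods for that type, or a conversion to a position.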
@mkborregaard do you think VerySimpleRasters.jl is already in a state where we can do this? This would be an important stepping stone towards very fun stuff.
It needs an inbuilt driver for GeoTIFF, and I don't want to depend on GDAL. That shouldn't be tough to write though. Do you know where the binary specification of the format is defined? Otherwise GDAL is MIT so I could port theirs.
How would your interface work with x - as a trait? I feel like we might think more about inheritance, e.g. linking EcoBase objects and GBIF records, possibly to AbstractPosition. I feel like depending on GeoInterface, having methods for AbstractPosition and Vector{<:AbstractPosition}, and then extending GeoInterface.coordinates so you would do data[coordinates(x)] would be a cleaner design?
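A rough sketch of that coordinates-based design, under stated assumptions: coordinates is defined locally here as a stand-in for GeoInterface.coordinates so the example has no dependencies, plain (longitude, latitude) tuples stand in for AbstractPosition, and OccurrenceRecord and Layer are hypothetical types.

```julia
# Local stand-in for GeoInterface.coordinates (an assumption, to avoid dependencies).
coordinates(p::NTuple{2,Float64}) = p

# Hypothetical record type, playing the role of a GBIF record.
# It only has to say where it is, as (longitude, latitude).
struct OccurrenceRecord
    name::String
    longitude::Float64
    latitude::Float64
end
coordinates(r::OccurrenceRecord) = (r.longitude, r.latitude)

# Hypothetical single-variable layer on a regular global grid:
# rows run north to south, columns west to east.
struct Layer
    values::Matrix{Float64}
end

# The data object only knows about coordinate pairs.
function Base.getindex(l::Layer, xy::NTuple{2,Float64})
    lon, lat = xy
    nrow, ncol = size(l.values)
    row = clamp(floor(Int, (90 - lat) / 180 * nrow) + 1, 1, nrow)
    col = clamp(floor(Int, (lon + 180) / 360 * ncol) + 1, 1, ncol)
    return l.values[row, col]
end

# A vector of coordinate pairs returns a vector of values.
Base.getindex(l::Layer, xys::AbstractVector{<:NTuple{2,Float64}}) = [l[xy] for xy in xys]
```

With this split, usage would look like layer[coordinates(record)] for one record and layer[coordinates.(records)] for many, and the data package never needs to know about record types.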
Ah - so GDAL simply wraps the binary GeoTIFF format, which is a binary dependency in its own right. That would of course be nice to avoid, but the format is not simple to implement: https://www.geospatialworld.net/article/geotiff-a-standard-image-file-format-for-gis-applications/ In fact the GeoTIFF C library is fairly big. So the answer is, no, VerySimpleRasters is currently not up to this task (its big limitation is that it only supports like 2 raster formats), and for now you might be better off using GDAL directly, like you did in the bioclim example. BTW at the IBS there was a Japanese research group telling me that they had initiated work on a comprehensive SDM package for Julia.
It could be very simple, but as far as I recall from the talk, DataDeps.jl handles two things you haven't mentioned:
chelsa doesn't have tiles :-/
We could ask @dirkkarger what the preferred way is to provide programmatic access to Chelsa data?
GeoTIFF uses a data compression we cannot memory map from Julia. So @tpoisot the best approach is probably to use your geotiff parser and accept the GDAL dep?
Since this repo is for the EcoJulia site, this thread can be closed.
I'm re-packaging my code to get the bioclim variables at coordinates as a package -- @mkborregaard (and also @richardreeve and @kescobo), do you prefer BioClim or WorldClim as a package name? I feel like BioClim is close to the SDM model of the same name, but this should be an SDM.jl package if we eventually go there.