Closed: davidbrochart closed this issue 5 years ago
Hi @davidbrochart, looks like your path should start with gs://, not gcs://. In theory this should work, but you may run into issues with Google authentication. A simple workaround for now is to grant public access to the bucket you are working with, so that each file has an http:// access point.
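For illustration, here is a minimal Python sketch of the two suggestions above: rewriting a gcs:// path to gs://, and deriving the public HTTP address of an object once the bucket is public. The helper names and the bucket/object names are hypothetical, and the storage.googleapis.com address form is an assumption about how publicly readable objects are served; neither helper is part of intake or rasterio.

```python
from urllib.parse import urlsplit


def normalize_gcs_url(url: str) -> str:
    """Rewrite a gcs:// URL to the gs:// scheme; other URLs pass through unchanged."""
    if urlsplit(url).scheme == "gcs":
        return "gs://" + url[len("gcs://"):]
    return url


def public_http_url(url: str) -> str:
    """Build the public HTTP address for an object in a publicly readable bucket.

    Assumption: public objects are reachable under storage.googleapis.com.
    """
    parts = urlsplit(normalize_gcs_url(url))
    if parts.scheme != "gs":
        raise ValueError(f"expected a gs:// or gcs:// URL, got: {url}")
    # netloc is the bucket name, path is the object key (with a leading slash)
    return f"https://storage.googleapis.com/{parts.netloc}{parts.path}"


# Hypothetical bucket and object names, for illustration only
print(normalize_gcs_url("gcs://example-bucket/data/scene.tif"))
print(public_http_url("gcs://example-bucket/data/scene.tif"))
```

The resulting URL could then be used as urlpath in the catalog instead of the gs:// path, which sidesteps authentication entirely.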
Also, careful with VRT files if you are running dask distributed, since I'm guessing each worker will need a local copy of the VRT!
Make sure you have a GDAL library version > 2.3: https://www.gdal.org/gdal_virtual_file_systems.html#gdal_virtual_file_systems_vsigs
Also, the latest versions of rasterio are simplifying the authentication process: https://github.com/mapbox/rasterio/pull/1577/files/4a441cbcd2beff6c5fe5acce5b54644cc56839d4
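As a rough sketch of what the GDAL link above describes: the /vsigs/ virtual file system addresses a Google Cloud Storage object as /vsigs/ followed by the bucket name and object key. The helper below is hypothetical (not a rasterio or GDAL function) and only shows the path mapping; the bucket and object names are made up.

```python
from urllib.parse import urlsplit


def to_vsigs_path(url: str) -> str:
    """Map a gs://bucket/key URL to the GDAL virtual path /vsigs/bucket/key."""
    parts = urlsplit(url)
    if parts.scheme != "gs":
        raise ValueError(f"expected a gs:// URL, got: {url}")
    # netloc is the bucket name, path is the object key (with a leading slash)
    return f"/vsigs/{parts.netloc}{parts.path}"


# Hypothetical bucket and object names, for illustration only
print(to_vsigs_path("gs://example-bucket/data/scene.tif"))
```

Opening such a path still requires a GDAL build with /vsigs/ support and valid credentials, unless the bucket is public.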
If you are comfortable with caching the data locally, that would be a way around any limitations that rasterio might have with acceptable file formats. To enable that, add this to your catalog entry:
driver: rasterio
cache:
  - argkey: urlpath
    regex: 'pangeo-data'
    type: file
Thanks @scottyhq. Rasterio v1.0.14 has not been released on conda yet, and I think that's why it doesn't work yet. I won't access the dataset with dask, so that's not an issue. @jsignell, this works better. Now I have an authentication error, as @scottyhq mentioned, but that should be easy to solve. Thanks!
It works fine using @jsignell's caching mechanism. Closing the issue, thanks again!
Passing a remote path to the Rasterio driver doesn't seem to work, as urlpath is directly passed to Rasterio. I get the following error with this catalog: