All the scivision examples including the CEFAS plankton one seem to have image collections in a single zipfile.
In theory you can use a wildcard in a url_path, but for me this throws an S3 error on the directory listing. I'm not sure whether this is a permissions issue or simply not a feature of intake.
At the moment, it's somewhat moot due to:
The images as they come out of the decollage process have different dimensions, and dask DataArray isn't happy with that
So for prototyping purposes intake is returning a CSV with file locations in S3, but we lose the benefit of the neatly packaged ImageSource. That's one to bear in mind when refactoring the image preprocessing and improving the metadata. Also worth requesting an object store from JASMIN for testing purposes, to experiment with the bucket policies.
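For reference, a rough sketch of how a CSV of file locations can stand in for the wildcard, using only the standard library. The bucket name, file names, and the `filename` / `s3_path` column names are made up for illustration and may not match the real index; `select_paths` is a hypothetical helper, not part of intake or scivision.

```python
import csv
import io
from fnmatch import fnmatchcase

# Hypothetical index contents; the real CSV's columns and paths may differ.
INDEX_CSV = """filename,s3_path
plankton_0001.tif,s3://example-bucket/decollaged/plankton_0001.tif
plankton_0002.tif,s3://example-bucket/decollaged/plankton_0002.tif
notes.txt,s3://example-bucket/decollaged/notes.txt
"""


def select_paths(index_file, pattern="*.tif", column="s3_path"):
    """Return the paths listed in the index that match a glob-style pattern."""
    reader = csv.DictReader(index_file)
    # Matching happens client-side against the listed paths,
    # so no directory listing is requested from the object store.
    return [row[column] for row in reader if fnmatchcase(row[column], pattern)]


paths = select_paths(io.StringIO(INDEX_CSV))
for path in paths:
    print(path)
```

Because the pattern is applied to the rows of the index rather than to the bucket, this sidesteps the listing call that fails for me, at the cost of having to keep the CSV in sync with what is actually in the store.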