Closed quantumjot closed 2 years ago
Note that urlpath should be replaced by the full path of the directory in GDrive. The following catalog consists of two entries: i) a single image and ii) a stack, i.e. multiple images concatenated and coerced to a common image shape, e.g. 256 x 256 pixels:
%%writefile catalog.yaml
sources:
  plankton_single:
    description: Load a single labeled image from CEFAS zooplankton dataset
    origin:
    driver: intake_xarray.image.ImageSource
    parameters:
      species:
        description: which species to collect
        type: str
        default: Bivalvia-Larvae
      id:
        description: which filename
        type: str
        default: Pia1.2017-10-03.1726+N00296780_hc
    args:
      urlpath: '/content/gdrive/.../{{species}}/{{id}}.tif'
      storage_options: {'anon': True}
  plankton_all:
    description: Labeled images from CEFAS zooplankton dataset
    origin:
    driver: intake_xarray.image.ImageSource
    args:
      urlpath: '/content/gdrive/.../{species}/{id}.tif'
      storage_options: {'anon': True}
      concat_dim: [id, species]
      coerce_shape: [256, 256]
    metadata:
      shape: images_shape_all
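To make the role of the {species}/{id} pattern in plankton_all more concrete, here is a minimal standard-library sketch of how field values can be recovered from matching file paths. As far as I understand, intake-xarray does something similar internally so that the fields named in concat_dim become coordinates of the stacked images, but this is not its implementation. The root directory /data/plankton, the species name Copepod, the file name img_0001 and the helper names pattern_to_regex and parse_fields are all hypothetical and only used for illustration.

```python
import re


def pattern_to_regex(pattern):
    """Turn a '{field}' path pattern into a compiled regex with named groups."""
    # re.split with a capturing group alternates literal text and field names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Even positions are literal path text.
            regex += re.escape(part)
        else:
            # Odd positions are field names; a field never spans a '/'.
            regex += f"(?P<{part}>[^/]+)"
    return re.compile(regex)


def parse_fields(pattern, path):
    """Return a dict of field values if the path matches the pattern, else None."""
    match = pattern_to_regex(pattern).fullmatch(path)
    return match.groupdict() if match else None


# Hypothetical root directory; the real catalog uses the full GDrive path.
PATTERN = "/data/plankton/{species}/{id}.tif"

paths = [
    "/data/plankton/Bivalvia-Larvae/Pia1.2017-10-03.1726+N00296780_hc.tif",
    "/data/plankton/Copepod/img_0001.tif",  # hypothetical species and file name
]

for path in paths:
    print(parse_fields(PATTERN, path))
```

Each matching file yields one (species, id) pair, which is why the catalog lists both fields under concat_dim. Paths that do not fit the pattern return None in this sketch.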
In addition to the plankton example, here are three examples from Environmental AI Book contributors that use intake to catalogue files in different formats:
- H5 format for wildfire analysis. The demonstrator includes a cell defining a customised intake driver.
- GeoTIFF format for tree canopy delineation. The example uses an existing intake driver to fetch files from Zenodo repositories.
- CSV format for analysing ground sensor records. The example fetches tables from an Amazon bucket.

I hope the above examples help show how intake could be beneficial for cataloguing and handling different formats for scivision.
@acocac can this issue be closed?
yep, let me close it.