Similar to the work done in #85 on the GeoTIFFDataPipeModule, this PR lets ClayDataModule load GeoTIFF files from an s3 bucket. It also includes a few minor tweaks to align the two LightningDataModules.
The implementation uses torchdata's S3FileLister to get the files, but returns a list instead of an iterator.
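For reviewers, here is a rough sketch of the list-instead-of-iterator idea, not the exact code in this PR. The function names `list_geotiff_urls` and `s3_geotiff_urls` and the `.tif` suffix filter are assumptions made for illustration. The `S3FileLister` call follows the torchdata datapipes API and needs a torchdata build with S3 support.

```python
from typing import Iterable, List


def list_geotiff_urls(url_iter: Iterable[str], suffix: str = ".tif") -> List[str]:
    """Materialise an iterable of object URLs into a plain list of GeoTIFF URLs.

    The suffix filter is an assumption for this sketch; listing order is preserved.
    """
    return [url for url in url_iter if url.endswith(suffix)]


def s3_geotiff_urls(s3_prefix: str) -> List[str]:
    """Hypothetical helper: list GeoTIFF URLs under an s3:// prefix."""
    # Imported lazily so list_geotiff_urls stays usable without torchdata installed.
    from torchdata.datapipes.iter import IterableWrapper, S3FileLister

    # S3FileLister yields s3:// URLs lazily; we consume it once into a list.
    dp_urls = S3FileLister(IterableWrapper([s3_prefix]))
    return list_geotiff_urls(dp_urls)
```

Returning a list means the file count is known up front (so `len()` works on the resulting dataset) at the cost of listing the whole prefix before the first batch.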
TODO:
[x] Allow ClayDataModule to load GeoTIFF files directly from s3
[x] Rename datacube["path"] to datacube["source_url"] to match #86
[x] Implement predict dataloader
[x] Add a unit test
Continuing on from #91, this PR is part 2/3 of the work towards generating new embeddings from the model developed in #47.