microsoft / torchgeo

TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
https://www.osgeo.org/projects/torchgeo/
MIT License
2.62k stars 322 forks

Reading S3File using RasterDataset #1165

Open alexandralorenzo opened 1 year ago

alexandralorenzo commented 1 year ago

Summary

Hello, first of all, thanks for your work.

Would it be possible to add an option for reading files from S3 using RasterDataset?

class BDOrtho(RasterDataset):
    filename_glob = "*.jp2"
    is_image = True
    separate_files = True

TypeError: expected str, bytes or os.PathLike object, not S3File

Kind regards, Alexandra

Rationale

I need to create my own dataset reader.

Implementation

No response

Alternatives

No response

Additional information

No response

adamjstewart commented 1 year ago

I would love to add support for this. At the moment, RasterDataset is designed only for files on the local filesystem. If I'm understanding it correctly, TorchData would make it easier to integrate different optional data sources. We've talked about porting our datasets to TorchData but haven't gotten a chance to work on this yet. TorchGeo is entirely a volunteer-driven open-source project, so if you have cycles and want to work on this, I would be happy to review PRs or give suggestions. That said, I don't have access to AWS, so my debugging and testing abilities may be limited.

adamjstewart commented 1 year ago

You might be interested in #1399

adamjstewart commented 11 months ago

Note that we technically support this in 0.5.0, although the user has to pass in a list of files themselves.
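
For readers landing here, a minimal sketch of what "passing in a list of files" could look like. It assumes TorchGeo 0.5.0 or later (where RasterDataset takes a `paths` argument that may be a list of files), a GDAL build that can read from S3 through its `/vsis3/` virtual filesystem, and AWS credentials already configured in the environment. The helper names `to_vsis3_paths` and `build_dataset`, the bucket name, and the object keys are hypothetical and only for illustration. This has not been tested against a real bucket.

```python
from collections.abc import Iterable


def to_vsis3_paths(bucket: str, keys: Iterable[str], suffix: str = ".jp2") -> list[str]:
    """Turn S3 object keys into GDAL /vsis3/ paths, keeping only keys ending in `suffix`.

    Hypothetical helper: the key listing itself (e.g. via boto3 or s3fs) is up to the user.
    """
    return [f"/vsis3/{bucket}/{key.lstrip('/')}" for key in keys if key.endswith(suffix)]


def build_dataset(paths: list[str]):
    """Build a RasterDataset from an explicit list of files.

    Assumes torchgeo >= 0.5.0 is installed; imported lazily so the path
    helper above can be used without it.
    """
    from torchgeo.datasets import RasterDataset

    class BDOrtho(RasterDataset):
        is_image = True

    # Each entry is passed as a plain string, not an S3File object,
    # which avoids the TypeError reported above.
    return BDOrtho(paths=paths)


# Hypothetical bucket and keys, for illustration only.
example_paths = to_vsis3_paths(
    "my-bucket", ["ortho/tile_001.jp2", "/ortho/tile_002.jp2", "ortho/README.txt"]
)
print(example_paths)

# With S3 access configured, the dataset would then be created with:
# dataset = build_dataset(example_paths)
```

Since the files are listed explicitly, `filename_glob` is not needed here; as far as I can tell it is only used when an entry in `paths` is a local directory.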