microsoft / torchgeo

TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
https://www.osgeo.org/projects/torchgeo/
MIT License

Geolife dataset #672

Open salelkafrawy opened 2 years ago

salelkafrawy commented 2 years ago

Summary

Is there a plan to add the GeoLifeCLEF dataset to TorchGeo as a NonGeoDataset?

Rationale

It is used in a well-established machine learning problem: species distribution modeling.

adamjstewart commented 2 years ago

I'm unaware of anyone currently working on this dataset, but would love to have it in TorchGeo! At first glance, this dataset looks similar to the iNaturalist/GBIF/EDDMapS datasets I added recently. Those might be a good starting point if anyone wants to take a stab at writing a dataset. Happy to review any PRs!
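For anyone who wants to take a stab at it, here is a rough sketch of the general shape such a dataset could take. It is plain Python and not TorchGeo's actual API: the class name `OccurrenceDataset` and the column names `lat`, `lon`, and `species_id` are hypothetical, chosen only for illustration, and do not reflect the real GeoLifeCLEF schema. A real contribution would presumably subclass `NonGeoDataset` and follow the conventions of the existing occurrence datasets.

```python
import csv
import io
from typing import Any


class OccurrenceDataset:
    """Map-style dataset over species occurrence records (hypothetical sketch)."""

    def __init__(self, csv_file) -> None:
        # Assumption: the file has lat, lon, and species_id columns.
        # These names are illustrative, not the actual GeoLifeCLEF schema.
        reader = csv.DictReader(csv_file)
        self.records = [
            (float(row["lat"]), float(row["lon"]), int(row["species_id"]))
            for row in reader
        ]

    def __len__(self) -> int:
        # Number of occurrence records loaded from the file.
        return len(self.records)

    def __getitem__(self, index: int) -> dict[str, Any]:
        # Return one sample as a dictionary of named fields.
        lat, lon, species_id = self.records[index]
        return {"location": (lat, lon), "label": species_id}


# Tiny in-memory example instead of a real download.
sample = io.StringIO("lat,lon,species_id\n43.61,3.87,12\n48.85,2.35,7\n")
dataset = OccurrenceDataset(sample)
```

Indexing `dataset[0]` returns a dictionary with a `location` tuple and an integer `label`. Returning samples as dictionaries keeps room for extra fields, such as image patches or environmental rasters, to be added later without changing the interface.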

benjamindeneu commented 1 year ago

Hi, I am part of the team working on the GeoLifeCLEF dataset. We are interested in integrating the dataset into TorchGeo (the 2023 version is now available with more data and European coverage). More generally, we are also working on tools to make deep learning models easier to use for ecologists (a community that is still unfamiliar with these methods). We would like to create a ready-to-use library for these ecologists, and we are thinking of building all the lower-level parts directly on TorchGeo.

adamjstewart commented 1 year ago

@benjamindeneu congrats on the 2023 release! We would love to have GeoLifeCLEF in TorchGeo. Just open a PR and I can review it.

Out of curiosity, what do you envision adding to your library? I wonder if those features are already present in TorchGeo, or something we can add to TorchGeo directly. We already support 600+ models via timm and segmentation-models-pytorch, and have pre-trained models built into TorchGeo as well. We also have Lightning datamodules and trainers that allow you to train, validate, and test a dataset with only 4 lines of code.