Closed robmarkcole closed 1 month ago
I get the normalization fix but why is the README code snippet being changed? Once 0.6.0 is released, the snippet should work. Also as I understand it, the goal of the snippet is to show how the dataloader works not how to train using datamodules
I just assumed the docs predate the datamodules but we would like to showcase this feature - personally I always use datamodules now but perhaps am in the minority there. Suggest a second opinion before I change the PR
There is some fishy behaviour still - in the load_target
the image id has one subtracted here - why not use as below?
annot = self.coco.loadAnns(self.coco.getAnnIds(imgIds=id_))
Applied black as per the guide, and this has changed ALL of the strings
Applied black as per the guide, and this has changed ALL of the strings
torchgeo no longer uses black, isort etc and has recently switched to ruff for linting and code formatting.
There is some fishy behaviour still - in the
load_target
the image id has one subtracted here - why not use as below?annot = self.coco.loadAnns(self.coco.getAnnIds(imgIds=id_))
Refer to https://github.com/microsoft/torchgeo/pull/847#discussion_r1044127463
@ashnair1 I checked the approach I suggested (imgIds=id_)
) and this results in erroneous masks - strange as the API appears to support this approach, suggesting the implementation has some quirks
The approach I am using elsewhere:
1) Get list of image ids: self.ids = list(sorted(self.coco.imgs.keys()))
and use in getitem: id_ = self.ids[index]
2) Access filename using id: `self.coco.imgs[id]['filename'] 3) Access annotations using API:
annot = self.coco.loadAnns(self.coco.getAnnIds(imgIds=id))`
The README has a few different sections. The section you're editing is designed to show off our benchmark datasets. There is a separate section below specifically for PyTorch Lightning, including data modules. I would prefer to keep using a dataset instead of a data module here.
Actually, I think we're better off choosing a different dataset (maybe EuroSAT?). VHR-10 is too complicated and makes it look like TorchGeo is hard to use. Maybe we could move VHR-10 to the Lightning section of the README since it has a pretty visualization.
Yes, this is a common issue with plotting via dataset vs. plotting via data module. I agree with your solution, and proposed basically the same thing for all datasets in #1263. This issue is not specific to VHR-10, but affects almost all datasets/data modules. I haven't had the time/energy to fix this for all datasets, but contributions are definitely welcome here, even if it's only 1 dataset at a time.
Can we separate the README and plotting changes into two separate PRs?
Note: I have uploaded the dataset at https://huggingface.co/datasets/satellite-image-deep-learning/VHR-10 - can transfer to torchgeo org if you want to use it here
It's easier to download tar/zip/rar files instead of raw data. But I'm happy to change the download to your org or ours. Not sure if we should rehost all rar datasets as tar/zip to make them easier to extract?
@adamjstewart this PR now just addresses the plotting.
RE dataset choice, I agree EuroSAT would be a nice choice - it also supports selecting subsets of bands so shows immediately what is possible with torchgeo
Fix plotting using percentiles
Note: I have uploaded the dataset at https://huggingface.co/datasets/satellite-image-deep-learning/VHR-10 - can transfer to torchgeo org if you want to use it here