microsoft / torchgeo

TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
https://www.osgeo.org/projects/torchgeo/
MIT License

Zero mask in ISPRS Potsdam #1727

Open mizoru opened 8 months ago

mizoru commented 8 months ago

Description

The file top_potsdam_4_12_label.tif in 5_Labels_all.zip isn't actually the mask; it seems to be the mask overlaid on the RGB image. As a result, the 8th sample in the dataset has an all-zero mask. 5_Labels_for_participants.zip does contain the proper mask, and substituting the file from that archive seems to solve the issue. It would also be good for rgb_to_mask to warn about pixels that are not in the colormap.
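To illustrate the suggested warning, here is a minimal sketch of a color-to-index conversion that reports pixels matching no colormap entry. It is not torchgeo's actual rgb_to_mask; the function name rgb_to_mask_with_warning and the POTSDAM_COLORMAP constant are hypothetical, and the six RGB values are an assumption about the Potsdam palette that should be checked against the dataset documentation.

```python
import warnings

import numpy as np

# Assumed six-class Potsdam palette (RGB). Verify before relying on it.
POTSDAM_COLORMAP = [
    (255, 255, 255),  # impervious surfaces
    (0, 0, 255),      # building
    (0, 255, 255),    # low vegetation
    (0, 255, 0),      # tree
    (255, 255, 0),    # car
    (255, 0, 0),      # clutter / background
]


def rgb_to_mask_with_warning(rgb, colors):
    """Convert an (H, W, 3) RGB label image to an (H, W) class-index mask.

    Hypothetical helper: pixels that match no color stay at index 0,
    and a UserWarning reports how many such pixels were found.
    """
    h, w = rgb.shape[:2]
    mask = np.zeros((h, w), dtype=np.uint8)
    matched = np.zeros((h, w), dtype=bool)

    for index, color in enumerate(colors):
        # True where all three channels equal this colormap entry.
        hits = (rgb == np.asarray(color, dtype=rgb.dtype)).all(axis=-1)
        mask[hits] = index
        matched |= hits

    unmatched = int((~matched).sum())
    if unmatched:
        warnings.warn(
            f"{unmatched} of {h * w} pixels do not match any colormap entry",
            UserWarning,
            stacklevel=2,
        )
    return mask
```

With a check like this, a label file that is really an RGB overlay would trigger the warning for most of its pixels instead of silently producing a mask of zeros.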

Steps to reproduce

  1. Download the dataset from the official link
  2. Check the file mentioned above (it's clear from its size that it's different).
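Besides comparing file sizes, step 2 could be checked by counting distinct colors, since a proper label image should contain only a handful of them while an overlay on RGB imagery contains many. The sketch below assumes the label has already been loaded into an (H, W, 3) NumPy array (for example with rasterio or Pillow); the helper names and the default of six classes are assumptions for illustration.

```python
import numpy as np


def distinct_colors(rgb):
    """Return the unique colors in an (H, W, C) image as an (N, C) array."""
    pixels = rgb.reshape(-1, rgb.shape[-1])
    return np.unique(pixels, axis=0)


def looks_like_label_mask(rgb, max_classes=6):
    """Heuristic: a label image has at most `max_classes` distinct colors."""
    return len(distinct_colors(rgb)) <= max_classes
```

This is only a heuristic: it flags files with too many colors, but it does not confirm that the colors present are the right ones.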

Version

0.5.0

adamjstewart commented 4 months ago

@isaaccorley you added the original Potsdam data loader. Can you look into this?

isaaccorley commented 4 months ago

I took a look at this, and I think the dataset may have been updated at some point, because I have an earlier copy of Potsdam that doesn't seem to have this issue.

isaaccorley commented 4 months ago

I'll make a PR once I figure out the best way to handle this in the dataset class.

adamjstewart commented 4 months ago

We haven't yet found a license for it so we can't just rehost on HF 😢

isaaccorley commented 4 months ago

The download times from their server are awful for me. It hosts Vaihingen as well.

adamjstewart commented 4 months ago

Can you try reaching out to them to find out about this bug and the license? If we know the license, we can likely rehost on HF.