gritzner / SegForestNet

Reference implementation of SegForestNet
BSD 3-Clause "New" or "Revised" License

Error on Loading toulouse Dataset #1

Closed davidchd closed 1 year ago

davidchd commented 1 year ago

I ran into an issue when trying to load the Toulouse dataset, which I downloaded from SemCity. The images in the semantic_05/TLS_indMap_noGeo folder have filenames like "TLS_indMap_noGeo_XX_Y.tif", where XX is a two-digit number and Y is a single digit. However, the DatasetLoader for the Toulouse dataset appears to load each image as "semantic_05/TLS_indMap_noGeo/TLS_indMap_noGeo_{file_id}.tif", without the Y value.
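To check which naming scheme a given download uses, the label folder can be scanned and the file names grouped by patch. The following is only a rough sketch: the regular expression, the helper names `parse_label_name` and `summarize_label_dir`, and the assumption that the older files carry no trailing digit are illustrative and not taken from the repository.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed patterns: "TLS_indMap_noGeo_XX.tif" (no suffix) or
# "TLS_indMap_noGeo_XX_Y.tif" (with a one-digit suffix Y).
LABEL_NAME = re.compile(r"^TLS_indMap_noGeo_(\d{2})(?:_(\d))?\.tif$")


def parse_label_name(name):
    """Return (patch_id, suffix) for a label file name, or None if it does not match."""
    match = LABEL_NAME.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2)  # suffix is None when absent


def summarize_label_dir(label_dir):
    """Map each patch ID to the sorted suffixes found for it (None = no suffix)."""
    found = defaultdict(list)
    for path in Path(label_dir).glob("*.tif"):
        parsed = parse_label_name(path.name)
        if parsed is not None:
            patch_id, suffix = parsed
            found[patch_id].append(suffix)
    return {pid: sorted(s, key=lambda x: x or "") for pid, s in found.items()}
```

Calling `summarize_label_dir` on the semantic_05/TLS_indMap_noGeo folder would then show, per patch, whether suffixed files are present.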


I'm not sure whether I downloaded the right dataset. Could you provide more information about the specific dataset version you used, as well as any preprocessing you applied?

gritzner commented 1 year ago

We seem to have an older version of the dataset at our institute, as the files here are indeed named differently. At least I think that our version might be old/outdated: the SemCity Toulouse paper claims that semantic ground truth for all 16 patches was released, but none of the subfolders in semantic_05/ at our institute contain label images for any image patch other than the four patches for which instance labels are available. Our dataset version is from July 1st, 2020. I will look into this issue; however, I cannot promise how soon I will be able to commit a fix for newer versions of the dataset.

gritzner commented 1 year ago

Quick update: I just found out that they later added annotations made by different annotators. For consistent labeling, i.e., labels made by the same annotator, change line 57 in datasets/SemcityToulouseDatasetLoader.py to:

img_set.append(PIL.Image.open(f"{root_path}/semantic_05/TLS_indMap_noGeo/TLS_indMap_noGeo_{file_id}_1.tif"))
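For anyone who needs to support both dataset versions, a minimal sketch of a path lookup that prefers the suffixed name and falls back to the unsuffixed one might look like this. The helper `resolve_label_path` and its `annotator` parameter are hypothetical and not part of the repository; the unsuffixed fallback name is an assumption based on the path quoted in the original report.

```python
from pathlib import Path


def resolve_label_path(root_path, file_id, annotator=1):
    """Return the label file for a patch, trying the suffixed name first.

    Hypothetical helper: `annotator` selects the trailing digit of the
    newer naming scheme; the unsuffixed name is an assumed fallback.
    """
    label_dir = Path(root_path) / "semantic_05" / "TLS_indMap_noGeo"
    candidates = [
        label_dir / f"TLS_indMap_noGeo_{file_id}_{annotator}.tif",  # newer scheme
        label_dir / f"TLS_indMap_noGeo_{file_id}.tif",              # assumed older scheme
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    tried = ", ".join(c.name for c in candidates)
    raise FileNotFoundError(f"No label file for patch {file_id!r}; tried: {tried}")
```

The returned path could then be passed to `PIL.Image.open` in place of the hard-coded f-string.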

EDIT: Based on an MD5 checksum comparison, the old labels seem to have been created by annotators 2 and 3. I will eventually (again: no promises on when) commit a change that uses the new naming scheme and loads exactly the files I used in my paper.
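A checksum comparison of this kind can be reproduced with the standard library. The sketch below is an illustration rather than the script actually used: the function names `md5_of` and `find_identical_files` are made up for this example, and it assumes the old and new label files are byte-identical when they come from the same annotator.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_files(old_file, new_dir):
    """Return the names of .tif files in new_dir whose content matches old_file."""
    target = md5_of(old_file)
    return sorted(
        path.name for path in Path(new_dir).glob("*.tif") if md5_of(path) == target
    )
```

Running `find_identical_files` for each old label image against the new label folder would list which suffixed files, if any, match it exactly.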


gritzner commented 1 year ago

Found some time to fix the issue (was really simple once I looked into it ;-) ) and fixed it in commit 9dd4c8d.

AzkaBasit commented 4 months ago

heyy @davidchd can you please share the website from where you got this dataset? or any other dataset used in this repository?

gritzner commented 3 months ago

The URL of the ISPRS 2D Semantic Labeling Contest changed to https://www.isprs.org/education/benchmarks/UrbanSemLab/semantic-labeling.aspx since I last checked (i.e., the URL in my paper/article is outdated, sorry). You can download the Vaihingen and Potsdam datasets on the overarching website https://www.isprs.org/education/benchmarks/UrbanSemLab/default.aspx which includes, among other things, the 2D Semantic Labeling Contest.

The Toulouse dataset can be found here: http://rs.ipb.uni-bonn.de/data/. I hope that is the most recent URL; at least it is the one referenced in the conclusion of the Toulouse dataset paper [1].

EDIT: The other datasets (Hannover, Buxtehude, Nienburg, Schleswig, and Hameln) are not publicly available and unfortunately cannot be published for legal reasons.

[1] R. Roscher, M. Volpi, C. Mallet, L. Drees, J. D. Wegner, "Semcity toulouse: A benchmark for building instance segmentation in satellite images", ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences V-5-2020 (2020) 109–116. doi:10.5194/isprs-annals-V-5-2020-109-2020. https://www.isprs-ann-photogramm-remote-sens-spatial-inf-sci.net/V-5-2020/109/2020/