Which labels are used for evaluation on Fishyscapes LostAndFound?

ksakmann commented 2 years ago

I have a question regarding the evaluation of results on Fishyscapes LAF in the leaderboard.

The labels differ between LAF and FS-LAF:

In Fishyscapes LAF, the in-distribution part covers all 19 Cityscapes classes.
In LAF, the in-distribution label is a coarse annotation of the road in front of the car.

I would expect all evaluation metrics for Fishyscapes LAF to be computed with respect to the Fishyscapes LAF labels. However, the code contains this statement:

Benchmark for road obstacles: In the LostAndFound dataset, the obstacles are located on the road in front of the car. Using this prior knowledge, we can ignore the non-road part of the image - we require the method to find obstacles within the road area only. In practice, we limit the evaluation to pixels marked as "free space" or "obstacle" in the original LAF labels.

Aren't metrics like FPR, AUROC, etc. very dependent on which labels are used? Could you clarify this, please? And if only the LAF labels are used, what is the intended purpose of the FS-LAF labels?

hermannsblum commented 2 years ago

Thanks for the question! Your initial understanding is right. All evaluations in Fishyscapes treat the 19 Cityscapes classes as inliers. You can also inspect this annotation in the FS LAF validation set.

The comment you found in the code belongs to a special dataloader that was not used for the Fishyscapes paper or website, but only for the experiments of this paper: https://arxiv.org/abs/2012.13633
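
For illustration, here is a toy sketch of why the two label conventions can give different numbers for the same anomaly scores. It is not the benchmark's evaluation code: the 4x4 score map, the `IGNORE = 255` void id, and the helper names `auroc` and `fpr_at_tpr` are all assumptions made up for this example. The top two rows stand for non-road background, which counts as inlier when all 19 Cityscapes classes are inliers and is ignored in a road-only evaluation.

```python
import numpy as np

IGNORE = 255  # hypothetical "void" id: pixels with this label are not evaluated


def auroc(scores, labels):
    """AUROC as the probability that an anomaly pixel outscores an inlier pixel.

    Uses an explicit pairwise comparison, which is only suitable for toy inputs.
    """
    pos = scores[labels == 1]  # anomaly pixels
    neg = scores[labels == 0]  # inlier pixels (IGNORE pixels fall in neither set)
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return float(greater + 0.5 * ties)


def fpr_at_tpr(scores, labels, tpr=0.95):
    """False positive rate at the loosest threshold that still reaches `tpr`."""
    pos = np.sort(scores[labels == 1])[::-1]  # anomaly scores, descending
    neg = scores[labels == 0]
    k = int(np.ceil(tpr * len(pos)))  # number of anomalies that must be detected
    threshold = pos[k - 1]
    return float((neg >= threshold).mean())


# Toy anomaly scores: the (made-up) model is also uncertain about some background.
scores = np.array([
    [0.9, 0.8, 0.2, 0.1],  # background
    [0.7, 0.3, 0.2, 0.1],  # background
    [0.1, 0.6, 0.5, 0.1],  # road, obstacle in the two middle pixels
    [0.1, 0.2, 0.1, 0.1],  # road
])

# Convention A: every non-obstacle pixel is an inlier (0), obstacles are 1.
full_labels = np.zeros((4, 4), dtype=np.uint8)
full_labels[2, 1:3] = 1

# Convention B: only road ("free space") and obstacle pixels are evaluated.
road_labels = full_labels.copy()
road_labels[:2, :] = IGNORE

for name, labels in [("all classes as inliers", full_labels),
                     ("road area only", road_labels)]:
    print(f"{name}: AUROC={auroc(scores, labels):.3f}, "
          f"FPR@95%TPR={fpr_at_tpr(scores, labels):.3f}")
```

In this toy case the road-only labels hide the false positives on the background, so results obtained under the two conventions are not directly comparable.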

ksakmann commented 2 years ago

Thanks a bunch for the quick answer! That clarifies it.

hermannsblum / bdl-benchmark, issue #11