huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Removing last element of class_queries_logits is not appropriate when do_reduce_labels is set to false. #32630

Open taghizad3h opened 1 month ago

taghizad3h commented 1 month ago

https://github.com/huggingface/transformers/blame/50837f20608e9266e3b9a930550fa6104230ccc7/src/transformers/models/mask2former/image_processing_mask2former.py#L998

The line linked above reads:

        masks_classes = class_queries_logits.softmax(dim=-1)[..., :-1]

This removes the last class probability (presumably the null class), which is not always appropriate. For example, if we don't set do_reduce_labels to true, we need the background class as well. Or suppose we have two classes, one for background and one for foreground. In that case, when argmax is applied over dimension 1 afterwards, it always returns index 0, because only one element is left in that dimension.
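To make the two-class scenario concrete, here is a minimal sketch of it. Only the first line of `semantic_map` comes from the linked source. The function name, the tensor shapes, and the steps around the slice (a sigmoid on the mask logits, an einsum, and the argmax over dimension 1) are assumptions for illustration, not a copy of the library code.

```python
import torch


def semantic_map(class_queries_logits, masks_queries_logits):
    # Quoted line: softmax over the class axis, then drop the last column.
    masks_classes = class_queries_logits.softmax(dim=-1)[..., :-1]
    # Assumed surrounding steps: per-query mask probabilities, combined
    # with the per-query class scores into one score map per class.
    masks_probs = masks_queries_logits.sigmoid()
    segmentation = torch.einsum("bqc,bqhw->bchw", masks_classes, masks_probs)
    # Pick the best class for every pixel.
    return segmentation.argmax(dim=1)


torch.manual_seed(0)
batch, queries, height, width = 1, 4, 2, 2

# Hypothetical setup: the class head emits only 2 logits per query,
# so a single column is left after the slice.
class_logits = torch.randn(batch, queries, 2)
mask_logits = torch.randn(batch, queries, height, width)

seg = semantic_map(class_logits, mask_logits)
print(seg.shape)     # one label per pixel: (batch, height, width)
print(seg.unique())  # only index 0 can ever be selected
```

With a single remaining channel, the argmax has nothing to choose between, so every pixel gets label 0 regardless of the inputs.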

qubvel commented 1 month ago

Hi @taghizad3h, thanks for the issue!

Mask2Former adds an additional class label for the "void" class. It does not depend on the do_reduce_labels option and is needed to filter out queries without a class (background can also be considered a class). That is why we slice this "void" class from the logits after applying softmax.
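A small sketch of what this means in practice, assuming the class head emits `num_labels + 1` logits per query with "void" as the last column. The hand-picked logits, the 1x2 image, and the einsum/argmax steps are illustrative assumptions rather than the library's exact post-processing.

```python
import torch

num_labels = 2  # hypothetical labels: 0 = background, 1 = foreground

# Shape (batch=1, queries=3, num_labels + 1); the last column is "void".
class_logits = torch.tensor([[
    [10.0, 0.0, 0.0],   # query 0 predicts background
    [0.0, 10.0, 0.0],   # query 1 predicts foreground
    [0.0, 0.0, 10.0],   # query 2 predicts void (no object)
]])

# Shape (batch=1, queries=3, height=1, width=2).
mask_logits = torch.tensor([[
    [[10.0, -10.0]],    # query 0 covers the left pixel
    [[-10.0, 10.0]],    # query 1 covers the right pixel
    [[10.0, 10.0]],     # query 2 covers everything, but is void
]])

# Softmax over all num_labels + 1 columns, then drop the void column.
class_probs = class_logits.softmax(dim=-1)[..., :-1]
mask_probs = mask_logits.sigmoid()

# The void query keeps almost no probability mass in the remaining
# columns, so its mask barely contributes to any class map.
segmentation = torch.einsum("bqc,bqhw->bchw", class_probs, mask_probs)
labels = segmentation.argmax(dim=1)

print(class_probs.shape)  # last dimension equals num_labels
print(labels)             # left pixel -> 0, right pixel -> 1
```

Under these assumptions, treating background as a regular label (so `num_labels = 2` here) leaves two channels after the slice, and the argmax can return either index.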

github-actions[bot] commented 21 hours ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.