Create panoptic segmentation task guide

NielsRogge commented 5 months ago

Feature request

We already have a semantic segmentation task guide: https://huggingface.co/docs/transformers/tasks/semantic_segmentation. It would be great to create a similar one for panoptic segmentation, covering models like MaskFormer, Mask2Former, and OneFormer.

Motivation

Panoptic segmentation deserves its own task guide because it differs from semantic segmentation: in addition to labeling every pixel with a class, it separates individual object instances.

Lots of people have reported issues, including the following:

[ ] computing mAP during MaskFormer training: https://github.com/NielsRogge/Transformers-Tutorials/issues/373
[ ] evaluation of DETR and friends on a panoptic segmentation dataset: https://github.com/NielsRogge/Transformers-Tutorials/issues/320
[ ] there was an effort to add the panoptic quality (PQ) metric to Evaluate: https://github.com/huggingface/evaluate/pull/408
[ ] various people want to use a COCO-formatted dataset for fine-tuning, but there's no guide yet regarding how to do this: https://github.com/NielsRogge/Transformers-Tutorials/issues/296

Your contribution

The guide can be based on the notebooks provided here: https://github.com/NielsRogge/Transformers-Tutorials/tree/master/MaskFormer

cc @qubvel

amyeroberts commented 4 months ago

@NielsRogge Thanks for creating this issue! I believe adding the feature request tag will prevent the issue from being marked as stale. It also helps to add the vision tag, which makes these issues easier to filter and spot for anyone interested.

A-Duss commented 2 weeks ago

Hey @NielsRogge, I’d love to give it a try if that's fine with you.

qubvel commented 2 weeks ago

Hi @A-Duss, great! 🤗 You can mention me for a PR review or in case any help is needed!

A-Duss commented 2 weeks ago

My understanding is that examples of inference for semantic, instance and panoptic segmentation are already present in the docs (see Types of segmentation). However, a complete task guide (with data preprocessing, finetuning, evaluation and inference) only exists for semantic segmentation.

Given the linked issues, you'd want:

An instance segmentation evaluation code example (ideally with evaluate, using mask mAP in COCO format): https://github.com/NielsRogge/Transformers-Tutorials/issues/373
Panoptic segmentation evaluation code (ideally on COCO): https://github.com/NielsRogge/Transformers-Tutorials/issues/320
For this PR, https://github.com/huggingface/evaluate/pull/408, I assume the idea behind linking it was to use the proposed panoptic quality metric in the example code. But since the PR hasn't been merged yet, I'm not sure how to proceed.
Finally, and this one is pretty clear, a walkthrough of fine-tuning on a COCO-formatted dataset: https://github.com/NielsRogge/Transformers-Tutorials/issues/296

Is my understanding correct? @NielsRogge @qubvel Don't hesitate to tell me exactly what you had in mind, or whether I'm off-track.
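
For reference, here is a rough sketch of what a hand-rolled panoptic quality (PQ) computation could look like while that PR is pending. It is not the implementation from the Evaluate PR or from any library. The function name `panoptic_quality` and the `(category_id, set_of_pixel_indices)` segment format are made up for illustration. It also simplifies things: the score is pooled over all classes instead of being averaged per class, and void/crowd regions are ignored.

```python
def panoptic_quality(pred_segments, gt_segments, iou_threshold=0.5):
    """Toy PQ over one image (hypothetical helper, not a library API).

    Each segment is a (category_id, set_of_pixel_indices) pair.
    Segments within one list are assumed not to overlap, as in panoptic
    segmentation, so an IoU > 0.5 match is unique.
    """
    matched_pred = set()
    true_positives = 0
    iou_sum = 0.0

    for gt_category, gt_pixels in gt_segments:
        for pred_index, (pred_category, pred_pixels) in enumerate(pred_segments):
            # Only unmatched predictions of the same class can match.
            if pred_index in matched_pred or pred_category != gt_category:
                continue
            union = len(gt_pixels | pred_pixels)
            iou = len(gt_pixels & pred_pixels) / union if union else 0.0
            if iou > iou_threshold:
                matched_pred.add(pred_index)
                true_positives += 1
                iou_sum += iou
                break

    false_positives = len(pred_segments) - true_positives
    false_negatives = len(gt_segments) - true_positives
    denominator = true_positives + 0.5 * false_positives + 0.5 * false_negatives
    return iou_sum / denominator if denominator else 0.0


# Tiny example: two ground-truth segments, two predicted segments.
gt = [(1, {0, 1, 2, 3}), (2, {4, 5, 6, 7})]
pred = [(1, {0, 1, 2}), (2, {6, 7, 8, 9})]

# Class 1 matches (IoU 3/4); class 2 does not (IoU 2/6), which leaves
# one false positive and one false negative.
pq = panoptic_quality(pred, gt)
pq_perfect = panoptic_quality(gt, gt)
```

A real guide would probably compute this per class over the whole validation set and handle the COCO panoptic PNG/JSON format, so treat the above only as a way to pin down what the metric measures.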

qubvel commented 2 weeks ago

Sounds good to me! You can also take a look at the instance segmentation examples in the repo for evaluation with Trainer and torchmetrics:

https://github.com/huggingface/transformers/pull/31084
