tue-mps / tapps

[CVPR 2024] Task-aligned Part-aware Panoptic Segmentation through Joint Object-Part Representations
https://tue-mps.github.io/tapps/
MIT License

about TAPPS #1

Closed: Han-SHS closed this issue 3 months ago

Han-SHS commented 3 months ago

Thank you very much for your work!

We want to obtain the types of objects in a picture and the number of objects of each type, which seems to fall within the scope of panoptic segmentation, so we found your work. However, we noticed that the inference part of your code indicates that inference is only supported on three datasets, which confused us. We would like to perform these statistics on the KITTI and NYUv2 datasets. Could you tell us whether your work can fulfill this requirement after setup, and if not, whether other panoptic segmentation methods could achieve our goal?

Looking forward to your reply!

DdeGeus commented 3 months ago

Dear @Han-SHS, thanks for your interest in our work! My apologies for the late response; the past few weeks have been busy.

If you wish to obtain the number and types of objects in a picture, you can indeed use panoptic segmentation. Panoptic segmentation provides the segmentation masks and class labels for all foreground objects and background regions in an image. Additionally, part-aware panoptic segmentation (PPS), the task that TAPPS (our method) solves, provides the segmentation masks and labels for the parts within identified objects. If you also need this property for your application, TAPPS could be very useful for you. If not, you could use panoptic-segmentation-only methods like Mask2Former (https://github.com/facebookresearch/Mask2Former).
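For reference, once you have a panoptic prediction, counting the number and types of objects is simple. Here is a minimal sketch assuming a Detectron2-style output, where `segments_info` is a list of dicts with `category_id` and `isthing` fields (as produced by Mask2Former); the exact field names in your setup may differ:

```python
from collections import Counter

def count_object_types(segments_info, class_names):
    """Count how many objects of each class appear in one panoptic prediction.

    `segments_info`: list of dicts with at least `category_id` and `isthing`,
    in the style of Detectron2 panoptic outputs (assumption).
    `class_names`: mapping from category id to a human-readable name.
    """
    counts = Counter(
        class_names[seg["category_id"]]
        for seg in segments_info
        if seg.get("isthing", False)  # count foreground objects, skip "stuff"
    )
    return dict(counts)

# Toy example: two cars and one person predicted in an image.
segments = [
    {"category_id": 0, "isthing": True},
    {"category_id": 0, "isthing": True},
    {"category_id": 1, "isthing": True},
    {"category_id": 2, "isthing": False},  # e.g. road, a background region
]
print(count_object_types(segments, {0: "car", 1: "person", 2: "road"}))
# -> {'car': 2, 'person': 1}
```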

As for the datasets, we only trained TAPPS on Pascal Panoptic Parts and Cityscapes Panoptic Parts, because these are currently the only two datasets with consistent annotations for the PPS task. We haven't tested how well TAPPS works if we train it on one of these datasets and apply it to KITTI or NYUv2, but this would be interesting to test. Our input pipelines currently don't support KITTI and NYUv2, but it should be relatively easy to adapt them to these datasets, or to write a small inference-only script that runs on arbitrary images, which seems to be what you require; a rough sketch follows below.
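As a starting point, inference on a single image could look roughly like this. This is only a sketch, assuming TAPPS follows the standard Detectron2 setup it inherits from Mask2Former; the config path, checkpoint name, and the project-specific config registration noted in the comments are placeholders and assumptions, not our documented API:

```python
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

# Hypothetical paths: substitute a real TAPPS config and checkpoint.
CONFIG_FILE = "configs/example_tapps_config.yaml"
WEIGHTS = "checkpoints/example_tapps_model.pth"

cfg = get_cfg()
# NOTE (assumption): Mask2Former-based projects register extra config keys,
# so the project's own add_*_config helper(s) would need to be called here
# before merge_from_file, otherwise loading the config will fail.
cfg.merge_from_file(CONFIG_FILE)
cfg.MODEL.WEIGHTS = WEIGHTS

predictor = DefaultPredictor(cfg)

image = cv2.imread("kitti_frame.png")  # any image; no dataset pipeline needed
outputs = predictor(image)

# Detectron2 panoptic heads return (segment id map, per-segment metadata).
panoptic_seg, segments_info = outputs["panoptic_seg"]
print(f"{len(segments_info)} segments predicted")
```

From there, the counting snippet above can be applied directly to `segments_info`.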

Does this answer your questions?

Han-SHS commented 3 months ago

Thank you very much for your reply. It answers exactly what we wanted to know. Thank you again!

DdeGeus commented 3 months ago

No problem! Feel free to reach out again if you have further questions.