dvlab-research / PanopticFCN

Fully Convolutional Networks for Panoptic Segmentation (CVPR2021 Oral)
Apache License 2.0

Compare with SOLOV2 #25

Closed detectiveli closed 3 years ago

detectiveli commented 3 years ago

Dear contributor,

Many thanks for your excellent work.

After reading your paper, I am a little confused about the difference between your main framework and SOLOV2, as the kernel learning part is pretty similar to SOLOV2.

May I ask what mainly accounts for the score improvement (2.2 in Table 11) compared with SOLOV2?

Kind Regards

yanwei-li commented 3 years ago

Hi! In general, Panoptic FCN encodes each object instance or stuff category into a specific kernel weight with the proposed kernel generator and produces the prediction by convolving the high-resolution feature directly. That means it produces the results in a simple generate-kernel-then-segment workflow.
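As a rough illustration of the generate-kernel-then-segment workflow described above, here is a minimal sketch in plain Python. It is not the repository's implementation: the function name `segment_with_kernel`, the toy feature values, and the use of a single 1x1 kernel followed by a sigmoid are assumptions made only for this example.

```python
import math

def segment_with_kernel(feature, kernel):
    """Apply one 1x1 kernel (length C) to a C x H x W feature map.

    Returns an H x W mask of sigmoid probabilities. Hypothetical helper,
    written only to illustrate "one kernel per instance or stuff class".
    """
    channels = len(feature)
    height = len(feature[0])
    width = len(feature[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # A 1x1 convolution is a dot product over channels at each pixel.
            logit = sum(kernel[c] * feature[c][y][x] for c in range(channels))
            row.append(1.0 / (1.0 + math.exp(-logit)))
        mask.append(row)
    return mask

# Toy 2-channel, 2x2 high-resolution feature map (made-up values).
feature = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]

# Two made-up kernels, standing in for two generated instance kernels.
kernel_a = [4.0, -4.0]
kernel_b = [-4.0, 4.0]

mask_a = segment_with_kernel(feature, kernel_a)
mask_b = segment_with_kernel(feature, kernel_b)
```

Each kernel yields its own mask from the same shared feature map, so in this sketch a thing instance and a stuff category would be handled by exactly the same code path.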

The main difference between Panoptic FCN and SOLO V2 is that Panoptic FCN represents Things and Stuff in a unified manner (each instance is encoded into a specific kernel), while SOLO V2 represents Things and Stuff differently (Things by locations and a separate branch for Stuff segmentation). Moreover, SOLO V2 needs Matrix NMS for duplicate removal in post-processing. In Panoptic FCN, we propose to use kernel fusion to aggregate the kernels that belong to the same object or stuff at the kernel level, with no need for pixel-level (or box-level) post-processing.
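To make the idea of kernel-level aggregation more concrete, the following is a simplified sketch of fusing duplicate kernels by averaging. It is an assumption-laden illustration rather than the authors' code: the grouping criterion (greedy grouping by cosine similarity in descending score order), the threshold value 0.9, and the names `cosine` and `fuse_kernels` are all chosen for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two kernel vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def fuse_kernels(kernels, scores, threshold=0.9):
    """Greedily group similar kernels and replace each group by its mean.

    Hypothetical helper: kernels are visited from highest to lowest score,
    and every unused kernel whose cosine similarity to the current one is
    at least `threshold` is averaged into the same fused kernel.
    """
    order = sorted(range(len(kernels)), key=lambda i: scores[i], reverse=True)
    used = set()
    fused = []
    for i in order:
        if i in used:
            continue
        group = [i] + [
            j for j in order
            if j != i and j not in used
            and cosine(kernels[i], kernels[j]) >= threshold
        ]
        used.update(group)
        dim = len(kernels[i])
        fused.append(
            [sum(kernels[j][d] for j in group) / len(group) for d in range(dim)]
        )
    return fused

# Three made-up candidate kernels: the first two point in nearly the same
# direction and are treated as duplicates of one object.
kernels = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
scores = [0.9, 0.8, 0.7]
fused = fuse_kernels(kernels, scores)
```

Because duplicates are merged before any mask is produced, each fused kernel gives a single prediction, which is why no mask-level or box-level suppression step appears in this sketch.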

Hope this could help you.

detectiveli commented 3 years ago

Many thanks for your kind reply; it really helps me.

One further question about the differences: are there any experimental results or mathematical analysis covering the two differences you mentioned above?

yanwei-li commented 3 years ago

Hi, most of them are given in the main paper for your reference. Firstly, the conceptual difference (unified representation) is analyzed in Fig. 2 and throughout the introduction section. We also give experimental comparisons in Table 11. Moreover, to illustrate the effectiveness of kernel fusion, we conduct ablation studies in Table 3 and compare with Matrix NMS (for post-processing) in Table 4.