Closed detectiveli closed 3 years ago
Hi! In general, Panoptic FCN encodes each object instance or stuff category into a specific kernel weight with the proposed kernel generator and produces the prediction by convolving the high-resolution feature directly. That means it produces the results in a simple generate-kernel-then-segment workflow.
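To make the generate-kernel-then-segment idea concrete, here is a minimal NumPy sketch (shapes, the number of kernels, and the 0.5 mask threshold are my own illustrative assumptions, not values from the paper): a 1×1 convolution of the high-resolution feature with the generated kernels is just a matrix product over channels.

```python
import numpy as np

# Hypothetical shapes: C-channel encoding, H x W high-resolution feature map.
C, H, W = 8, 16, 16
rng = np.random.default_rng(0)

feature = rng.standard_normal((C, H, W))   # shared high-resolution feature
kernels = rng.standard_normal((3, C))      # one generated kernel per thing/stuff

# "Segment" step: convolving with 1x1 kernels == matrix product over channels.
logits = kernels @ feature.reshape(C, H * W)        # (3, H*W)
masks = (1.0 / (1.0 + np.exp(-logits)) > 0.5)       # per-kernel binary masks
masks = masks.reshape(3, H, W)
```

Each row of `kernels` directly produces one mask, which is why a single workflow covers both Things and Stuff.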
The main difference between Panoptic FCN and SOLOv2 is that Panoptic FCN represents Things and Stuff in a unified manner (each instance is encoded into a specific kernel), while SOLOv2 represents them differently (Things by locations, with a separate branch for Stuff segmentation). Moreover, SOLOv2 needs Matrix NMS for duplicate removal in post-processing. In Panoptic FCN, we instead propose kernel fusion, which aggregates the kernels belonging to the same object or stuff at the kernel level, with no need for pixel-level (or box-level) post-processing.
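As a rough illustration of kernel-level aggregation, here is a greedy sketch that averages candidate kernels whose cosine similarity exceeds a threshold (the grouping rule and the 0.85 threshold are my simplifying assumptions, not the paper's exact procedure):

```python
import numpy as np

def kernel_fusion(kernels, thresh=0.85):
    """Greedy sketch: fuse candidate kernels whose cosine similarity
    to an unclaimed seed kernel exceeds `thresh` by averaging them.
    Threshold and greedy order are illustrative assumptions."""
    normed = kernels / np.linalg.norm(kernels, axis=1, keepdims=True)
    fused = []
    used = np.zeros(len(kernels), dtype=bool)
    for i in range(len(kernels)):
        if used[i]:
            continue
        sim = normed @ normed[i]            # cosine similarity to kernel i
        group = (sim > thresh) & ~used      # unclaimed near-duplicates
        used |= group
        fused.append(kernels[group].mean(axis=0))
    return np.stack(fused)

# Two near-duplicate kernels and one distinct kernel -> two fused kernels.
ks = np.array([[1.0, 0.0], [0.99, 0.01], [0.0, 1.0]])
print(len(kernel_fusion(ks)))  # 2
```

The point of the design is that duplicates are removed before any masks are produced, so no pixel-level NMS is needed afterwards.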
Hope this helps you.
Many thanks for your kind reply; it really helps me.
And one further question about the differences: are there any experimental results or mathematical analyses supporting the two differences you mentioned above?
Hi, most of them are given in the main paper for your reference. First, the conceptual difference (unified representation) is analyzed in Fig. 2 and throughout the introduction. We also give experimental comparisons in Table 11. Moreover, to illustrate the effectiveness of kernel fusion, we conduct ablation studies in Table 3 and compare it with Matrix NMS (for post-processing) in Table 4.
Dear contributor,
Many thanks for your excellent work.
After reading your paper, I am a little confused about the difference between your main framework and SOLOv2, as the kernel learning part is quite similar to SOLOv2's.
May I ask which part contributes most to the score improvement (2.2 in Table 11) over SOLOv2?
Kind Regards