tianweiy / MVP

MIT License

MaskFormer pretrained on coco-panoptic #23

Open SxJyJay opened 2 years ago

SxJyJay commented 2 years ago

Hi, I noticed that you generate instance masks for KITTI using a MaskFormer model pretrained on COCO panoptic. I am wondering whether the domain gap between COCO and KITTI leads to unsatisfactory instance segmentation performance, since the improvements on KITTI are not as drastic as on nuScenes.

tianweiy commented 2 years ago

Indeed, the COCO model is not very good for KITTI. There are domain and class definition differences (e.g. COCO only has person, but KITTI 3D has pedestrians, cyclists, or a person inside a vehicle). All of these make training a KITTI segmentation model hard.
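To make the class definition mismatch concrete, here is a minimal sketch of a label lookup. The table and the helper names are hypothetical and written only for illustration (they are not from the MVP code), and treating a COCO `bicycle` mask as a possible KITTI `Cyclist` is an assumption.

```python
# Hypothetical lookup, for illustration only; not taken from the MVP code.
# Keys are COCO category names, values are the KITTI 3D classes that a mask
# with that COCO label could plausibly correspond to.
COCO_TO_KITTI_CANDIDATES = {
    "car": ["Car"],
    # A COCO "person" mask may be a pedestrian, a cyclist, or a person
    # inside a vehicle (the last has no separate entry in this sketch).
    "person": ["Pedestrian", "Cyclist"],
    # Assumption: a bicycle mask may belong to a cyclist.
    "bicycle": ["Cyclist"],
}


def kitti_candidates(coco_label):
    """Return the KITTI classes a COCO label could map to (empty if none)."""
    return COCO_TO_KITTI_CANDIDATES.get(coco_label, [])


def is_ambiguous(coco_label):
    """True when one COCO label maps to more than one KITTI class."""
    return len(kitti_candidates(coco_label)) > 1
```

With this table, `is_ambiguous("person")` is true while `is_ambiguous("car")` is false. That is the kind of one-to-many mapping that the COCO labels alone cannot resolve.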

If you are interested in multimodal fusion on KITTI, one similar paper I noticed is https://github.com/LittlePey/SFD, which also creates virtual points but filters them based on 3D box proposals instead of image segmentation as in our paper (the latter does not work very well on KITTI for the reasons above).
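The two filtering strategies mentioned above can be contrasted with a rough sketch. This is not the code of either MVP or SFD. The function names are hypothetical, the mask is a plain nested list, and the 3D box is simplified to an axis-aligned box, whereas real 3D proposals are typically rotated.

```python
import math


def keep_points_in_mask(points_uv, mask):
    """Indices of points whose image projection (u, v) falls on a nonzero
    pixel of a binary 2D instance mask (mask[row][col])."""
    height, width = len(mask), len(mask[0])
    kept = []
    for i, (u, v) in enumerate(points_uv):
        col, row = math.floor(u), math.floor(v)
        if 0 <= row < height and 0 <= col < width and mask[row][col]:
            kept.append(i)
    return kept


def keep_points_in_box(points_xyz, box_min, box_max):
    """Indices of 3D points inside an axis-aligned box.
    Assumption: a simplification of a rotated 3D box proposal."""
    kept = []
    for i, point in enumerate(points_xyz):
        if all(lo <= c <= hi for c, lo, hi in zip(point, box_min, box_max)):
            kept.append(i)
    return kept
```

The first function depends on the quality of the 2D segmentation, which is the weak link on KITTI, while the second only needs a 3D proposal from the detector.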