CrocoV2 performance on object recognition benchmarks.

truncs commented 1 month ago

Even though CrocoV2 wasn't trained on an object-centric dataset, I am curious whether any benchmarking was done on the Cars dataset (which might have some overlap), or whether image segmentation was evaluated on CityScapes or an equivalent.

The paper does mention that 44.7 IoU was obtained on ADE20K. Was this with a linear probe, and were multiple scales used? It would be interesting to know whether performance increases if more data is used.

PhilippeWeinzaepfel commented 1 month ago

Hi,

We evaluated CroCo-Stereo and CroCo-Flow on datasets such as KITTI that contain urban environments, and the results were already pretty strong. Also, in dust3r, the model is fine-tuned on datasets including CO3D, and state-of-the-art results are obtained on such object-centric datasets. So I would say that it is probably better to have such data in the pre-training stage, but having it in the fine-tuning stage might also be enough.

The ADE20K protocol is the same as MultiMAE's. I don't remember the exact protocol, but from what I remember, this is not a linear probe, it is single-scale, and it is on a resized version of the dataset; to be double-checked.

Best,
Philippe
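
For reference, the ADE20K figure discussed above is a segmentation IoU score, which is usually reported as the mean over classes and can be computed from a confusion matrix. The following is a minimal sketch, assuming NumPy and integer label maps; the function name `mean_iou` and the `ignore_index` default of 255 are illustrative assumptions, and this is not the CroCo or MultiMAE evaluation code.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union over the classes that appear in pred or target."""
    pred = np.asarray(pred, dtype=np.int64).ravel()
    target = np.asarray(target, dtype=np.int64).ravel()

    # Drop pixels whose ground-truth label is marked as "ignore".
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # confusion[i, j] = number of pixels with ground truth i predicted as j.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection

    # Average only over classes that occur at least once, to avoid 0/0.
    present = union > 0
    return float((intersection[present] / union[present]).mean())
```

For example, `mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)` averages a per-class IoU of 1/2 for class 0 and 2/3 for class 1. In a single-scale protocol this would be accumulated over predictions made once per image at one resolution, without averaging over rescaled copies.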

truncs commented 1 month ago

Got it! So can I conjecture from this that CroCo would perform competitively if there were enough multi-view object-centric data?

PhilippeWeinzaepfel commented 1 month ago

Yes
