siyuanliii / masa

Official implementation of the CVPR24 highlight paper: Matching Anything by Segmenting Anything
https://matchinganything.github.io
Apache License 2.0
1.01k stars 66 forks

The training results are quite different from the paper #42

Open mrmusad opened 2 weeks ago

mrmusad commented 2 weeks ago

Because I only have one graphics card, I changed the learning rate to 1/8 of the value in the source file, 0.01, and kept the other training parameters unchanged. I trained the MASA-gdino model and tested it on the BDDMOT dataset. The results I got are in the attached image. For training I only used the first 10 tar archives of sa-1b, not the sa-1b-500k dataset used in the paper. Could the differences be due to the dataset? [image]
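For reference, the 1/8 adjustment described above matches the common linear learning-rate scaling heuristic, where the learning rate is scaled in proportion to the total batch size. The snippet below is a minimal sketch of that arithmetic only. The function name `scale_lr`, the 8-GPU reference setup, the 2 samples per GPU, and the base value 0.08 are all hypothetical illustration values, not taken from the MASA configs.

```python
def scale_lr(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Linearly scale a learning rate with the total batch size.

    This is a general heuristic, not the MASA authors' documented recipe.
    """
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * (new_batch_size / base_batch_size)


# Hypothetical numbers: a reference run on 8 GPUs with 2 samples each,
# reproduced on 1 GPU with 2 samples, gives 1/8 of the reference rate.
reference_lr = 0.08          # assumed value, for illustration only
reference_batch = 8 * 2      # assumed: 8 GPUs x 2 samples per GPU
single_gpu_batch = 1 * 2     # assumed: 1 GPU x 2 samples per GPU

single_gpu_lr = scale_lr(reference_lr, reference_batch, single_gpu_batch)
print(f"scaled learning rate: {single_gpu_lr:.4f}")
```

Scaling the rate keeps the step size roughly proportional to the batch size, but it does not make up for the smaller number of samples seen per iteration, so the number of training iterations may also matter when comparing against a multi-GPU run.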

mrmusad commented 2 weeks ago

The paper reports a TETA of 54.5 and an IDF1 of 71.7, both of which differ somewhat from the results I obtained through training.

siyuanliii commented 1 week ago

Thanks for the question. Many factors could lead to the performance gap. Before we dig into the effect of different training images, there are some easier things to check. For example, what hyperparameters did you use for your tracker when testing on BDD100K? Which detections did you use? What is the performance on the TAO dataset? I can help you better if you share more info here.