Open Harry-zzh opened 1 year ago
Hi, could you please provide the learning-rate range and the other hyper-parameter settings used for the zero-shot experiments on the COCO-20i dataset? I am having difficulty reproducing the results shown in the paper. I use ViT-L/16 as the backbone, and my results are 10 points lower than yours.
Hello, I wonder why the zero-shot setting needs training on the COCO dataset. As I understand it, zero-shot means taking the model trained on ADE20K and testing it directly on COCO. I also don't understand why there are so many files with the _zs suffix: they suggest that I need to train the model again, with an architecture that differs from the original model, which does not seem like zero-shot to me.
Do you have any idea about this? Thank you!
Hi, I think what you describe is one form of the zero-shot setting. The lang-seg paper uses a different zero-shot setting, in which the labels used for inference have never been seen during training. For example, the model is trained on the COCO-20i dataset, where the ground-truth categories used for training and for inference are disjoint.
As for the other details of this repository, it has been a long time since I last used it, so I can't remember them clearly.
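To make the setting concrete, here is a minimal sketch of a disjoint class split. It assumes 80 COCO classes divided into 4 folds of 20, and it assigns classes to folds by a simple interleaved rule; that rule, and the helper names `make_folds` and `split_seen_unseen`, are my own illustration and not necessarily how this repository defines the folds.

```python
def make_folds(num_classes=80, num_folds=4):
    """Partition class ids 0..num_classes-1 into disjoint folds.

    The interleaved rule (class c goes to fold c % num_folds) is an
    assumption for illustration; the repository may use another rule.
    """
    return [
        [c for c in range(num_classes) if c % num_folds == fold]
        for fold in range(num_folds)
    ]


def split_seen_unseen(fold, num_classes=80, num_folds=4):
    """Return (seen, unseen) class sets for one held-out fold.

    Classes in the held-out fold are used only at inference time;
    all remaining classes are used for training.
    """
    folds = make_folds(num_classes, num_folds)
    unseen = set(folds[fold])
    seen = set(range(num_classes)) - unseen
    return seen, unseen


seen, unseen = split_seen_unseen(fold=0)
print(len(seen), len(unseen))      # 60 20
print(seen.isdisjoint(unseen))     # True
```

Under this setting the model is still trained on COCO images, but only with the 60 seen labels, and it is evaluated on the 20 held-out labels. That is why a separate training run is needed for each fold.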