facebookresearch / ov-seg

This is the official PyTorch implementation of the paper Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP.

Reproducing the results of baseline w/ original CLIP #20

Closed — zifuwan closed this issue 1 year ago

zifuwan commented 1 year ago

Dear author,

Thanks for the great work! Could you tell me how many epochs are needed to train the baseline w/ original CLIP? I've trained for 2 epochs (10,000 iters), and the loss seems to have converged already. However, there is a gap between my testing results and yours. I tested the weight you provided and got 29.6 on ADE-150, which matches. My own result, though, is only 18.0, while according to your paper it should be 21.8. Could you help me out?

Here are my training results and training logs: [screenshots attached]

Thanks.

Jeff-LiangF commented 1 year ago

Hi, thanks for your questions. I just checked my experiment records. 60k–120k training iters are usually good choices, so your model may be undertrained. Moreover, the performance is also sensitive to CLIP_ENSEMBLE_WEIGHT, so you may want to tune it a bit to achieve the best results.
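For readers wondering why this weight matters: an ensemble weight like CLIP_ENSEMBLE_WEIGHT typically blends the segmentation model's in-vocabulary class scores with CLIP's scores, so shifting it changes which source dominates the final prediction. A minimal NumPy sketch of a geometric ensemble (the function name and exact formula here are illustrative assumptions, not the repo's actual implementation):

```python
import numpy as np

def ensemble_scores(mask_cls_scores, clip_scores, clip_ensemble_weight=0.7):
    """Geometrically blend per-class scores from the mask classifier with
    per-class scores from CLIP. clip_ensemble_weight plays the role of
    CLIP_ENSEMBLE_WEIGHT: 0 trusts only the mask classifier, 1 trusts
    only CLIP. (Illustrative sketch, not the repo's code.)"""
    w = clip_ensemble_weight
    return mask_cls_scores ** (1 - w) * clip_scores ** w

# Example: the two sources disagree on the top class,
# so the chosen weight decides which one wins.
mask_scores = np.array([0.6, 0.3, 0.1])  # mask classifier favors class 0
clip_scores = np.array([0.2, 0.7, 0.1])  # CLIP favors class 1
print(ensemble_scores(mask_scores, clip_scores, 0.5))
```

Because the two score distributions can be miscalibrated relative to each other, the best weight is dataset-dependent, which is why a small sweep over it is worthwhile.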

Thanks! I closed the issue. Feel free to reopen it if you have further questions.