baaivision / EVA

EVA Series: Visual Representation Fantasies from BAAI
MIT License
2.2k stars 162 forks

Different weight decay between code and paper for EVA-CLIP-18B #145

Closed Novestars closed 5 months ago

Novestars commented 5 months ago

In the paper, the weight decay (wd) is 0, whereas in the codebase wd is left at its default value of 0.02.

Quan-Sun commented 5 months ago

@Novestars Appreciate you bringing this to our attention. The wd is set to 0 during the training of both EVA-CLIP-18B and EVA-CLIP-8B models.
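
For readers who want to reproduce the paper setting, the gap between a default and the value actually used can be sketched as below. This is only an illustration under assumptions: the parser, the flag name `--wd`, and the 0.02 default are hypothetical stand-ins taken from the wording of this thread, not a confirmed copy of the repository's argument parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: the "--wd" flag name and the 0.02 default
    # mirror the issue text only, not the actual EVA code.
    parser = argparse.ArgumentParser(description="weight decay sketch")
    parser.add_argument("--wd", type=float, default=0.02, help="weight decay")
    return parser


# No flag given: the default from the parser applies.
default_args = build_parser().parse_args([])

# Explicit override: the value reported for EVA-CLIP-18B and EVA-CLIP-8B.
paper_args = build_parser().parse_args(["--wd", "0"])

print(default_args.wd, paper_args.wd)
```

If the training script follows this pattern, passing the weight decay explicitly rather than relying on the default is the safer way to match the reported configuration.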