Closed DianeBouchacourt closed 1 year ago
Yes, I think you are right, thanks for bringing this up. We definitely should have added `.eval()`. Just pushed an update adding it. Sorry for missing this.
I quickly ran NegCLIP, CLIP, BLIP and XVLM after the change, getting 0.804, 0.590, 0.585 and 0.736 on VG-R respectively.
It seems you do not call `model.eval()` before computing the scores (at least in the notebook example). Indeed, `torch.no_grad()` does not disable modules like Dropout, so the scores are non-deterministic: the `text_encoder` output varies between runs. It might also cause problems for other models, but I haven't checked. Shouldn't we add a call to `model.eval()`?
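To illustrate the issue, here is a minimal sketch with a toy `nn.Sequential` standing in for the actual scoring model (the module name and shapes are made up for the example): `torch.no_grad()` alone leaves Dropout active, while `model.eval()` switches it to inference mode and makes the forward pass deterministic.

```python
import torch
import torch.nn as nn

# Toy stand-in for a scoring model: any module containing Dropout
# (or BatchNorm) behaves stochastically in training mode.
model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5))
x = torch.ones(1, 8)

# torch.no_grad() only disables gradient tracking; Dropout stays
# active here, so repeated forward passes can give different scores.
with torch.no_grad():
    a = model(x)
    b = model(x)

# model.eval() puts Dropout (and BatchNorm) into inference behaviour,
# so the forward pass becomes deterministic.
model.eval()
with torch.no_grad():
    c = model(x)
    d = model(x)

print(torch.equal(c, d))  # True: identical outputs after eval()
```

In evaluation code both are needed: `model.eval()` for deterministic module behaviour, and `torch.no_grad()` to avoid building the autograd graph.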