Deviation in Reproduction of Results

Dear authors, great work proposing zero-shot NLVL, and thank you for making the code publicly available!

However, when I try to reproduce the results on my server, Recall at the higher IoU thresholds (Recall@0.5 and Recall@0.7) comes out roughly 3-5 points below the reported numbers. This is consistent across multiple runs, so I suspect I am missing something. I would really appreciate any suggestions!

To the best of my knowledge, the training conditions match those listed, since I am using the config file provided in the repository as is (except for minor changes to the DATA_PATH field).

For reference, here are my reproduced results alongside those reported in the paper:

| Model | mIoU | Recall@0.3 | Recall@0.5 | Recall@0.7 |
|---|---|---|---|---|
| PSVL (Reproduced) | 29.91 | 46.48 | 26.56 | 11.23 |
| PSVL (Reported) | 31.24 | 46.47 | 31.29 | 14.17 |
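
In case the gap comes from how the metrics are computed, here is a minimal sketch of how I understand mIoU and Recall@m for temporal grounding. It is not taken from the repository: `temporal_iou` and `summarize` are hypothetical helper names, segments are assumed to be `(start, end)` pairs in seconds, and counting a prediction as a hit when IoU >= m (rather than > m) is my assumption.

```python
def temporal_iou(pred, gt):
    """IoU between two (start, end) segments; 0.0 if they do not overlap."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def summarize(preds, gts, thresholds=(0.3, 0.5, 0.7)):
    """Return mIoU and Recall@m (both in percent) over paired segments."""
    ious = [temporal_iou(p, g) for p, g in zip(preds, gts)]
    n = len(ious)
    report = {"mIoU": 100.0 * sum(ious) / n}
    for m in thresholds:
        # Assumption: a prediction counts as correct when IoU >= m.
        report[f"Recall@{m}"] = 100.0 * sum(iou >= m for iou in ious) / n
    return report


if __name__ == "__main__":
    # Toy segments for illustration only, not real model outputs.
    preds = [(0.0, 10.0), (5.0, 10.0), (0.0, 2.0)]
    gts = [(0.0, 10.0), (0.0, 10.0), (8.0, 10.0)]
    print(summarize(preds, gts))
```

If the evaluation in the repository differs from this (for example in the threshold comparison or in how predicted timestamps are clipped), that would be useful to know.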

Thank you in advance!
