illuin-tech / colpali

The code used to train and run inference with the ColPali architecture.
https://huggingface.co/vidore
MIT License

Inconsistent Scores with Example Inference Script #11

Closed: hannah348 closed this issue 1 month ago

hannah348 commented 2 months ago

I am trying to run some evaluations, but with the example script scripts/infer/run_inference_with_python.py I get different scores every time I run it.

My current hypothesis is that the weights of custom_text_proj are randomly initialized, and loading the adapter only adds a delta to those weights in the form of the LoRA matrices. Hence, the projection would be different every time I load the model. Could that be the case, or is something else going on? How do I load or initialize the weights of the projection?
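To make this concrete, here is a minimal sketch of how one could check the hypothesis: load the base model twice (without the adapter) and compare the custom_text_proj weights. The module path and checkpoint name are assumptions taken from the example script, not something confirmed here.

```python
# Sketch: if custom_text_proj is not part of the base checkpoint, it gets
# randomly initialized on every load, so two loads should disagree.
# Import path and checkpoint name are assumed from the example script.
import torch
from colpali_engine.models.paligemma_colbert_architecture import ColPali

m1 = ColPali.from_pretrained("google/paligemma-3b-mix-448", torch_dtype=torch.bfloat16)
m2 = ColPali.from_pretrained("google/paligemma-3b-mix-448", torch_dtype=torch.bfloat16)

# Expected to print False if the projection weights are newly initialized on each load.
print(torch.equal(m1.custom_text_proj.weight, m2.custom_text_proj.weight))
```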

ManuelFay commented 2 months ago

Yes, that's exactly what's happening. I 100% agree it's suboptimal (although it doesn't really hurt performance). I'm going to update the code to export everything, not just the adapters; it won't be as lightweight, but at least it will be reproducible, and I'll push a full checkpoint. It's actually a lot trickier than I thought with the PEFT library, but I'll make it work and post here!
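For reference, one way to export such a full checkpoint with PEFT is to merge the LoRA deltas into the base weights and save the whole model. This is just a rough sketch under assumed names, not necessarily the exact procedure used for the published checkpoint:

```python
# Sketch: merge the LoRA adapter into the base model and export a full checkpoint,
# so the (otherwise randomly initialized) custom_text_proj is saved as well.
# Checkpoint and adapter names here are illustrative assumptions.
import torch
from peft import PeftModel
from colpali_engine.models.paligemma_colbert_architecture import ColPali

base = ColPali.from_pretrained("google/paligemma-3b-mix-448", torch_dtype=torch.bfloat16)
peft_model = PeftModel.from_pretrained(base, "vidore/colpali")  # attach the LoRA adapter
merged = peft_model.merge_and_unload()                          # fold LoRA deltas into the base weights
merged.save_pretrained("./colpali-full")                        # full weights, reload without PEFT
```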

ManuelFay commented 1 month ago

Hey, everything should be deterministic now! It would be awesome if you could confirm using this new model: https://huggingface.co/vidore/colpali-v1.1

and the code on this branch: https://github.com/illuin-tech/colpali/tree/hard-negs (optional, but it should give you better performance and fixes a padding issue).

The base model version is fixed!
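If anyone wants a quick way to verify, here is a minimal determinism check (a sketch; load_colpali is a placeholder for whatever loading code the example script on that branch uses with vidore/colpali-v1.1, not a function from the repo):

```python
# Sketch: load the model twice with identical code and verify every parameter
# matches. With a fully exported / pinned checkpoint this should return True.
import torch

def loads_are_deterministic(load_colpali) -> bool:
    sd1 = load_colpali().state_dict()
    sd2 = load_colpali().state_dict()
    return sd1.keys() == sd2.keys() and all(torch.equal(sd1[k], sd2[k]) for k in sd1)
```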