How did you train SparseEmbed?

raphaelsty / neural-cherche

Neural Search

https://raphaelsty.github.io/neural-cherche/

MIT License


How did you train SparseEmbed? #31

Open richardklafter opened 2 months ago

richardklafter commented 2 months ago

First, awesome project!

How did you train your model at https://huggingface.co/raphaelsty/neural-cherche-sparse-embed? Did you train it from scratch? I found an old copy of your sparsembed library. Was that library used, or this repository? What data did you train on exactly?

I am surveying various sparse embedding models, and SparseEmbed, while interesting, has very little code or documentation beyond the original Google paper. Any assistance would be appreciated. Thanks!

raphaelsty commented 1 month ago

Hi @richardklafter, I used the MS MARCO dataset and trained using neural-search without negative samples. The checkpoint is either a DistilBERT or a co-condenser; I don't remember which. I trained the model on a single GPU on Google Colab and wasn't aiming for extraordinary accuracy. I'm sure it's easy to do better.
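One possible reading of "without negative samples" is that no mined hard negatives were used, and the other documents in each batch served as negatives. That reading is an assumption; the comment above does not confirm it. Under that assumption, the objective can be sketched in plain Python as below. The function names and the toy vectors are hypothetical and are not taken from neural-cherche. The scoring is also simplified to a plain sparse dot product, whereas SparseEmbed scores with contextual embeddings of the activated tokens.

```python
import math


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {token: weight} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(w * b[t] for t, w in a.items() if t in b)


def in_batch_contrastive_loss(queries, documents):
    """Mean cross-entropy where documents[i] is the positive for queries[i]
    and every other document in the batch acts as a negative."""
    losses = []
    for i, q in enumerate(queries):
        scores = [sparse_dot(q, d) for d in documents]
        m = max(scores)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        losses.append(log_z - scores[i])
    return sum(losses) / len(losses)


# Toy batch (made-up weights): each query is paired with the document
# at the same index.
queries = [{"sparse": 1.2, "embed": 0.8}, {"colab": 1.0, "gpu": 0.5}]
documents = [
    {"sparse": 1.0, "embed": 1.0, "model": 0.3},
    {"gpu": 1.5, "colab": 0.7},
]

loss = in_batch_contrastive_loss(queries, documents)
shuffled_loss = in_batch_contrastive_loss(queries, documents[::-1])
```

With correctly paired documents the loss is small; reversing the document order breaks the pairing and the loss increases, which is the signal that this kind of training relies on.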

richardklafter commented 1 month ago

Thanks for letting me know! Feel free to close this.

But if you want to share more specifics, such as a notebook, I would happily run it on an H100 and give you the results. I am curious where this lands, given that I could find very few public implementations.