brando90 closed this issue 2 months ago
Hey! This hasn't been done yet, but there's technically nothing stopping you from using a repurposed decoder as the backbone for a ColBERT model. The important thing would be to train it so that it produces ColBERT-style representations, but it should work just as well for ColBERT as it does for dense embeddings (or better!).
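For concreteness, here is a minimal sketch of what "ColBERT on a decoder backbone" could look like: per-token hidden states from a Hugging Face decoder, a linear projection down to a small ColBERT-style dimension (ColBERT's default is 128), and MaxSim late-interaction scoring. This is not an existing implementation; the model name is a placeholder, the projection is untrained, and the actual training step (e.g., a contrastive loss over query/passage pairs, as in ColBERT) is omitted entirely.

```python
# Hypothetical sketch only: the backbone name is a placeholder, the projection
# is randomly initialized, and contrastive training is not shown.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

backbone_name = "meta-llama/Meta-Llama-3-8B"  # placeholder; any HF decoder
tokenizer = AutoTokenizer.from_pretrained(backbone_name)
if tokenizer.pad_token is None:               # Llama-style tokenizers ship without a pad token
    tokenizer.pad_token = tokenizer.eos_token
backbone = AutoModel.from_pretrained(backbone_name)

# ColBERT projects each token's hidden state down to a small dim (128 by default).
projection = torch.nn.Linear(backbone.config.hidden_size, 128, bias=False)

def encode(texts):
    """Return L2-normalized per-token embeddings plus the attention mask."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = backbone(**batch).last_hidden_state      # [B, T, H]
    emb = F.normalize(projection(hidden), dim=-1)     # [B, T, 128]
    return emb, batch["attention_mask"]

def maxsim(q_emb, q_mask, d_emb, d_mask):
    """ColBERT late interaction: each query token takes its best-matching
    document token; the score is the sum over query tokens."""
    sim = torch.einsum("qh,dh->qd", q_emb, d_emb)     # token-token cosine sims
    sim = sim.masked_fill(d_mask[None, :] == 0, -1e4) # ignore doc padding
    return (sim.max(dim=-1).values * q_mask).sum()

q_emb, q_mask = encode(["what is a group homomorphism?"])
d_emb, d_mask = encode(["A homomorphism is a structure-preserving map between groups."])
score = maxsim(q_emb[0], q_mask[0], d_emb[0], d_mask[0])
```

One caveat worth noting with this setup: a decoder's token embeddings only see left context (causal attention), so unlike a BERT-style encoder, the representation of an early token never sees the rest of the passage; the training stage would have to absorb whatever that costs in practice.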
Cool! @bclavie do you have any advice on how to build good embedding methods for mathematics? And what is the best way to go about training a ColBERT model, in your opinion?
Related: https://github.com/stanfordnlp/dspy/discussions/1428
I was curious: given highly capable open models like the Llama 3 family, and given the need for specialized models for mathematics (e.g., DeepSeekMath) or formal mathematics (Lean 4, Coq, Isabelle), which likely need fine-tuning before they produce good embeddings in the first place -- how can I use decoder models here?
And actually, is it even advisable?
Reference: