NohTow closed this 1 month ago
Now that I'm on the branch, I still get slightly different results (though the embeddings are close); it might be related to precision.
The mixed-precision manager introduced the minor diff; disabling it gives identical results. LGTM!
The outputs are equivalent to RAGatouille's `encode_index_free_queries` and `encode_index_free_documents`, so I think we are good.
You can also pass `model_kwargs={"torch_dtype": torch.float16}` to handle mixed precision in PyLate.
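For instance, a minimal sketch of loading the model in half precision (assuming `model_kwargs` is forwarded to the underlying Hugging Face loader, as in Sentence Transformers; actual dtype support depends on your hardware):

```python
import torch
from pylate import models

# Load the weights in float16 to mirror the mixed-precision setup.
# model_kwargs is forwarded to the underlying transformers loader.
model = models.ColBERT(
    model_name_or_path="jinaai/jina-colbert-v2",
    model_kwargs={"torch_dtype": torch.float16},
    trust_remote_code=True,
)
```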
Quick follow-up on my end (just to keep things transparent): I opened a PR with sample usage in the jina-colbert repo: https://huggingface.co/jinaai/jina-colbert-v2/discussions/8
@NohTow Would it be possible to update the models documentation and add a note on how to load the Jina model?
Otherwise everything looks good to me; great to see we support the Jina model!
```python
from pylate import models

model = models.ColBERT(
    model_name_or_path="jinaai/jina-colbert-v2",
    query_prefix="[QueryMarker]",
    document_prefix="[DocumentMarker]",
    attend_to_expansion_tokens=True,
    trust_remote_code=True,
)
```
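Once loaded, encoding follows the usual PyLate pattern. A sketch, assuming the standard `encode` signature with the `is_query` flag (which selects the query or document marker):

```python
# The is_query flag controls whether the query or document
# prefix is prepended before encoding.
queries_embeddings = model.encode(["what is colbert?"], is_query=True)
documents_embeddings = model.encode(
    ["ColBERT is a late-interaction retrieval model."], is_query=False
)
```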
I added a tip saying that we handle stanford-nlp models and added documentation for the Jina model (and added it to the BEIR tab as well).
After discussions with @bwanglzu, I realized the outputs of jina-colbert-v2 were not identical to the ones obtained using stanford-nlp.
The problem was two-fold:

1. The prefixes defaulted to `[unused0]` and `[unused1]`, whereas Jina models actually use `[QueryMarker]` and `[DocumentMarker]`. As this parameter is not directly readable from the repositories, my proposed solution is to let the user define the prefixes when loading the model. The PR already had the ability to set the prefixes for stanford-nlp models, only defaulting to the unused tokens if not set. It still defaults to `[Q]` and `[D]` if the prefixes are not set and the model is not a stanford repo.
2. Jina models require setting `attend_to_expansion_tokens` to `True`. I do not have a way to read this from the repository either. These parameters are stored in the PyLate configuration when saving the model, though.

Thus, the loading of jina-colbert-v2 looks like this:
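(Repeating the loading snippet shown earlier in the thread, with both parameters set explicitly:)

```python
from pylate import models

# Jina models use custom marker tokens and attend to expansion tokens,
# neither of which can be inferred from the repository, so both are
# passed explicitly at load time.
model = models.ColBERT(
    model_name_or_path="jinaai/jina-colbert-v2",
    query_prefix="[QueryMarker]",
    document_prefix="[DocumentMarker]",
    attend_to_expansion_tokens=True,
    trust_remote_code=True,
)
```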