markstur closed 2 months ago
Thanks for the reviews!
I forgot to mention one thing about removing the ipex (etc.) code: I had been keeping part of it to get MPS support as well, but I've found that the CrossEncoder default already handles MPS and CUDA devices.
This module is closely related to EmbeddingModule.
Cross-encoder models take Q and A pairs and are trained to return a relevance score for rank(). The existing rerank APIs in EmbeddingModule had to encode Q and A separately and use cosine similarity as the score. So the API is the same, but the results are supposed to be better (and slower to compute).
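To illustrate the difference in scoring, here is a rough sketch (not the code in this PR). The function names, the `encode` and `score_pairs` callables, and the toy stand-ins are all hypothetical; in a real setup `score_pairs` would be backed by something like a cross-encoder model's pair-scoring call, and `encode` by an embedding model.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Ranked = List[Tuple[int, float]]  # (document index, score), best first


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank_bi_encoder(
    query: str, documents: List[str], encode: Callable[[str], Vector]
) -> Ranked:
    """Embedding-style rerank: encode Q and A separately, compare with cosine."""
    q_vec = encode(query)
    scored = [(i, cosine(q_vec, encode(doc))) for i, doc in enumerate(documents)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def rerank_cross_encoder(
    query: str,
    documents: List[str],
    score_pairs: Callable[[List[Tuple[str, str]]], List[float]],
) -> Ranked:
    """Cross-encoder-style rerank: the model sees each (Q, A) pair together."""
    pairs = [(query, doc) for doc in documents]
    scores = score_pairs(pairs)
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


# --- Toy stand-ins (assumptions for illustration only, not real models) ---
VOCAB = ["cat", "dog", "fish"]


def toy_encode(text: str) -> Vector:
    """Hypothetical encoder: word counts over a tiny fixed vocabulary."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def toy_score_pairs(pairs: List[Tuple[str, str]]) -> List[float]:
    """Hypothetical pair scorer: fraction of query words found in the document."""
    scores = []
    for query, doc in pairs:
        q_words = query.lower().split()
        d_words = set(doc.lower().split())
        hits = sum(1 for w in q_words if w in d_words)
        scores.append(hits / len(q_words) if q_words else 0.0)
    return scores
```

Both functions return the same shape, which mirrors the point above that the API stays the same. The cross-encoder path has to run the model once per (Q, A) pair and cannot reuse precomputed document embeddings, which is why it is expected to be slower.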
Cross-encoder models do not support returning embedding vectors or sentence-similarity.
Support for the existing tokenization and model_info endpoints was also added.