UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT
https://www.SBERT.net
Apache License 2.0

Using mistralai/Mistral-7B-v0.1 as a pre-trained model #2320

Closed · charlesdedampierre closed this issue 7 months ago

charlesdedampierre commented 10 months ago

Hi,

Is it possible to fine-tune a model like mistralai/Mistral-7B-v0.1 (https://huggingface.co/mistralai/Mistral-7B-v0.1) on a sentence-similarity task? Any resources?

Thanks!

Charles

tomaarsen commented 9 months ago

Hello!

This would be quite unusual. Mistral-7B is a great model, but it's a decoder (i.e. a model that produces text), not an encoder (i.e. a model that produces embeddings). A more common scenario would be to take a Sentence Transformer (encoder) model trained on sentence similarity, use it with an embedding store or vector database to select texts relevant to some query, and then feed those texts to the Mistral decoder. This is a standard Retrieval Augmented Generation setup that can be implemented with various open source projects; a rough sketch of the idea follows below.
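As a minimal sketch of that pipeline (not an official example): it assumes sentence-transformers/all-MiniLM-L6-v2 as the encoder and mistralai/Mistral-7B-v0.1 as the decoder, an in-memory corpus in place of a real vector database, and the sentence-transformers and transformers libraries being installed.

```python
from sentence_transformers import SentenceTransformer, util
from transformers import AutoModelForCausalLM, AutoTokenizer

# 1. Encode a small corpus with a Sentence Transformer (encoder) model.
#    In practice the embeddings would live in a vector database.
encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
corpus = [
    "Mistral-7B is a 7-billion-parameter decoder-only language model.",
    "Sentence Transformers produce dense embeddings for sentences.",
    "Paris is the capital of France.",
]
corpus_embeddings = encoder.encode(corpus, convert_to_tensor=True)

# 2. Embed the query and retrieve the most similar corpus texts.
query = "What kind of model is Mistral-7B?"
query_embedding = encoder.encode(query, convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
context = "\n".join(corpus[hit["corpus_id"]] for hit in hits)

# 3. Feed the retrieved context plus the query to the Mistral decoder.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
decoder = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", device_map="auto"
)
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(decoder.device)
outputs = decoder.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Here the encoder handles the similarity search and the decoder only generates text conditioned on the retrieved context, which is the division of labour described above.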