Setting up SentenceTransformerEmbeddingModel for GPU use.

Future-House / paper-qa

High accuracy RAG for answering questions from scientific documents with citations

Apache License 2.0


Setting up SentenceTransformerEmbeddingModel for GPU use. #641

Open Snikch63200 opened 4 weeks ago

Snikch63200 commented 4 weeks ago

Hello,

I finally managed to use a local embedding model (mxbai-embed-large-v1) with the new SentenceTransformerEmbeddingModel class (thanks to the developer team for this work!!! ;-)).

sparse_embedding_model = SparseEmbeddingModel(ndim=4096)
sentence_transformer = SentenceTransformerEmbeddingModel(
    name="mixedbread-ai/mxbai-embed-large-v1",
)
hybrid_model = HybridEmbeddingModel(models=[sparse_embedding_model, sentence_transformer])

It works fine, but it's very slow because it doesn't use the GPU. Is there a way to force the model to use the GPU (through a config option, I guess)?

Thanks for your help.

@Dosu.

dosubot[bot] commented 4 weeks ago

To configure the SentenceTransformerEmbeddingModel to use the GPU for the mxbai-embed-large-v1 model, you need to set the device parameter in the config dictionary to "cuda". Here's how you can modify your code:

sparse_embedding_model = SparseEmbeddingModel(ndim=4096)
sentence_transformer = SentenceTransformerEmbeddingModel(
    name="mixedbread-ai/mxbai-embed-large-v1",
    config={"device": "cuda"}  # Set the device to GPU
)
hybrid_model = HybridEmbeddingModel(models=[sparse_embedding_model, sentence_transformer])

This configuration will ensure that the model uses the GPU for computations ^[1].
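If the same script also has to run on machines without a GPU, hard-coding "cuda" may fail there. A minimal sketch of one way to choose the device at runtime follows. It assumes PyTorch is the backend (it is what sentence-transformers runs on), and pick_device is a hypothetical helper name, not part of paper-qa.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu".

    Hypothetical helper for illustration; not part of paper-qa.
    """
    try:
        import torch  # installed as a dependency of sentence-transformers
    except ImportError:
        # Without PyTorch there is no GPU backend to use.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


# Build the config dict that is passed as config=... in the snippet above.
config = {"device": pick_device()}
print(config)
```

The resulting dict can replace the literal {"device": "cuda"} in the constructor call, so the model falls back to the CPU when no GPU is detected.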


Snikch63200 commented 4 weeks ago

Works fine !! Thanks.

GPU is MUCH faster for this task !

Where can I find a complete list of the available options for the 'config' dict (config={"device": "cuda"})?

@Dosu.

dosubot[bot] commented 4 weeks ago

The available configuration options for the SentenceTransformerEmbeddingModel in the codebase are:

batch_size: Specifies the batch size for processing the texts. The default value is 32.
device: Specifies the device on which the model should run, such as "cpu" or "cuda". The default value is "cpu" ^[1]^[2].
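As a rough illustration of how these two options and their defaults combine, the sketch below merges a user-supplied config dict over the defaults listed above. DEFAULTS and resolve_config are hypothetical names used only for this example and are not part of paper-qa. Only the two keys and their default values are taken from the list above.

```python
from typing import Any, Optional

# Defaults as described in the list above (assumed, not read from paper-qa).
DEFAULTS: dict[str, Any] = {"batch_size": 32, "device": "cpu"}


def resolve_config(config: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Return the effective options: user values override the defaults."""
    merged = dict(DEFAULTS)       # copy so DEFAULTS is never mutated
    merged.update(config or {})   # apply any user-supplied overrides
    return merged


# Only the device is overridden, so batch_size keeps its default.
print(resolve_config({"device": "cuda"}))
# Both options set explicitly.
print(resolve_config({"device": "cuda", "batch_size": 64}))
```

In practice this means a dict such as {"device": "cuda", "batch_size": 64} can be passed as config=... to set both options at once, while any key left out keeps its default.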
