-
### System Info
Linux k8s-node2 6.5.0-41-generic #41~22.04.2-Ubuntu SMP PREEMPT_DYNAMIC Mon Jun 3 11:32:55 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c…
-
I have noticed that Alpaca uses my CPU instead of my GPU. Here's a screenshot showing how it's using almost 40% of my CPU, and only 1% of my GPU.
![Captura desde 2024-07-10 06-51-39](https://github…
-
I tried using the OpenAI base URL locally (via curl) and it worked. This is the log:
```info
raycast | 2024-07-14 11:10:11,112 MainThread main.py :72 INFO : Received request to /api/v1/me
…
```
-
### What happened?
I'm setting up LibreChat on a remote host with nginx. Registration and the interface are working fine. I've followed the tutorial https://www.librechat.ai/docs/remote/nginx and I'm do…
-
For each model, how is the value (`cls`, `mean`, `last_token`) of `pooling_type` determined?
```python
embeddings_model_spec = {
}
embeddings_model_spec['E5-mistral-7b']={'model_name':'intfloat/…
```
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a sim…
-
### Anything you want to discuss about vllm.
We plan to fine-tune a 70B model that supports long context (800k). Can vLLM run inference for this model?
-
I've downloaded the latest code and run the indexing. The *_final*.parquet files are not being created in the output/artifacts directory.
I ran GraphRAG from the command line using the Microsoft git re…
-
### Model description
It would be awesome if TEI supported SFR-Embedding-Mistral, which ranks at the top of the MTEB leaderboard: https://huggingface.co/Salesforce/SFR-Embedding-Mistral
### Open source…
-
I wonder if it would make sense to support compressed requests, esp. for /rerank, where the query and document list could be many 1k or 2k chunks of text? The incoming request could easily exceed 20 …
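To make the idea concrete, here is a minimal client-side sketch of what a compressed `/rerank` request body could look like. It is only an illustration: the `query`/`texts` field names, the helper `build_compressed_rerank_body`, and the use of a `Content-Encoding: gzip` header are assumptions, and whether the server would accept such a body is exactly what this issue is asking.

```python
import gzip
import json


def build_compressed_rerank_body(query, texts):
    """Hypothetical helper: serialize a rerank payload and gzip it.

    The JSON field names ("query", "texts") are assumed for illustration.
    """
    raw = json.dumps({"query": query, "texts": texts}).encode("utf-8")
    compressed = gzip.compress(raw)
    # Headers a client would send if the server supported gzip request bodies
    # (an assumption, not a confirmed feature).
    headers = {
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",
    }
    return raw, compressed, headers


# Synthetic workload: 20 documents of roughly 2k characters each.
chunk = "Deep learning models map text to dense vectors. " * 40
raw, body, headers = build_compressed_rerank_body(
    "what is deep learning?", [chunk] * 20
)

print(f"uncompressed: {len(raw)} bytes, gzip: {len(body)} bytes")
```

The synthetic text above is highly repetitive, so it compresses far better than real documents would; the ratio on actual chunks would need to be measured.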