-
I only modified t6 instead of t4, t4 t5 both work well for this model,but if we set the thread=6,will always trigger the problem on my XIAOMI14Pro(SM8650 8Gen3)
please check it for resolve
thanks~
…
-
开发机:ubuntu 20.04 mnn 3.0.0
模型 huggingface:Qwen2.5-0.5B-Instruct 和 Qwen2.5-0.5B-Instruct-GPTQ-Int8
## 导出 onnx 模型
$ python mnn/transformers/llm/export/llmexport.py --path pretrained_model/Qwen2.5…
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
I don't use huggingface because of proxy issue. So, I tried to make local embeddin…
-
### System Info
ubuntu 22.04
torch 2.5.0
cuda 12.4
running on a single gpu with CUDA_VISIBLE_DEVICES=1
![image](https://github.com/user-attachments/assets/30134067-427a-4421-94d1-8d958ec628f5)
…
-
no_gt retrieval metrics needs large amount of LLM processing.
So, use local LLM model to compute it.
+ ragas context precision need so much LLM calls. So, try to use tonic validate instead.
-
### Describe the bug
Function __post_carryover_processing(chat_info: Dict[str, Any]) of chat.py in agentchat folder throw the above exception when running Google Gemini.
The cause of the problem w…
-
Hi!
I am planning to test pretraining OLMo 1B model on the slim pajama dataset. I was trying to follow the tutorial for tinyllama but one of the steps for preparing the dataset uses the `litgpt/data/…
-
**Title:** Automatically label medical data from diagnosis reports
**Project Lead:** Frank Langbein, frank@langbein.org
**Description:** We wish to automatically label medical diagnosis data (MRI,…
-
### 🚀 The feature, motivation and pitch
Is the deepseek-v2 AWQ version supported now? When I run it, I get the following error:
```
[rank0]: File "/usr/local/lib/python3.9/dist-packages/vllm/mo…
-
can ollama URL be configured to point to remote box?
or try use ssh tunnel to make remote ollama appear to be local