-
I don't understand the point of setting chat_llm to ollama if there is no provision for setting utility_llm and/or embedding_llm to their local (ollama) counterparts. Yes, I assume that prompting will be a challenge…
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
I don't use Hugging Face because of a proxy issue, so I tried to make local embeddin…
-
### Describe the bug
When providing an assistant ID for GPTAssistantAgent, the code path at line 117 always has a None value for the variables `instructions` and `specified_tools`. This is because the…
-
Can the ollama URL be configured to point to a remote box?
Or try using an SSH tunnel to make the remote ollama appear to be local.
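
Both options can be sketched roughly as below. This is only an illustration, not how the project actually reads its settings: the `OLLAMA_BASE_URL` variable name, the helper functions, and the `user@gpu-box` host are all made up for the example. The only fixed fact used here is that Ollama listens on port 11434 by default.

```python
import os
import shlex

# Ollama's default listen address when it runs on the same machine.
DEFAULT_OLLAMA_URL = "http://localhost:11434"


def resolve_ollama_url(env=None):
    """Return the Ollama base URL, preferring a hypothetical OLLAMA_BASE_URL setting."""
    env = os.environ if env is None else env
    url = env.get("OLLAMA_BASE_URL", DEFAULT_OLLAMA_URL).strip()
    if "://" not in url:
        # Allow a bare "host:port" value and assume plain HTTP.
        url = "http://" + url
    return url.rstrip("/")


def ssh_tunnel_command(remote, local_port=11434, remote_port=11434):
    """Build an ssh command that forwards a local port to Ollama on the remote box."""
    # -N: run no remote command, -L: local port forwarding.
    return ["ssh", "-N", "-L", f"{local_port}:localhost:{remote_port}", remote]


# Option 1: point the client directly at the remote box.
print(resolve_ollama_url({"OLLAMA_BASE_URL": "gpu-box:11434"}))

# Option 2: keep the default local URL and run this tunnel in another terminal.
print(shlex.join(ssh_tunnel_command("user@gpu-box")))
```

With the tunnel running, anything that talks to `http://localhost:11434` reaches the remote ollama without further configuration, which may be the simpler route if the URL turns out not to be configurable.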
-
Development machine: Ubuntu 20.04, MNN 3.0.0
Models (Hugging Face): Qwen2.5-0.5B-Instruct and Qwen2.5-0.5B-Instruct-GPTQ-Int8
## Export the ONNX model
$ python mnn/transformers/llm/export/llmexport.py --path pretrained_model/Qwen2.5…
-
### System Info
ubuntu 22.04
torch 2.5.0
cuda 12.4
running on a single gpu with CUDA_VISIBLE_DEVICES=1
![image](https://github.com/user-attachments/assets/30134067-427a-4421-94d1-8d958ec628f5)
…
-
### System Info
I am experimenting with TRT LLM and `flan-t5` models. My simple goal is to build engines with different configurations and tensor parallelism, then review performance. Have a DGX syst…
-
### 🚀 The feature, motivation and pitch
Is the deepseek-v2 AWQ version supported now? When I run it, I get the following error:
```
[rank0]: File "/usr/local/lib/python3.9/dist-packages/vllm/mo…
```
-
### Describe the bug
The function `__post_carryover_processing(chat_info: Dict[str, Any])` in chat.py, in the agentchat folder, throws the above exception when running Google Gemini.
The cause of the problem w…
-
Hi!
I am planning to test pretraining the OLMo 1B model on the SlimPajama dataset. I was trying to follow the tutorial for TinyLlama, but one of the steps for preparing the dataset uses the `litgpt/data/…