-
### System Info
I am experimenting with TRT-LLM and `flan-t5` models. My goal is simple: build engines with different configurations and tensor-parallelism settings, then compare their performance. Have a DGX syst…
-
Thanks for sharing! I'm hitting the following error and would appreciate your help:
Traceback (most recent call last):
File "/root/miniconda3/envs/test/lib/python3.10/site-packages/transformers/configuration_utils.py", line 675, in _get_config_dict
resolved_…
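The traceback above is cut off inside `transformers`' config resolution (`_get_config_dict`), which typically fails when the given path or repo id cannot be resolved to a `config.json`. As a hedged sketch (the helper name and the toy checkpoint directory below are illustrative, not from the original report), one quick local sanity check is:

```python
import json
import os
import tempfile

def has_resolvable_config(model_path: str) -> bool:
    """True if model_path is a local directory containing the config.json
    that transformers' _get_config_dict would resolve; remote Hub repo ids
    are not covered by this check."""
    return os.path.isdir(model_path) and os.path.isfile(
        os.path.join(model_path, "config.json")
    )

with tempfile.TemporaryDirectory() as ckpt:
    print(has_resolvable_config(ckpt))   # False: no config.json yet
    with open(os.path.join(ckpt, "config.json"), "w") as f:
        json.dump({"model_type": "t5"}, f)  # minimal stand-in config
    print(has_resolvable_config(ckpt))   # True: config.json present
```

If the check passes for a local path but loading still fails, the error is more likely in the config contents or the `transformers` version than in path resolution.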
-
### System Info
TEI Image v1.4.0
AWS Sagemaker Deployment
1 x ml.g5.xlarge instance, asynchronous deployment
Link to prior discussion: https://discuss.huggingface.co/t/async-tei-deployment-c…
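For context on the asynchronous setup above, here is a hedged sketch of how such an endpoint is usually invoked with `boto3` (the endpoint name and S3 URI are placeholders, not from the original report). Unlike synchronous invocation, `invoke_endpoint_async` takes an S3 `InputLocation` instead of an inline payload:

```python
# Hedged sketch: invoking a SageMaker asynchronous endpoint.
# The endpoint name and S3 URI below are placeholders.

def build_async_invoke_params(endpoint_name: str, input_s3_uri: str) -> dict:
    """Assemble keyword arguments for sagemaker-runtime's
    invoke_endpoint_async; the request payload must already be
    uploaded to S3, since async endpoints read from InputLocation."""
    return {
        "EndpointName": endpoint_name,
        "InputLocation": input_s3_uri,
        "ContentType": "application/json",
    }

params = build_async_invoke_params(
    "tei-async-endpoint",              # placeholder endpoint name
    "s3://my-bucket/inputs/req.json",  # placeholder S3 input object
)
print(sorted(params))  # ['ContentType', 'EndpointName', 'InputLocation']

# With AWS credentials configured, the actual call would be:
# import boto3
# client = boto3.client("sagemaker-runtime")
# response = client.invoke_endpoint_async(**params)
# print(response["OutputLocation"])  # S3 URI where the result will land
```

The response arrives later at `OutputLocation` in S3, which is why async deployments need polling or an SNS notification rather than reading the HTTP response body.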
-
I'd like to run Live LLaVA completely locally on a Jetson, including the web browser.
However, if I turn off Wi-Fi before starting Live LLaVA, the video won't play in the browser.
If I turn off Wi-Fi after…
-
**Describe the bug**
Hi, all. I'm working on a blog article, following a mix of local documentation and the Intelligent App Workshop, but instead of going with Falcon, I've gone with the Mistral 7B model, and at …
-
### Describe the bug
The function `__post_carryover_processing(chat_info: Dict[str, Any])` in `chat.py` in the agentchat folder throws the above exception when running Google Gemini.
The cause of the problem w…
-
### System Info
ubuntu 20.04
tensorrt 10.0.1
tensorrt-cu12 10.0.1
tensorrt-cu12-bindings 10.0.1
tensorrt-cu12-libs 10.0.1
tensorrt-llm 0.10.…
-
### System Info
ubuntu 22.04
torch 2.5.0
cuda 12.4
running on a single gpu with CUDA_VISIBLE_DEVICES=1
![image](https://github.com/user-attachments/assets/30134067-427a-4421-94d1-8d958ec628f5)
…
-
Dev machine: ubuntu 20.04, mnn 3.0.0
Models (Hugging Face): Qwen2.5-0.5B-Instruct and Qwen2.5-0.5B-Instruct-GPTQ-Int8
## Exporting the ONNX model
$ python mnn/transformers/llm/export/llmexport.py --path pretrained_model/Qwen2.5…
-
This is a really great local LLM backend that works on a lot of platforms
(including Intel Macs) and is basically a one-click install.
**Main site:** https://ollama.ai/
**API docs:** https://githu…
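Since the API docs link above is truncated, here is a hedged sketch of Ollama's generate endpoint as documented upstream (the model name is a placeholder; any locally pulled model works). Setting `stream` to false requests a single JSON response instead of a stream of chunks:

```python
import json

# Hedged sketch: building a request body for Ollama's local REST API
# (POST http://localhost:11434/api/generate). "llama2" is a placeholder.

def build_generate_request(model: str, prompt: str) -> bytes:
    """Build the JSON body for Ollama's /api/generate endpoint;
    stream=False asks for one complete JSON response."""
    return json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")

body = build_generate_request("llama2", "Why is the sky blue?")
print(json.loads(body)["model"])  # llama2

# With an Ollama server running locally, sending the request would look like:
# from urllib.request import Request, urlopen
# req = Request("http://localhost:11434/api/generate", data=body,
#               headers={"Content-Type": "application/json"})
# with urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

The same server also exposes an OpenAI-style chat endpoint, so most existing client libraries can be pointed at it with only a base-URL change.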