-
Hi. I ran the command:
```bash
export HF_PATH="mistralai/Mixtral-8x7B-Instruct-v0.1"
scripts/huggingface_example.sh --type llama --model $HF_PATH --quant int4_awq --tp 4
```
On a node with 8 …
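As a hedged aside (the head count below is an assumption taken from the public Mixtral-8x7B config, not from the original report): `--tp 4` requires the attention head count to divide evenly across the tensor-parallel ranks, which a quick sanity check can confirm:

```python
# Assumed values from the public Mixtral-8x7B-Instruct-v0.1 config (not verified here)
num_attention_heads = 32
tp = 4  # tensor-parallel size passed via --tp

# Each tensor-parallel rank must receive an equal slice of the heads
assert num_attention_heads % tp == 0
heads_per_rank = num_attention_heads // tp
print(heads_per_rank)  # 8
```

If the assertion fails for a given model/`--tp` pair, the sharding step in most tensor-parallel runtimes will reject the configuration before loading weights.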
-
I'm running the Inspect log viewer locally with `inspect view`; it works well on `localhost:7575`, but I see strange failures when I expose it with ngrok (`ngrok http 7575`):
```
GET https:…
-
Question regarding the QMM, QMV, and QVM kernels. When using the Metal Debugger on them, I noticed that no matter the data type used for the operation itself (e.g. float16), all the simdgroup_matrix operati…
-
### Type of problem
Model inference and deployment
### Operating system
Linux
### Detailed description of the problem
Using the latest code from the main branch; deployed with Docker and installed the dependencies.
Running inference with Qwen-1…
-
I'm running a RunPod serverless vLLM template with Llama 3 70B on a 40 GB GPU. One of the requests failed and I'm not completely sure what happened, but the message asked me to open a GitHub issue, so I'll …
-
### Software environment
```Markdown
- paddlepaddle: 0.0.0
- paddlepaddle-gpu: 0.0.0.post118
- paddlenlp: 2.6.1
```
### Duplicate issues
- [X] I have searched the existing issues
### Error description
```Markdown
PredictorArgument(mo…
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
### Your current environment
Collecting environment information...
WARNING 04-02 01:12:23 ray_utils.py:70] Failed to import Ray with ModuleNotFoundError("No module named 'ray'"). For distributed inf…
-
Running Qwen-7B-Chat with ipex-llm + DeepSpeed fails with the following error:
[0] RuntimeError: shape '[1, 1024, 16, 128]' is invalid for input of size 4194304
accelerate 0.29.2
mpi4py …
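As a hedged aside (not from the original report): the element counts alone explain the message. The target shape accounts for exactly half of the input's elements, which usually points at a head-count or head-dimension mismatch in an attention reshape:

```python
# The reshape target from the RuntimeError message
target_shape = (1, 1024, 16, 128)

# Number of elements that target shape can hold
target_numel = 1
for d in target_shape:
    target_numel *= d

actual_numel = 4194304  # input size reported in the RuntimeError

print(target_numel)                  # 2097152
print(actual_numel // target_numel)  # 2: the input is exactly twice as large
```

A factor of exactly 2 often means the code assumed `head_dim = 128` where the tensor was laid out with 256, or assumed 16 heads where there were 32; checking the model config against the reshape constants is a reasonable first step.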
-
## 🐛 Bug
When exporting a model with `torch.onnx.export`, I receive the following error:
```
File "/venv/lib/python3.8/site-packages/torch/onnx/utils.py", line 709, in _export
proto, export_ma…