-
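### Have you read and agreed to the Datawhale Open Source Project Guide?
- [X] I have read and agree to the [Datawhale Open Source Project Guide](https://github.com/datawhalechina/DOPMC/blob/main/GUIDE.md)
### Have you read and agreed to the Datawhale Open Source Project Code of Conduct?
- [X] I have read and agree to the [Datawhale Open Source Project Code of…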
-
### Your current environment
...
### How would you like to use vllm
I have downloaded a model. Now, on my 4-GPU instance, I am attempting to quantize it using AutoAWQ.
Whenever I run the script below I ge…
-
Detailed error message
Traceback (most recent call last):
  File "/mnt/d/ai/swift/swift/cli/export.py", line 5, in <module>
    export_main()
  File "/mnt/d/ai/swift/swift/utils/run_utils.py", line 32, in x_main
    res…
-
### Description
We realized that the function `xASL_quant_AgeSex2Hct` is not implemented anywhere in ExploreASL:
![Screenshot 2024-09-27 at 15 21 19](https://github.com/user-attachments/assets/c345a…
-
## Minimum Reproducible Example
## Steps to Reproduce:
1. Run the Docker build command:
```bash
docker build -f Dockerfile.cuda-all . -t mistral
```
2. Observe the error during th…
-
There is a useless DQ node in matmul_model_quant_io.onnx:
![useless_dq_node](https://github.com/user-attachments/assets/2ef0506f-c8c0-4f8c-a600-db621643e51f)
I also have some questions:
1. The model …
-
### System Info
ubuntu 20.04
tensorrt 10.0.1
tensorrt-cu12 10.0.1
tensorrt-cu12-bindings 10.0.1
tensorrt-cu12-libs 10.0.1
tensorrt-llm …
-
Hey team, AO provides awesome FP8 support with `torch.compile` for speed and memory improvements. However, since `torch.compile` is not always easily applicable to some models, such as [MoE HF implement…
-
### 🚀 The feature, motivation and pitch
Originally discussed here with @drisspg:
- https://github.com/pytorch/pytorch/pull/137526#issuecomment-2401115408
This would be good for exercising Flex…