model-quantization Search Results

1000+ results
for model-quantization


vllm-project/vllm #8402

[Bug]: CUDA device detection issue with KubeRay distributed …

### Your current environment The output of `python collect_env.py` ```text Collecting environment information... PyTorch version: 2.3.0+cu121 Is debug build: False CUDA used to build PyTorch…

jradikk updated 4 days ago
6
ggerganov/llama.cpp #6444

Support QuaRot quantization scheme

A new, interesting quantization scheme was published, which not only reduces memory consumption (like current quantization schemes), but also reduces computations. > **[QuaRot: Outlier-Free 4-Bit In…

EwoutH updated 3 days ago
13
NVIDIA/TensorRT-LLM #2347

trtllm-bench "No module named 'tensorrt_llm.bench.datamodels…

### System Info CPU x86_64 GPU NVIDIA L20 TensorRT branch: v0.13.0 CUDA: NVIDIA-SMI 535.161.07 Driver Version: 535.161.07 CUDA Version: 12.5 ### Who can help? @kaiyux @byshiue ### Information…

activezhao updated 4 days ago
2
pytorch/pytorch #134860

JIT tracing a quantized model with hooks is broken

### 🐛 Describe the bug JIT tracing a quantized model that has forward_pre_hooks throws the following error: `RuntimeError: Couldn't find method: 'forward' on class: '__torch__.torch.ao.nn.intri…

UmaisZahid updated 1 month ago
2
amd/RyzenAI-SW #122

Error during YOLOv8s quantization with Ryzen AI quantizer (R…

I encountered an issue while trying to quantize the YOLOv8s model using the Ryzen AI quantizer. Below are the details of the error: ### Error Message: ``` No CUDA runtime is found, using CUDA_HOM…

Siva50005 updated 3 weeks ago
11
airockchip/rknn-toolkit2 #170

Model produced by inserting a new dimension with onnx_edit cannot be converted

Original model output structure: ![image](https://github.com/user-attachments/assets/1572728b-d965-4bf3-8c19-aed8266f35c3) Structure after onnx_edit: ![image](https://github.com/user-attachments/assets/b5a1d2f6-b71c-4fbe-b71b-9408291a0e49…

happyme531 updated 6 days ago
2
city96/ComfyUI-GGUF #133

failed to quantize: unknown model architecture: 'flux'

Trying to quantise some flux models to lower the vram needs and I get that error. ``` (venv) C:\AI\llama.cpp\build>bin\Debug\llama-quantize.exe "C:\AI\ComfyUI_windows_portable\ComfyUI\models\chec…

GamingDaveUk updated 2 days ago
3
unslothai/unsloth #1137

Cannot use unsloth on vSphere with Ubuntu VM (vGPU)

I have the same problem as [cannot use unsloth](https://github.com/unslothai/unsloth/issues/820), but when I run the code below I still get the same error: `os.environ['CUDA_VISIBLE_DEVIC…

NeilL0412 updated 4 days ago
4
microsoft/onnxruntime #21048

onnxruntime shape mismatch during quantization of yolov8 mod…

### Describe the issue When trying to quantize a Yolov8 model (exported with `yolo export model=yolov8x.pt format=onnx`) with `onnxruntime`, I get the following error: ``` $ python quantize.py yo…

Jamil updated 3 months ago
7
bitsandbytes-foundation/bitsandbytes #1262

Quantization of T5 failed. int8 model cost more inference tim…

### System Info A100-80G cuda12.1 bitsandbytes 0.43.2.dev0 diffusers 0.29.1 lion-pytorch 0.2.2 torch 2.0.1 torch-tb-profiler 0…

Worromots updated 18 hours ago
1
