quantizing Search Results

1000+ results
for quantizing

Best match

pytorch/pytorch #44111

fx: unable to symbolically trace simple nn.Sequential model

## 🐛 Bug Seeing errors when trying to trace simple models based on `nn.Sequential`: ``` Traceback (most recent call last): File "/home/vasiliy/nfs/pytorch_scripts/gm_sequential_bug.py", line…

vkuzo updated 3 weeks ago · 10 comments

Xilinx/finn #1041

TopK is not converted to LabelSelect_hls

dev branch: e188b4c50955105717b223862c4e26e4777852ea ## Quick summary I have my simple mnist model, I want to have TopK post processing for it. However TopK node is not converted to LabelSele…

pbk20191 updated 5 months ago · 3 comments

city96/ComfyUI-GGUF #11

how to convert a finetuned flux model to gguf ?? maybe to Q2…

I recently got my hands on a H100 VM for 10 days, and i tried to finetune flux on it, and i got pretty good results. I want to run it on my tiny gpu with only 4gb vram, i dont want to use cpu offloadi…

Meshwa428 updated 1 month ago · 20 comments

hiyouga/LLaMA-Factory #2567

After training with int4, the model loads, but export fails if quantization level 4 is selected

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction After training with int4, the model loads, but export fails if quantization level 4 is selected. The error message is: ValueError: Please merge adapters before quantizing the m…

kynow2 updated 4 months ago · 3 comments

ggerganov/llama.cpp #5518

Wrong number of tensors when run inference

Please include information about your system, the steps to reproduce the bug, and the version of llama.cpp that you are using. If possible, please provide a minimal code example that reproduces the bu…

hoaileba updated 4 months ago · 2 comments

pytorch/pytorch #134449

[Inductor] torchchat segmentation fault in int8 using max-au…

### 🐛 Describe the bug Torchchat of int8 woq will encounter segmentation fault in https://github.com/pytorch/pytorch/commit/f951fcd1d7c4e991d1c9ef642fe7761d7104cda2 when using `max-autotune` in tor…

yanbing-j updated 2 weeks ago · 7 comments

pytorch/executorch #3588

[Segmentation fault] python3 torchchat.py export stories15M…

https://github.com/pytorch/torchchat/actions/runs/9047866134/job/24860312456?pr=751 This is a launch blocker for torchchat because it causes a fail for users following the example commands in our d…

mikekgfb updated 4 months ago · 5 comments

ggerganov/llama.cpp #6067

GGML_ASSERT: ggml-quants.c:11615: besti1 >= 0 && besti2 >= 0…

When quantizing https://huggingface.co/Undi95/Plap-8x13B with this imatrix http://data.plan9.de/Plap-8x13B.imatrix quantize crashes with many messages as in the title (probably one per thread). qua…

schmorp updated 4 months ago · 2 comments

huggingface/transformers #31479

Quantized T5EncoderModel cannot be removed from VRAM on CUDA…

### System Info - `transformers` version: 4.42.0.dev0 - Platform: Linux-5.15.0-79-generic-x86_64-with-glibc2.35 - Python version: 3.10.12 - Huggingface_hub version: 0.23.4 - Safetensors version: …

lstein updated 1 month ago · 8 comments

microsoft/Olive #852

Is this pass flow possible for Stable Diffusion?: OrtTransfo…

**Describe the bug and context** I'm trying to quantize an optimized Stable Diffusion model. I got to know that `IncDynamicQuantization` has less reduction in inference speed than `OnnxDynamicQuanti…

lshqqytiger updated 8 months ago · 46 comments
