-
## 🐛 Bug
Seeing errors when trying to trace simple models based on `nn.Sequential`:
```
Traceback (most recent call last):
File "/home/vasiliy/nfs/pytorch_scripts/gm_sequential_bug.py", line…
```
vkuzo updated
3 weeks ago
-
dev branch: e188b4c50955105717b223862c4e26e4777852ea
## Quick summary
I have a simple MNIST model and want to add TopK post-processing to it.
However, the TopK node is not converted to LabelSele…
-
I recently got my hands on an H100 VM for 10 days and tried to fine-tune Flux on it, with pretty good results. I want to run it on my tiny GPU with only 4 GB of VRAM, and I don't want to use cpu offloadi…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### Reproduction
After training with int4, the model loads, but export fails when quantization level 4 is selected.
The error message is: ValueError: Please merge adapters before quantizing the m…
-
Please include information about your system, the steps to reproduce the bug, and the version of llama.cpp that you are using. If possible, please provide a minimal code example that reproduces the bu…
-
### 🐛 Describe the bug
Torchchat with int8 WOQ hits a segmentation fault at https://github.com/pytorch/pytorch/commit/f951fcd1d7c4e991d1c9ef642fe7761d7104cda2 when using `max-autotune` in tor…
-
https://github.com/pytorch/torchchat/actions/runs/9047866134/job/24860312456?pr=751
This is a launch blocker for torchchat because it causes a failure for users following the example commands in our d…
-
When quantizing https://huggingface.co/Undi95/Plap-8x13B with this imatrix, http://data.plan9.de/Plap-8x13B.imatrix, quantize crashes with many messages like the one in the title (probably one per thread).
qua…
-
### System Info
- `transformers` version: 4.42.0.dev0
- Platform: Linux-5.15.0-79-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.23.4
- Safetensors version: …
-
**Describe the bug and context**
I'm trying to quantize an optimized Stable Diffusion model.
I learned that `IncDynamicQuantization` reduces inference speed less than `OnnxDynamicQuanti…