-
Hi! I'm trying to reproduce the benchmark [results](https://github.com/pytorch/ao/tree/main/torchao/quantization#benchmarks) using torchao/_models/llama/generate.py. However, I cannot benchmark the quanti…
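For context, outside of generate.py I can apply the torchao quantization API directly. Here is a minimal sketch of what I mean (the `quantize_` / `int8_weight_only` names are from the torchao quantization README; the toy model is only for illustration, not the Llama benchmark setup):
```
import torch
from torchao.quantization import quantize_, int8_weight_only

# Toy stand-in model; the benchmark script applies quantization to the Llama
# checkpoint it loads. bfloat16 is what the torchao examples use.
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 1024),
).to(dtype=torch.bfloat16).eval()

quantize_(model, int8_weight_only())  # swaps Linear weights for int8 tensors in place

x = torch.randn(2, 1024, dtype=torch.bfloat16)
with torch.no_grad():
    print(model(x).shape)  # torch.Size([2, 1024])
```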
-
Error while quantizing pretrained_model_dir = "tiiuae/falcon-7b":
2023-07-18 10:48:21 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 2/32...
Traceback (most recent call la…
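For reference, this is roughly the flow I'm following, a minimal sketch based on the AutoGPTQ README example (the calibration text and output directory are placeholders, and I'm not certain `trust_remote_code=True` is the right setting for this Falcon checkpoint):
```
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained_model_dir = "tiiuae/falcon-7b"
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir, use_fast=True)
model = AutoGPTQForCausalLM.from_pretrained(
    pretrained_model_dir, quantize_config, trust_remote_code=True
)

# Placeholder calibration example; a real run would use a proper calibration set.
examples = [tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")]

model.quantize(examples)
model.save_quantized("falcon-7b-4bit-gptq")
```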
-
This tool is amazing. Having tried scripting with the coreml library by hand and running into all kinds of fun issues, then trying this and having it all orchestrated/abstracted for you, this is excellen…
-
Hi! I'm trying to run the Q4_K_M quantization of Meta-Llama-3-8B-Instruct on my Mac (M2 Pro, 16GB VRAM) using llama-cpp-python, with the following test code:
```
from llama_cpp import Llama
llm4 …
```
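For reference, here is a minimal, self-contained sketch of loading a Q4_K_M GGUF with llama-cpp-python (the model path is a placeholder, and the `n_ctx` / `n_gpu_layers` values are only illustrative, not tuned for an M2 Pro):
```
from llama_cpp import Llama

# Placeholder path to the downloaded GGUF file.
llm = Llama(
    model_path="./Meta-Llama-3-8B-Instruct.Q4_K_M.gguf",
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to Metal on Apple Silicon
    verbose=False,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```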
-
I am working on quantizing a resnet50 model. I tried to use the following command:
```
quantized_model = torch.quantization.quantize_dynamic(
    resnet18, {torch.nn.Conv2d, torch.nn.Linear}, dtype=…
```
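For comparison, here is a minimal sketch of dynamic quantization that runs end to end, using resnet18 from torchvision as in the snippet above (note that, as far as I know, `quantize_dynamic` only has default support for module types such as `nn.Linear` and `nn.LSTM`, so `Conv2d` layers are left in float):
```
import torch
from torchvision.models import resnet18  # assumes a recent torchvision

model = resnet18(weights=None).eval()

# Only module types with a default dynamic qconfig are converted; for a ResNet
# that is effectively just the final fc layer, while the Conv2d layers stay float.
quantized_model = torch.quantization.quantize_dynamic(
    model,
    {torch.nn.Linear},   # module types to dynamically quantize
    dtype=torch.qint8,   # int8 weights, activations quantized on the fly
)

print(quantized_model.fc)  # DynamicQuantizedLinear(...)
```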
-
I noticed you're using quantizing/hashing to determine when to weld nearly coincident vertices in the Combine method.
Am I right in thinking that this would incorrectly overlook vertices that were …
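For illustration only (this is not the Combine implementation, just a sketch of the concern): two vertices can be within the weld tolerance yet quantize to different grid cells, so a single-cell hash lookup misses them unless neighboring cells are also checked.
```
from collections import defaultdict
from itertools import product

def weld(vertices, tol=1e-5):
    """Weld vertices closer than tol by quantizing positions to a grid."""
    cell = defaultdict(list)  # grid cell -> indices of kept vertices
    kept, remap = [], []
    for v in vertices:
        key = tuple(int(round(c / tol)) for c in v)
        match = None
        # Search this cell and all 26 neighbours; without the neighbour search,
        # near-coincident vertices straddling a cell boundary are never welded.
        for offset in product((-1, 0, 1), repeat=3):
            nkey = tuple(k + o for k, o in zip(key, offset))
            for idx in cell[nkey]:
                if all(abs(a - b) <= tol for a, b in zip(kept[idx], v)):
                    match = idx
                    break
            if match is not None:
                break
        if match is None:
            match = len(kept)
            kept.append(v)
            cell[key].append(match)
        remap.append(match)
    return kept, remap

# These two points differ by 2e-7 (well under tol) but quantize to cells 0 and 1,
# so a naive single-cell lookup would keep both instead of welding them.
verts = [(0.0000049, 0.0, 0.0), (0.0000051, 0.0, 0.0)]
print(weld(verts))  # one kept vertex, remap == [0, 0]
```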
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### Ultralytics YOLO Component
Expo…
-
Hi @wondervictor, I changed the associated config, checkpoint, and img-size in export_onnx.py.
![image](https://github.com/AILab-CVC/YOLO-World/assets/59815166/a9320cc6-19dc-469b-9136-211031244de2)
…
-
Hi,
My original trained file (.bin) is 318 MB.
After quantizing, it was reduced to 250 MB.
Is this much reduction in file size expected, or is there a possibility of further reduction in size? …
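For a rough sanity check, here is the back-of-envelope arithmetic I would use (this assumes the original .bin stores float32 weights and that weights dominate the file size; both are assumptions on my part):
```
def estimate_size_mb(n_params: float, bytes_per_weight: float, overhead_mb: float = 0.0) -> float:
    """Very rough file-size estimate: parameters * bytes per weight + fixed overhead."""
    return n_params * bytes_per_weight / 1e6 + overhead_mb

# A 318 MB float32 file implies roughly 318e6 / 4 ≈ 80M parameters.
n_params = 318e6 / 4
print(estimate_size_mb(n_params, 4))  # ~318 MB, original float32
print(estimate_size_mb(n_params, 1))  # ~80 MB if every weight were stored as int8
```
Under those assumptions, a drop to only 250 MB would suggest that a large share of the tensors (embeddings, certain layer types, or metadata) were left unquantized.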
-
I'm very confused about the merging step. In Appendix B, the proof is solid; however, there is no guarantee that the new matrix B is in integer format. In standard linear quantization, zeros are represente…
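For reference, this is the standard linear (asymmetric/affine) scheme I have in mind, where the zero-point is stored as an integer so that real zero is representable exactly; the scale and zero-point values below are made up for illustration:
```
import numpy as np

def quantize(x, s, z, qmin=0, qmax=255):
    """Affine quantization: q = clamp(round(x / s) + z, qmin, qmax), with integer z."""
    q = np.clip(np.round(x / s) + z, qmin, qmax)
    return q.astype(np.uint8)

def dequantize(q, s, z):
    """Dequantization: x_hat = s * (q - z)."""
    return s * (q.astype(np.float32) - z)

x = np.array([-0.5, 0.0, 0.25, 1.0], dtype=np.float32)
s, z = 1.5 / 255, 85               # scale and integer zero-point for the range [-0.5, 1.0]
q = quantize(x, s, z)
print(q)                           # [  0  85 127 255]
print(dequantize(q, s, z))         # real 0.0 maps exactly to q == z == 85 and back to 0.0
```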