-
### System Info
- `transformers` version: 4.28.0.dev0
- Platform: Linux-3.10.0-1160.81.1.el7.x86_64-x86_64-with-glibc2.17
- Python version: 3.11.2
- Huggingface_hub version: 0.13.3
- Safetensor…
-
From `fully_connected_common.cc` I see that filter weights must be symmetric, i.e. `zero_point=0`. How can I achieve this? Also, is it only possible by using quantization-aware training, or can it …
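For readers unfamiliar with the term, here is a minimal sketch of what symmetric per-tensor quantization means: the scale is derived from the largest absolute weight, and the zero point is fixed at 0. The function names `symmetric_quantize` and `dequantize` are hypothetical, and clamping to `[-qmax, qmax]` is an assumption made for illustration. This is not the code path used by any particular converter.

```python
def symmetric_quantize(weights, num_bits=8):
    """Quantize a list of floats symmetrically (zero_point is always 0).

    Hypothetical helper for illustration only.
    """
    qmax = 2 ** (num_bits - 1) - 1          # 127 for int8
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric range
    # [-qmax, qmax] (an assumption of this sketch).
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    zero_point = 0                          # symmetric by construction
    return q, scale, zero_point


def dequantize(q, scale, zero_point=0):
    """Map quantized integers back to approximate float values."""
    return [(v - zero_point) * scale for v in q]


weights = [-0.5, 0.0, 0.25, 1.0]
q, scale, zero_point = symmetric_quantize(weights)
restored = dequantize(q, scale, zero_point)
```

Because the zero point is 0, a float value of exactly 0.0 always maps to the integer 0, and the scheme needs only the weights themselves, no training loop, to compute `scale`.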
-
### 🐛 Describe the bug
When I try to fill a quantization, my code causes an error:
RuntimeError: Exporting the operator fake_quantize_per_tensor_affine to ONNX opset version 9 is not supported. Supp…
-
**I have an ONNX model that contains convolutional layers but no fully connected layers. Upon inspection with Netron, I found that if a convolutional layer is not directly followed by a BatchNormaliza…
-
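Hello, do you support quantized models, such as GPTQ?
If so, and scaling proportionally, with eight 24 GB GPUs and pipeline parallelism, would I be able to run LoRA on the quantized 175B model?
Thanks!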
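The proportional estimate in the question can be sketched as a back-of-envelope calculation. It assumes 4-bit weights (the bit width is not stated in the question), an even split across GPUs, and 1 GB = 10^9 bytes. It counts weight storage only and ignores quantization metadata, activations, LoRA adapter parameters, and optimizer state, so it is a lower bound and not an answer about whether the setup works. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Storage for the weights alone, in GB (10^9 bytes).

    Hypothetical helper; ignores scales, zero points, and all
    runtime memory such as activations and optimizer state.
    """
    return num_params * bits_per_weight / 8 / 1e9


num_gpus = 8
gpu_memory_gb = 24

total_gb = weight_memory_gb(175e9, 4)    # assumed 4-bit quantization
per_gpu_gb = total_gb / num_gpus         # assumed even pipeline split

print(f"weights total: {total_gb} GB")           # 87.5 GB
print(f"weights per GPU: {per_gpu_gb} GB")       # 10.9375 GB
print(f"headroom per GPU: {gpu_memory_gb - per_gpu_gb} GB")
```

Under these assumptions the weights take up a little under half of each 24 GB card. Whether the remaining headroom is enough depends on sequence length, batch size, and the adapter configuration, none of which are given in the question.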
-
### ONNX Model Compressor
### Quantization Tool Proposal
Intel Neural Compressor (INC) is a tool for generating optimized ONNX models and supports techniques like post-training quantization (P…
-
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run
python -m bitsandbytes
and submit this information toget…
-
Dear,
We have a YOLOv3-tiny model that can run on the DPU. Quantization succeeds, but when compiling we get the following error:
[UNILOG][FATAL][XCOM_UNSUPPORT_QUANTIZATION][The fix info is error o…
-
With the latest version of bitsandbytes (0.39.0) library, isn't it possible to serialize 4-bit models then?
Thus this section should be updated to allow the user to save these models.
https://gith…
-
**Description**
Naive Bayes-based Context Extension is a method that uses the idea of naive Bayes to extend the context handling length of large language models as long as there is enough computing…