-
### 🚀 The feature, motivation and pitch
I am trying to implement eager mode of PT2E quantization on CPU. Currently, the PT2E quantization on CPU is lowered to Inductor by `torch.compile`. The current…
-
How to use multi-GPU quantization
When I run:
export MODEL_NAME_OR_PATH="/llama2_70B"
export OUTPUT_DIR="/llama2_70B_quantize"
export QUANTIZATION_SCHEME="fp8"
export DEVICE="cuda:0,1,2,3,4"…
-
- [x] pick a data export and bless as release version
- [x] do same with taxon range evals
- [x] download export and taxon range files
- [x] convert to parquet format
- [ ] delete unneeded photos …
-
### Initial Checks
- [X] I confirm that I'm using Pydantic V2
### Description
My understanding is that `@property` and `@cached_property` attributes should not be included in a model's export (`mod…
-
When exporting MiniCPM-2B to ONNX, a new problem appeared after the bias issue was resolved. I ran MiniCPM_Export.py:
In modeling_modified_A](https://github.com/DakeQQ/Native-LLM-for-Android/tree/main/Export_ONNX/MiniCPM/MiniCPM-2B/modeling_modified_A/modeli…
-
Starting: yolov9s_fp.pt
Opening YOLOv9 model
YOLOv9s summary (fused): 486 layers, 7,167,862 parameters, 0 gradients, 26.7 GFLOPs
Creating labels.txt file
Exporting the model to ONNX
Traceback (…
-
**Is your feature request related to a problem? Please describe.**
I am trying to save an optimized CLIP model to disk, but it's too large (around 8 GB, if I remember correctly) for pickle to hand…
-
### Feature request
Support ONNX export of Musicgen Melody with audio prompting.
### Motivation
Currently, Optimum does not support export for Musicgen Melody models. The current implementation i…
-
I am using AIMET PTQ to quantize the CLIP text model,
but I encounter this error: [KeyError: 'Graph has no buffer /text_model/encoder/layers.0/layer_norm1/Constant_output_0, referred to as input for …
-
### 🐛 Describe the bug
We tried to save and load the torch.exported dlrm_v2 model (97.5 GB). The model repository is: https://github.com/mlcommons/inference/tree/master/recommendation/dlrm_v2/pytorch…