Closed juntaosun closed 3 months ago
I will add it to the conversion script, but you essentially just need to convert the q4 model to fp16 with:
```python
from onnxconverter_common import float16
import onnx

loaded_q4_model = onnx.load_model('./model_q4.onnx')

# Convert the float32 parts of the q4 model to float16.
model_q4fp16 = float16.convert_float_to_float16(
    loaded_q4_model,
    keep_io_types=True,        # keep graph inputs/outputs as float32
    disable_shape_infer=True,  # skip shape inference (useful for large models)
)

save_path = './model_q4f16.onnx'
onnx.save(
    model_q4fp16,
    save_path,
    convert_attribute=False,
    all_tensors_to_one_file=True,
)
```
Question
Thanks for providing such a great project, but I'm having trouble converting the model.
Which command is used to create and export a q4f16.onnx model like this? Could you give me more tips or help? Thank you.