Open SuCicada opened 1 year ago
You can use this code to reduce the size of the .pth file from 699 MB to 233 MB!
```python
import os

import torch

pth_file = "xxxxxxx"     # path to the original checkpoint
output_file = "xxxxxx"   # path for the slimmed-down checkpoint
device = "cuda"          # or "mps" / "cpu"

checkpoint = torch.load(pth_file, map_location=torch.device(device))
print("original file size:", os.path.getsize(pth_file))

# Keep only what inference needs; drop checkpoint["optimizer"],
# which is used only for training.
torch.save({
    "model": checkpoint["model"],
    "learning_rate": checkpoint["learning_rate"],
    "iteration": checkpoint["iteration"],
}, output_file)
print("inference file size:", os.path.getsize(output_file))
```
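If you need the file even smaller, a further option (not from the original snippet, so treat it as a hedged sketch) is to cast the floating-point weights to half precision before saving, which roughly halves the remaining size at some cost in numerical precision. The function name `to_half` is my own; it simply walks the state dict and converts floating-point tensors:

```python
import torch

def to_half(state_dict):
    """Return a copy of state_dict with floating-point tensors cast to fp16.

    Non-tensor entries and integer tensors (e.g. step counters) are kept
    unchanged. This is a sketch: verify your model still infers acceptably
    in half precision before relying on it.
    """
    return {
        k: (v.half() if torch.is_tensor(v) and v.is_floating_point() else v)
        for k, v in state_dict.items()
    }
```

Usage would be `checkpoint["model"] = to_half(checkpoint["model"])` just before `torch.save`.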
Two approaches: (1) discard the model's training-only weights (weights not used at inference), such as enc_q; (2) export to ONNX and run inference with MoeSS.
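Approach (1) above can be sketched as a filter over the checkpoint's state dict. This assumes the checkpoint layout from the earlier snippet (`{"model": state_dict, ...}`); the prefix list and the helper name `strip_train_only` are illustrative, not part of any library:

```python
# Submodule prefixes used only during training (extend as needed for
# your architecture); enc_q is the example given above.
TRAIN_ONLY_PREFIXES = ("enc_q.",)

def strip_train_only(state_dict, prefixes=TRAIN_ONLY_PREFIXES):
    """Return a copy of state_dict without keys under the given prefixes."""
    # str.startswith accepts a tuple of prefixes, so one check suffices.
    return {k: v for k, v in state_dict.items() if not k.startswith(prefixes)}
```

You would then do `checkpoint["model"] = strip_train_only(checkpoint["model"])` before saving. Note the inference code must tolerate the missing keys, e.g. by loading with `strict=False`.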
I hope this message finds you well. I am writing to ask how to reduce the size of the resulting .pth file. I only want to use the .pth file for inference, but it is currently 600 MB+, and I would like to shrink it.
Specifically, I am wondering whether there are any recommended techniques or best practices for optimizing the size of the output file.