hello2mao opened this issue 3 months ago
We export llm.pt to libtorch in our production system, but the export code is not open-sourced yet; we will consider releasing it later.
Does the exported model have a fixed input length?
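For reference, a minimal sketch of what a TorchScript (libtorch) export can look like. `TinyEncoder` here is a hypothetical stand-in, not the actual llm module; the point is that `torch.jit.script` preserves dynamic input lengths, so the saved module is not tied to one sequence length:

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Hypothetical stand-in for the real llm encoder."""
    def __init__(self, dim: int = 16):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        return torch.relu(self.proj(x))

model = TinyEncoder().eval()
scripted = torch.jit.script(model)   # scripting (unlike tracing) keeps dynamic shapes
scripted.save("encoder.pt")          # loadable from C++ via torch::jit::load

# the scripted module accepts variable-length inputs
short = scripted(torch.randn(1, 5, 16))
long = scripted(torch.randn(1, 50, 16))
print(short.shape, long.shape)
```

The saved `encoder.pt` can then be loaded in a libtorch C++ program; whether the real production export works this way is an assumption, since that code is not released.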
A simple Time Cost Test Result:
llm encoder cost: 5.93s
llm decoder cost: 0.17s
llm forward_chunk cost: 6.11s
llm cost: 6.11s
flow cost: 0.68s
hift cost: 0.06s
total cost: 6.98s
Any way to reduce the llm encoder time cost? (about 82% of the total time)
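For anyone reproducing these measurements, a per-stage timing harness might look like the sketch below (the `llm` and `flow` calls are simulated with `time.sleep`; they are not actual APIs of this repo):

```python
import time
from contextlib import contextmanager

@contextmanager
def timer(name: str, results: dict):
    """Record the wall-clock cost of a stage into `results`."""
    start = time.perf_counter()
    yield
    results[name] = time.perf_counter() - start

results = {}
with timer("llm", results):
    time.sleep(0.01)   # stand-in for the llm forward pass
with timer("flow", results):
    time.sleep(0.01)   # stand-in for the flow stage

total = sum(results.values())
for name, cost in results.items():
    print(f"{name} cost: {cost:.2f}s")
print(f"total cost: {total:.2f}s")
```

Using `time.perf_counter` (rather than `time.time`) avoids clock-adjustment artifacts in short measurements; on GPU, a `torch.cuda.synchronize()` before each read would also be needed for accurate numbers.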
The llm encoder generates tokens one by one; we are also trying to reduce its computation time. Exporting to ONNX or libtorch may reduce it.
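To illustrate why token-by-token generation dominates the runtime, here is a toy sketch (not the actual llm code): each step re-runs the model over the whole sequence, so naive autoregressive decoding grows quadratically with output length, and per-step framework overhead multiplies by the number of tokens. A KV cache or a compiled export (ONNX/libtorch) mainly attacks those two costs.

```python
import torch
import torch.nn as nn

# toy components standing in for the real model
emb = nn.Embedding(100, 8)
head = nn.Linear(8, 100)

def generate(prompt, steps):
    """Naive autoregressive decoding: one model call per output token."""
    tokens = list(prompt)
    for _ in range(steps):
        x = emb(torch.tensor(tokens))   # re-encodes the WHOLE sequence each step
        logits = head(x[-1])            # only the last position is actually used
        tokens.append(int(logits.argmax()))
    return tokens

out = generate([1, 2, 3], steps=5)
print(len(out))  # 8
```

Caching the per-position states so each step only processes the newest token (a KV cache) removes the re-encoding, but the loop itself remains sequential, which is why the encoder stage cannot be batched away entirely.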
Hi, when will you release the code for converting .pt to .onnx or libtorch? Thanks.
And how can the ONNX input length be fixed, given that the text_encoder does not output a fixed-length token sequence?