Closed salahzoubi closed 4 weeks ago
I'm wondering if you can share the steps you took in your Dockerfile or environment to build ONNX Runtime GPU inference from the gridsample branch? I've been trying to build for some time now on different machines (H100, 3090) and different CUDA toolkits (11.8, 12.1, 12.5), and none seem to work. The Docker image you shared doesn't seem to work either; it just gets stuck loading the model for a long time and never responds...
In particular, I get the same error these people are getting, and it seems like there's no support from there onwards...
Any idea on how to go from here?
Have you tried:

```bash
./build.sh --parallel \
  --build_shared_lib --use_cuda \
  --cuda_version 11.8 \
  --cuda_home /usr/local/cuda --cudnn_home /usr/local/cuda/ \
  --config Release --build_wheel --skip_tests \
  --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES="60;70;75;80;86" \
  --cmake_extra_defines CMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc \
  --disable_contrib_ops \
  --allow_running_as_root
```

Can you show me the error log?
Yes, using this exact command, I get the cmake error in the picture here:
https://github.com/microsoft/onnxruntime/issues/18942#issuecomment-2231982643
My only real problem from this entire issue is the warping_spade-fix.onnx model. I'm guessing this model is a warping model and a SPADE model merged into a single model, where the 'out' output of the warping model is passed to the SPADE decoder?
Yes, I've combined the two models together. Additionally, if you use ONNX Runtime, you should use the model file named warping_spade.onnx instead of warping_spade-fix.onnx.
@warmshao Understood. What I'm trying to do is convert the model to support batched inputs. I have a script that converts static ONNX models to dynamic batch sizes directly, but it doesn't work on warping_spade.onnx because of GridSample. So now I'm wondering how to make that happen...
@warmshao is there any chance you can share a dynamic inputs warping_spade onnx file? I'll probably try that out and convert to trt later on...
It should be as easy as running the script below if you have grid sample working on your onnx...
Would highly appreciate!
```python
import onnx
import struct
from argparse import ArgumentParser

def rebatch(infile, outfile, batch_size):
    model = onnx.load(infile)
    graph = model.graph

    # Change batch size in input, output and value_info
    for tensor in list(graph.input) + list(graph.value_info) + list(graph.output):
        tensor.type.tensor_type.shape.dim[0].dim_param = batch_size

    # Set dynamic batch size (-1) in Reshape target shapes
    for node in graph.node:
        if node.op_type != 'Reshape':
            continue
        for init in graph.initializer:
            # node.input[1] is expected to be the reshape's target shape
            if init.name != node.input[1]:
                continue
            # Shape is stored as a list of ints
            if len(init.int64_data) > 0:
                # This also overwrites bias nodes' reshape shapes, but that should be fine
                init.int64_data[0] = -1
            # Shape is stored as raw bytes
            elif len(init.raw_data) > 0:
                shape = bytearray(init.raw_data)
                struct.pack_into('q', shape, 0, -1)
                init.raw_data = bytes(shape)

    onnx.save(model, outfile)

if __name__ == '__main__':
    parser = ArgumentParser("Replace batch size with 'N'")
    parser.add_argument('infile')
    parser.add_argument('outfile')
    args = parser.parse_args()
    rebatch(args.infile, args.outfile, 'N')
```
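As a side note, the `raw_data` branch of that script can be illustrated with the standard library alone: an ONNX initializer may store an int64 shape tensor as little-endian bytes, and `struct.pack_into` overwrites just the first dimension in place. The shape `(1, 3, 256, 256)` below is an arbitrary example, not taken from the actual model.

```python
import struct

shape = (1, 3, 256, 256)                       # example static Reshape target
raw = bytearray(struct.pack(f'<{len(shape)}q', *shape))  # shape as int64 bytes
struct.pack_into('<q', raw, 0, -1)             # set the batch dim to -1 (dynamic)
dynamic = struct.unpack(f'<{len(shape)}q', raw)
print(dynamic)                                 # -> (-1, 3, 256, 256)
```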
You can convert it yourself; this is the script for converting to ONNX: https://github.com/warmshao/LivePortrait/blob/anim/export_onnx.py
Thanks so much for uploading the script! I'm gonna close this for now and circle back if it doesn't work!!