isl-org / DPT

Dense Prediction Transformers
MIT License

ONNX Conversion Scripts #73

Open timmh opened 2 years ago

timmh commented 2 years ago

This PR implements ONNX conversion scripts, as well as scripts to run the resulting models on monodepth and segmentation tasks. Furthermore, the fixes from #42 are incorporated. The converted weights are available here and are verified to produce numerically similar results to the original models on exemplary inputs. Please let me know if I should add anything to the README.

yohannes-taye commented 2 years ago

Thank you for the scripts @timmh. I have been trying to export larger models, but the script freezes and I have to restart the computer manually. The only modifications I have made are to the `net_h` and `net_w` variables. Setting them to `384 * 3` causes the computer to freeze completely, while values like 1980 or 1080 make the script halt with a `4441 Killed` message. I haven't been able to find a detailed error log or crash report, so I can't tell exactly what is causing the problem.

timmh commented 2 years ago

> Thank you for the scripts @timmh. I have been trying to export larger models, but the script freezes and I have to restart the computer manually. The only modifications I have made are to the `net_h` and `net_w` variables. Setting them to `384 * 3` causes the computer to freeze completely, while values like 1980 or 1080 make the script halt with a `4441 Killed` message. I haven't been able to find a detailed error log or crash report, so I can't tell exactly what is causing the problem.

This sounds like you are running out of RAM. One thing you could try is running the export on the GPU instead:

```diff
 dummy_input = torch.zeros((batch_size, 3, net_h, net_w))
+dummy_input = dummy_input.to("cuda")
```

yohannes-taye commented 2 years ago

Thank you for the response. I tested whether I can use my GPU and made the code changes you suggested, but I got the following error. Any idea what might be causing it?

```
Traceback (most recent call last):
  File "export_monodepth_onnx.py", line 169, in <module>
    main(args.model_weights, args.model_type, args.output_path, args.batch_size, args.test_image_path)
  File "export_monodepth_onnx.py", line 100, in main
    dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
  File "/home/tmc/anaconda3/envs/DPT/lib/python3.7/site-packages/torch/onnx/__init__.py", line 276, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/home/tmc/anaconda3/envs/DPT/lib/python3.7/site-packages/torch/onnx/utils.py", line 94, in export
    use_external_data_format=use_external_data_format)
  File "/home/tmc/anaconda3/envs/DPT/lib/python3.7/site-packages/torch/onnx/utils.py", line 701, in _export
    dynamic_axes=dynamic_axes)
  File "/home/tmc/anaconda3/envs/DPT/lib/python3.7/site-packages/torch/onnx/utils.py", line 503, in _model_to_graph
    _export_onnx_opset_version)
RuntimeError: Input, output, and indices must be on the current device
```

timmh commented 2 years ago

@yohannes-taye I can reproduce the issue, but to be honest I have no idea where it stems from. Probably some tensor in the model is created on the wrong device. I think the best way forward for you would be to increase your swap space and export on the CPU.
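For context, "created on the wrong device" typically means a tensor is constructed inside `forward()` without inheriting the input's device. A hypothetical illustration (not taken from the DPT codebase) of the bug pattern and its fix:

```python
import torch

class Bad(torch.nn.Module):
    def forward(self, x):
        # Bug: indices are always created on the CPU, regardless of x's
        # device; with a CUDA input this raises a device-mismatch error.
        idx = torch.arange(x.shape[-1])
        return torch.index_select(x, -1, idx)

class Good(torch.nn.Module):
    def forward(self, x):
        # Fix: create the indices on the same device as the input.
        idx = torch.arange(x.shape[-1], device=x.device)
        return torch.index_select(x, -1, idx)

x = torch.randn(2, 8)
assert torch.equal(Good()(x), x)  # selecting all columns is the identity
```

On a CPU-only machine both variants happen to work, which is why such bugs often only surface once someone tries a GPU export.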