chuanzeruge closed this issue 2 months ago.
@ZachL1 Hi Zach, do you have any experience with this? I have never tested the code on Windows. All experiments were conducted on Ubuntu systems.
Sorry, I lack a proper Windows environment as well. It seems to be an ONNX export issue though, and @Owen-Liuyuxuan may be able to help.
@chuanzeruge cc: @JUGGHM
This is because xFormers is not exportable.
- If xformers is installed, the model uses it, and that path cannot be exported.
- If xformers is not installed, the model will fall back to use MultiheadAttention, which is exportable.
My suggestion:
- Use an environment without xformers to export the model.
- xformers is easy to install and delete, so running pip3 uninstall xformers before exporting the model, then pip3 install xformers afterward, can help. (I did this)
- Otherwise, keep or remove xformers based on your need.
@Owen-Liuyuxuan Thank you for your explanation. This is helpful for a novice like me. QvQ
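The check described above (is xformers present in the environment used for export?) can be sketched with the standard library alone. This is only an illustration: `package_available` is a hypothetical helper name, not part of Metric3D, and it only reports whether the package can be imported, not which attention path the model actually takes.

```python
import importlib.util


def package_available(name: str) -> bool:
    """Return True if the top-level package `name` can be imported here."""
    return importlib.util.find_spec(name) is not None


# Hypothetical pre-export check; the messages are illustrative only.
if package_available("xformers"):
    print("xformers is installed: run 'pip3 uninstall xformers' before exporting.")
else:
    print("xformers is not installed: the exportable fallback should be used.")
```

Running this in the same environment as the export script shows which of the two cases applies before any export is attempted.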
I also encounter this error on Ubuntu, and if I uninstall xformers, I get the error from #126 (about tensors on different devices).
How can I solve this, to have an ONNX model? @Owen-Liuyuxuan @JUGGHM
Thank you
@TLescoatTFX Can you try running the model with dummy input first, without exporting to ONNX?
That error is related to the PyTorch runtime rather than to the ONNX export. I suggest running inference without the ONNX export first and checking for more detailed error logs.
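One way to follow this suggestion is to run a plain forward pass first and call the exporter only if that succeeds, so the failing stage is unambiguous. A minimal sketch under assumptions: `run_then_export` is a hypothetical helper, not part of Metric3D, and the model and exporter are passed in as plain callables so the snippet runs even without PyTorch installed.

```python
def run_then_export(model, dummy_input, export_fn):
    """Run a forward pass, then export; report which stage failed, if any.

    Returns ("forward", exc), ("export", exc), or ("ok", output).
    """
    try:
        output = model(dummy_input)
    except Exception as exc:  # failure in the model itself (runtime)
        return ("forward", exc)
    try:
        export_fn(model, dummy_input)
    except Exception as exc:  # failure only during export
        return ("export", exc)
    return ("ok", output)


# Stand-in demo with plain callables (hypothetical, no PyTorch needed).
def toy_model(x):
    return [v * 2 for v in x]


def failing_export(model, x):
    raise RuntimeError("export failed")


stage, detail = run_then_export(toy_model, [1, 2, 3], failing_export)
print(stage, detail)
```

With PyTorch, `export_fn` could be something like `lambda m, x: torch.onnx.export(m, (x,), "model.onnx")` (the file name is a placeholder). A "forward" result points at the runtime, while an "export" result points at the exporter.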
Thank you for the answer. I tried running the model before exporting to ONNX (via dummy_output = export_model(dummy_input) and checking the output shape), and it seems to work correctly.
Errors only show up when calling torch.onnx.export:
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/backbones/ViT_DINO_reg.py:984: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if pad_h == self.patch_size:
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/backbones/ViT_DINO_reg.py:986: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if pad_w == self.patch_size:
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/backbones/ViT_DINO_reg.py:235: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert H % patch_H == 0, f"Input image height {H} is not a multiple of patch height {patch_H}"
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/backbones/ViT_DINO_reg.py:236: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert W % patch_W == 0, f"Input image width {W} is not a multiple of patch width: {patch_W}"
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/backbones/ViT_DINO_reg.py:910: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if npatch == N and w == h:
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/backbones/ViT_DINO_reg.py:922: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
sqrt_N = math.sqrt(N)
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/backbones/ViT_DINO_reg.py:923: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
sx, sy = float(w0) / sqrt_N, float(h0) / sqrt_N
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/backbones/ViT_DINO_reg.py:931: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert int(w0) == patch_pos_embed.shape[-2]
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/backbones/ViT_DINO_reg.py:931: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert int(w0) == patch_pos_embed.shape[-2]
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/backbones/ViT_DINO_reg.py:932: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert int(h0) == patch_pos_embed.shape[-1]
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/backbones/ViT_DINO_reg.py:932: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert int(h0) == patch_pos_embed.shape[-1]
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/decode_heads/RAFTDepthNormalDPTDecoder5.py:894: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if torch.isnan(vit_features[0]).any():
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/decode_heads/RAFTDepthNormalDPTDecoder5.py:896: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if torch.isinf(vit_features[0]).any():
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/decode_heads/RAFTDepthNormalDPTDecoder5.py:908: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if torch.isnan(en_ft).any():
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/decode_heads/RAFTDepthNormalDPTDecoder5.py:911: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if torch.isinf(en_ft).any():
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/decode_heads/RAFTDepthNormalDPTDecoder5.py:919: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if torch.isnan(ref_feat).any():
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/decode_heads/RAFTDepthNormalDPTDecoder5.py:921: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if torch.isinf(ref_feat).any():
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/decode_heads/RAFTDepthNormalDPTDecoder5.py:815: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if torch.isnan(prob).any():
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/decode_heads/RAFTDepthNormalDPTDecoder5.py:817: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if torch.isinf(prob).any():
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/decode_heads/RAFTDepthNormalDPTDecoder5.py:831: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if torch.isnan(d ).any():
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/decode_heads/RAFTDepthNormalDPTDecoder5.py:833: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if torch.isinf(d ).any():
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/decode_heads/RAFTDepthNormalDPTDecoder5.py:842: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if torch.isnan(normal_out).any():
/home/thibault/.cache/torch/hub/yvanyin_metric3d_main/mono/model/decode_heads/RAFTDepthNormalDPTDecoder5.py:844: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if torch.isinf(normal_out).any():
/home/thibault/miniconda3/envs/md/lib/python3.10/site-packages/torch/onnx/utils.py:689: UserWarning: Constant folding in symbolic shape inference fails: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select) (Triggered internally at ../torch/csrc/jit/passes/onnx/shape_type_inference.cpp:439.)
_C._jit_pass_onnx_graph_shape_type_inference(
============= Diagnostic Run torch.onnx.export version 2.0.1+cu117 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================
Traceback (most recent call last):
File "/workspace/Metric3D/onnx/metric3d_onnx_export.py", line 134, in <module>
Fire(main)
File "/home/thibault/miniconda3/envs/md/lib/python3.10/site-packages/fire/core.py", line 143, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/thibault/miniconda3/envs/md/lib/python3.10/site-packages/fire/core.py", line 477, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/thibault/miniconda3/envs/md/lib/python3.10/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/workspace/Metric3D/onnx/metric3d_onnx_export.py", line 121, in main
torch.onnx.export(
File "/home/thibault/miniconda3/envs/md/lib/python3.10/site-packages/torch/onnx/utils.py", line 506, in export
_export(
File "/home/thibault/miniconda3/envs/md/lib/python3.10/site-packages/torch/onnx/utils.py", line 1548, in _export
graph, params_dict, torch_out = _model_to_graph(
File "/home/thibault/miniconda3/envs/md/lib/python3.10/site-packages/torch/onnx/utils.py", line 1180, in _model_to_graph
params_dict = _C._jit_pass_onnx_constant_fold(
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
(btw I don't need the normals; if removing them lets me bypass the errors, I'm fine with that)
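Since the RuntimeError above complains about tensors on both cuda:0 and cpu, a quick diagnostic is to list where the model's parameters and buffers live before calling the exporter. A hedged sketch: `devices_in_use` is a hypothetical helper, not part of Metric3D, and it relies only on the `named_parameters()` and `named_buffers()` methods that `torch.nn.Module` provides. It will not catch tensors created inside `forward`, which could also be the source of such a mismatch.

```python
from collections import defaultdict


def devices_in_use(module):
    """Group parameter and buffer names by the device they live on."""
    found = defaultdict(list)
    for name, tensor in module.named_parameters():
        found[str(tensor.device)].append(name)
    for name, tensor in module.named_buffers():
        found[str(tensor.device)].append(name)
    return dict(found)


# Usage with PyTorch (assumes `export_model` is the module being exported):
#   report = devices_in_use(export_model)
#   if len(report) > 1:
#       print("mixed devices:", {dev: names[:3] for dev, names in report.items()})
```

If more than one device shows up, moving the whole model and the dummy input to a single device before export is worth trying.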
The warning you get in the normal calculation is "normal" and may not be the root cause that leads to the error.
Could you provide your CUDA version, PyTorch version (including the CUDA version it was built against), and ONNX version?
For the Python packages:
onnx 1.16.2
onnxruntime 1.19.0
onnxruntime-gpu 1.19.0
torch 2.0.1
torchvision 0.15.2
For CUDA:
GPU: Nvidia RTX A5000
CUDA: 12.4
Driver: 550.90.07
Checking the CUDA version that PyTorch was built with, there seems to be a mismatch; I'm not sure whether it matters:
>>> torch.version.cuda
'11.7'
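For anyone who needs to post the same details, the package versions can be collected with the standard library. A small sketch, assuming the package names listed in this thread; `installed_versions` is a hypothetical helper. It reads installed package metadata only, so the CUDA build of PyTorch still has to be read from torch.version.cuda as shown above.

```python
from importlib import metadata

# Package names taken from the list earlier in this thread.
PACKAGES = ("onnx", "onnxruntime", "onnxruntime-gpu", "torch", "torchvision")


def installed_versions(names=PACKAGES):
    """Map each package name to its installed version, or 'not installed'."""
    report = {}
    for name in names:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


for name, version in installed_versions().items():
    print(f"{name:16} {version}")
```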
I installed torch 2.4 and it is working with CUDA 12.1, and the export worked! Thank you very much!
Now, I just need to convert it to CoreML... :/
Thank you for sharing the project. I am a beginner in this field, and I have run into issues while trying to save the entire PyTorch model and export it to ONNX.
When saving the ViT model with the provided checkpoint file, the error is as follows
and when exporting to ONNX (running metric3d_onnx_export.py)
I don't know whether some aspect of the model architecture doesn't support conversion.