fabio-sim / Depth-Anything-ONNX

ONNX-compatible Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Apache License 2.0

Metric Depth ONNX #16

Closed: dnflslwlq closed this issue 2 days ago

dnflslwlq commented 2 weeks ago

Hello, @fabio-sim

I am trying to convert the pretrained metric depth estimation model (vkitti) from Depth Anything V2 to ONNX and run inference on it (test set: KITTI).

The relative depth model shows no significant performance degradation after conversion. However, when I convert the metric depth model to ONNX and run inference, the degradation is very severe.

I tried exporting both with Dynamo and with the standard export script, but the result was the same.

What could be the cause of this?

The backbone architectures of the relative and metric models are the same, so why does performance degrade so severely only when the metric model is converted to ONNX? I exported in FP32 (not FP16).

The following results are from the relative ONNX model and the metric ONNX model, respectively.

[Images: depth map outputs from the relative and metric ONNX models]

fabio-sim commented 1 week ago

Hi @dnflslwlq, thank you for your interest in Depth-Anything-ONNX.

Usually I observe a relative tolerance (rtol) difference of 1e-4 to 1e-3 between PyTorch and ONNX Runtime results, which is due to differing implementations of the underlying operators.
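As a rough sketch of that kind of tolerance check (the function name and thresholds here are illustrative, not part of this repository's code), one can compare the two depth maps with NumPy:

```python
import numpy as np

def compare_outputs(torch_depth: np.ndarray, onnx_depth: np.ndarray,
                    rtol: float = 1e-3, atol: float = 1e-5) -> float:
    """Return the max relative error between the two depth maps and
    raise if they differ beyond rtol/atol (hypothetical helper)."""
    rel_err = np.abs(torch_depth - onnx_depth) / (np.abs(torch_depth) + atol)
    # assert_allclose raises an AssertionError on mismatch
    np.testing.assert_allclose(onnx_depth, torch_depth, rtol=rtol, atol=atol)
    return float(rel_err.max())
```

A max relative error in the 1e-4 to 1e-3 range would match what I normally see; anything visibly wrong in the depth map is far outside that.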

However, I'm not sure how the degradation could become that large. I added support for metric depth models in 3128cb9, along with the corresponding models in https://github.com/fabio-sim/Depth-Anything-ONNX/releases/tag/v2.0.0. There may be a difference I haven't taken into account.

dnflslwlq commented 1 week ago

@fabio-sim Thank you for your response.

The solution I found was to set the dummy input, when exporting to ONNX, to the multiple of 14 closest to the test data resolution instead of 518x518. With that change, the degradation was almost negligible.
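A minimal sketch of that rounding step (assuming KITTI-like images around 1242x375 and the 14-pixel ViT patch size; the helper name is mine, not from the repository):

```python
def nearest_multiple_of_14(x: int) -> int:
    """Round an image dimension to the nearest multiple of 14,
    the ViT patch size used by Depth Anything."""
    return max(14, round(x / 14) * 14)

# KITTI images are roughly 1242x375, so the closest valid export size is:
export_w = nearest_multiple_of_14(1242)  # 1246
export_h = nearest_multiple_of_14(375)   # 378
# Use (1, 3, export_h, export_w) as the dummy input shape for torch.onnx.export
# instead of the default (1, 3, 518, 518).
```

The idea is simply to keep the exported graph's interpolation behavior close to the resolution the model actually sees at test time.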

Thank you for your interest and efforts in solving this issue together.