zhyever / PatchFusion

[CVPR 2024] An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation
https://zhyever.github.io/patchfusion/
MIT License
926 stars, 62 forks

Training Error #37

Open shilpaullas97 opened 1 month ago

shilpaullas97 commented 1 month ago

Hi @zhyever,

I recently tried out the PatchFusion model using this repo, and I'm now trying to run the training script to train a model myself.

Following the training steps in https://github.com/zhyever/PatchFusion/blob/main/docs/user_training.md, I was able to run coarse and fine model training for the depth_anything_vitb model. However, I get the error below when running training for the fusion model.

rank0: File "PatchFusion/estimator/trainer/trainer.py", line 326, in run
rank0: File "PatchFusion/estimator/trainer/trainer.py", line 250, in train_epoch
rank0: File "lib/python3.8/site-packages/mmengine/optim/optimizer/optimizer_wrapper.py", line 196, in update_params
rank0: File "lib/python3.8/site-packages/mmengine/optim/optimizer/optimizer_wrapper.py", line 220, in backward
rank0: File "lib/python3.8/site-packages/torch/_tensor.py", line 525, in backward
rank0: File "lib/python3.8/site-packages/torch/autograd/__init__.py", line 260, in backward
rank0: grad_tensors_ = _make_grads(tensors, grad_tensors_, is_grads_batched=False)
rank0: File "lib/python3.8/site-packages/torch/autograd/__init__.py", line 133, in _make_grads
rank0: raise RuntimeError(
rank0: RuntimeError: grad can be implicitly created only for scalar outputs

Could you please advise on this? Does anything in the script need to be modified?

One more related question: is there an ONNX/TensorRT version of PatchFusion, or any other deployment-ready model?

zhyever commented 3 weeks ago

Sorry for the late reply. I just came back from a conference. Would you mind checking the shape of the loss that is passed to backward? Which version of torch are you using?

I will work on deployment models for PatchRefiner, which is our follow-up work to PatchFusion.
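For anyone hitting the same message, here is a minimal sketch of that check in plain PyTorch. It is not taken from the PatchFusion trainer: `pred`, `target` and the L1-style loss are hypothetical stand-ins, used only to show that calling `backward()` on a non-scalar tensor raises this error and that reducing the loss to a scalar avoids it.

```python
import torch

# Report the installed version (the second thing asked above).
print("torch version:", torch.__version__)

# Hypothetical stand-ins for a prediction and its target; NOT PatchFusion's loss.
pred = torch.rand(2, 1, 4, 4, requires_grad=True)
target = torch.rand(2, 1, 4, 4)

# An unreduced, per-pixel loss keeps the full tensor shape.
loss = (pred - target).abs()
print("loss shape:", tuple(loss.shape))

# backward() without an explicit gradient argument only works on scalars.
try:
    loss.backward()
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print("error:", error_message)

# Reducing to a 0-dim tensor (here with mean) makes backward() valid.
scalar_loss = loss.mean()
scalar_loss.backward()
print("grad shape:", tuple(pred.grad.shape))
```

In the actual training run, printing the shape of the tensor just before it reaches the optimizer wrapper's `update_params` call should show whether the fusion loss still carries a batch or spatial dimension. Whether `mean()` or `sum()` is the right reduction depends on how the loss is meant to be weighted, so treat the reduction above as an assumption.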