zhyever / PatchFusion

[CVPR 2024] An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation
https://zhyever.github.io/patchfusion/
MIT License
926 stars, 62 forks

CUDA out of memory - #31

Closed lulu1315 closed 3 months ago

lulu1315 commented 3 months ago

Hello, I have an NVIDIA card with 12 GB of memory, and generating depth with your current code gives me an "out of memory" error. I'm using your example command with --cai-mode r128; the test image is 1920x1080 pixels:

python3 ./tools/test.py configs/patchfusion_zoedepth/zoedepth_patchfusion_u4k.py --ckp-path Zhyever/patchfusion_zoedepth --cai-mode r128 --cfg-option general_dataloader.dataset.rgb_image_dir='./test/' --save --work-dir ./work_dir/predictions --test-type general --image-raw-shape 1080 1920 --patch-split-num 2 2

I managed to generate a depth map on the same card using your previous code with this command:

python3 ./infer_user.py --model zoedepth_custom --ckp_path nfs/patchfusion_u4k.pt --model_cfg_path ./zoedepth/models/zoedepth_custom/configs/config_zoedepth_patchfusion.json --rgb_dir /mnt/Projets/P41/depth/ --show --show_path /mnt/Projets/P41/PatchFusion/ --mode r128 --boundary 0 --blur_mask

I tried different parameters with your new code, but I always get the same error.

What would be the right parameters to stay under 12 GB?

Thank you in advance for any advice, and thank you for your code :)

zhyever commented 3 months ago

Hi, thanks for the report. I noticed that I forgot to document one argument: --process-num. The new codebase processes patches in batches, and the default of 4 speeds up inference at the cost of higher memory use.
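To illustrate why the batch size drives peak memory, here is a minimal sketch of chunked patch inference. It is not PatchFusion's actual implementation: `infer_in_chunks` and `run_model` are hypothetical names, and the only assumption is that the model sees at most `process_num` patches at once.

```python
from typing import Callable, List, Sequence, TypeVar

P = TypeVar("P")  # a patch (hypothetical type, e.g. an image tile)
R = TypeVar("R")  # a per-patch result (e.g. a depth tile)


def infer_in_chunks(
    patches: Sequence[P],
    run_model: Callable[[Sequence[P]], List[R]],
    process_num: int,
) -> List[R]:
    """Run `run_model` on at most `process_num` patches at a time.

    `run_model` is a hypothetical stand-in for one batched forward pass.
    Only one chunk is in flight at any moment, so a smaller `process_num`
    lowers peak memory but needs more forward passes.
    """
    if process_num < 1:
        raise ValueError("process_num must be >= 1")
    results: List[R] = []
    for start in range(0, len(patches), process_num):
        chunk = patches[start:start + process_num]
        results.extend(run_model(chunk))
    return results
```

With 16 patches, `process_num=4` needs 4 forward passes of 4 patches each, while `process_num=2` needs 8 passes of 2 patches each.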

I'm changing the default from 4 to 2 now, because 12 GB is already a lot of memory... You can also adjust --process-num yourself to find a value that suits your machine.

Here is the updated command:

python3 ./tools/test.py configs/patchfusion_zoedepth/zoedepth_patchfusion_u4k.py --ckp-path Zhyever/patchfusion_zoedepth --cai-mode r128 --cfg-option general_dataloader.dataset.rgb_image_dir='./test/' --save --work-dir ./work_dir/predictions --test-type general --image-raw-shape 1080 1920 --patch-split-num 2 2 --process-num 2
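One way to look for a suitable --process-num is to start high and halve the value whenever the run does not fit. The sketch below assumes a hypothetical `try_run(n)` callable that performs one inference with batch size `n` and raises `MemoryError` when it runs out of memory. PyTorch reports CUDA out-of-memory as a `RuntimeError`, so a real wrapper would have to translate that itself.

```python
from typing import Callable


def find_process_num(try_run: Callable[[int], None], start: int = 4) -> int:
    """Return the first batch size, halving from `start`, that fits in memory.

    `try_run` is a hypothetical callable: it runs one inference with the
    given batch size and raises MemoryError if the run does not fit.
    """
    if start < 1:
        raise ValueError("start must be >= 1")
    n = start
    while n >= 1:
        try:
            try_run(n)
            return n
        except MemoryError:
            n //= 2  # halve the batch size and try again
    raise MemoryError("even a batch size of 1 does not fit")
```

Halving keeps the number of attempts small. A linear search from `start` down to 1 would find a slightly larger working value in some cases, at the cost of more failed runs.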

lulu1315 commented 3 months ago

Thank you. It's working now! :)

lulu1315 commented 3 months ago

Last question: in the previous version you had the --show option, which would output an 8-bit grayscale PNG version of the depth. Is this option still available somewhere?

Got it: --gray-scale does the trick. Sorry for the noise.
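For readers who want to build an 8-bit grayscale image from a saved depth map themselves, a common approach is min-max normalization to the 0..255 range. The sketch below is an assumption about how such a conversion can be done, not a description of what --gray-scale does internally; `depth_to_uint8` is a hypothetical helper working on plain nested lists.

```python
from typing import List, Sequence


def depth_to_uint8(depth: Sequence[Sequence[float]]) -> List[List[int]]:
    """Map a 2D depth map to integers in 0..255 by min-max normalization.

    The nearest value maps to 0 and the farthest to 255. A constant depth
    map has no range to stretch, so it maps to all zeros.
    """
    flat = [v for row in depth for v in row]
    if not flat:
        return [[] for _ in depth]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0 for _ in row] for row in depth]
    return [[int(round(255 * (v - lo) / span)) for v in row] for row in depth]
```

Min-max scaling discards metric scale: two images with different depth ranges both fill 0..255, so the grayscale output is only suitable for visualization.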