Closed — leejaeyong7 closed this issue 2 years ago
Hi, input images are 2400x1600, output depths are 1200x800
Thanks!
@jzhangbs can you provide the running script with the specific parameters (probability thresholds, number of consistent views, etc.)?
Hi @TruongKhang
depth inference:
--num_src 20 \
--max_d 512 \
--interval_scale
depth fusion:
--view 20 \
--vthresh 2 \
--pthresh 0.1,0.1,0
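For readers unfamiliar with these fusion flags, here is a minimal sketch of how such thresholds are typically applied during point-cloud fusion. The inputs (`stage_probs`, per-stage probability maps, and `consist_count`, a per-pixel count of geometrically consistent source views) are hypothetical names, not Vis-MVSNet's actual variables; the exact meaning of the three `--pthresh` values is an assumption here.

```python
import numpy as np

def fuse_mask(stage_probs, consist_count, pthresh=(0.1, 0.1, 0.0), vthresh=2):
    """Keep a pixel if every stage's probability clears its threshold
    (--pthresh) and at least vthresh source views agree geometrically
    (--vthresh). Sketch only; names are illustrative."""
    keep = np.ones(consist_count.shape, dtype=bool)
    for prob, thresh in zip(stage_probs, pthresh):
        keep &= prob >= thresh          # per-stage probability gate
    keep &= consist_count >= vthresh    # minimum number of consistent views
    return keep
```

Pixels failing either test are dropped before the surviving depths are back-projected and merged into the final point cloud.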
@jzhangbs thank you so much for your response!
Can you provide your preprocessed pair.txt file? The number of views for each scene in the ETH3D high-resolution dataset is relatively small; some scenes even have fewer than 10 views.
I tried your depth inference arguments, but I ran into an out-of-memory error when running depth inference on a 3090 GPU. It seems num_src=20 makes the model consume a huge amount of GPU memory. Is it possible to set num_src to a smaller value? I was only able to train the model when I set num_src to a number smaller than 4.
@TruongKhang Terribly sorry for the delay; GitHub didn't notify me of your reply.
The general dataloader should support the case where the total number of sources listed in pair.txt is less than the number of sources given in the argument; the program will then simply use all the available views. You can also use the improved fusion code, which supports this situation as well: https://github.com/jzhangbs/pcd-fusion
@xy-guo Try reducing the spatial resolution or the number of depth hypotheses (max_d). In inference we calculate the cost volumes one by one, so this part does not grow with the number of sources.
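The "one by one" point can be sketched as a running-sum aggregation: only one per-view cost volume is alive at a time, so peak memory is independent of the number of source views. The `per_view_cost` function below is a hypothetical stand-in for the real homography-warp-and-match step, not Vis-MVSNet's implementation.

```python
import numpy as np

def per_view_cost(ref_feat, src_feat, num_d):
    # placeholder cost: one identical slice per depth hypothesis;
    # a real implementation would warp src_feat at each depth plane
    return np.stack([np.abs(ref_feat - src_feat)] * num_d)

def aggregate_cost(ref_feat, src_feats, num_d):
    """Accumulate per-view cost volumes sequentially. Peak memory is
    one volume of shape (num_d, H, W), regardless of len(src_feats)."""
    acc = np.zeros((num_d,) + ref_feat.shape, dtype=np.float32)
    for src in src_feats:           # one temporary volume at a time
        acc += per_view_cost(ref_feat, src, num_d)
    return acc / len(src_feats)
```

Memory pressure then comes from the volume size itself (spatial resolution × num_d), which is why reducing those two knobs, rather than num_src, is the effective fix at inference time.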
Hi,
Thanks for sharing this great work. I was wondering what image resolution was used for Vis-MVSNet inference in the ETH3D high-res evaluation?
Thanks!