Open Twilight89 opened 1 year ago
The "hero model" with metadata that we provide only supports 7 source views (8 views total). To use the repo with another number of source views you'll have to either
Here's a longer explanation for 3. Say you have generated a tuple of size 4, but your model uses 8. Duplicate source views using the three you already have. Concretely:
Tuple of four:
frame_99, frame_80, frame_76, frame_70
For your tuple of eight, just randomly duplicate source views from the tuple of four: frame_99, frame_80, frame_76, frame_70, frame_70, frame_76, frame_76, frame_80
The tuple generation scripts already do some version of this duplication for test tuples when there aren't enough frames in the list. You can insert a little flag in there to force this behavior. Change n_measurement_frames at this line to three for three source views (four views in total). The block at this line will perform the random sample repeat for you.
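The duplication described above can be sketched in a few lines of Python. This is a minimal illustration, not the repo's tuple generation code: the function name pad_tuple and its arguments are hypothetical, and it assumes the first entry of the tuple is the reference view, so only the remaining entries (the source views) are repeated.

```python
import random


def pad_tuple(frames, target_size, seed=None):
    """Pad a frame tuple to target_size by repeating random source views.

    Assumption: frames[0] is the reference view and frames[1:] are the
    source views, so only frames[1:] are ever duplicated.
    """
    if len(frames) < 2:
        raise ValueError("need a reference view and at least one source view")
    if len(frames) > target_size:
        raise ValueError("tuple is already longer than target_size")

    rng = random.Random(seed)  # seeded so the padding is reproducible
    sources = frames[1:]
    n_missing = target_size - len(frames)
    extra = [rng.choice(sources) for _ in range(n_missing)]
    return list(frames) + extra


four = ["frame_99", "frame_80", "frame_76", "frame_70"]
eight = pad_tuple(four, target_size=8, seed=0)
print(eight)
```

The original four frames stay at the front in their original order, and the four extra slots are filled with repeats drawn from the three source views, which matches the hand-written example above.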
Thanks for your detailed and early reply! I have tried using the dot product model and successfully generated pred_depth with 2 images in a tuple. So the model contains two parts, metadata and 2D CNN, right? But if I want to train a new HERO_MODEL, do I also have to train both parts, since it is end-to-end?
Great!
For the hero model, yes. You'd need to retrain the entire model end to end.
To elaborate on that: the feature volume's MLP does collapse features down to a single value, and from all my experiments, the MLP does end up learning some form of "higher is better" score.
Maybe you can get away with training a 4 view volume with a frozen 8 view decoder network? You'd need to try it. Do let us know if you do! Would be cool to learn from that.
@mohammed-amr Hi, thanks again for your elaboration. I'll try to do that! I also want to see whether simplerecon can run on a mobile device (as you say in the paper, it potentially enables use in embedded and resource-constrained environments). What I want to do is lower the memory usage and inference time. That's why I want to try 2 views in a tuple. I have two questions now:
Could you give me some suggestions about them? I will try my best!
Hi, thanks for your wonderful work and ready-to-use code!
In my case, I want to use an arbitrary number of views in a tuple, anywhere from 2 to 8. I noticed that in your original paper you ran an experiment testing the influence of the number of views.
So do I have to retrain a model for each particular number of views in a tuple? Or do I just need one model, like HERO_MODEL, to test different num_images_in_tuple values?
I tried directly changing model_num_views in options.py (of course, I generated the data_split file with num_images_in_tuple: 2), but when I run it, the terminal shows Number of source views: 7 and I get the shape error below.
########################## Using FeatureVolumeManager ##########################
Number of source views: 7
Using all metadata.
Number of channels: [202, 128, 128, 1]
################################################################################
0%| | 0/37 [00:27<?, ?it/s]
0%| | 0/1 [00:27<?, ?it/s]
Traceback (most recent call last):
  File "/root/simplerecon/test.py", line 473, in <module>
    main(opts)
  File "/root/simplerecon/test.py", line 270, in main
    outputs = model(
  File "/root/.pyenv/versions/simplerecon/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/simplerecon/experiment_modules/depth_model.py", line 361, in forward
    cost_volume, lowest_cost, _, overall_mask_bhw = self.cost_volume(
  File "/root/.pyenv/versions/simplerecon/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/simplerecon/modules/cost_volume.py", line 360, in forward
    self.build_cost_volume(
  File "/root/simplerecon/modules/cost_volume.py", line 727, in build_cost_volume
    feature_b1hw = self.mlp(
  File "/root/.pyenv/versions/simplerecon/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/simplerecon/modules/networks.py", line 147, in forward
    return self.net(x)
  File "/root/.pyenv/versions/simplerecon/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.pyenv/versions/simplerecon/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/root/.pyenv/versions/simplerecon/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.pyenv/versions/simplerecon/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
  File "/root/.pyenv/versions/simplerecon/lib/python3.9/site-packages/torch/nn/functional.py", line 1848, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (393216x46 and 202x128)
So I assume that for each different number of views, I have to train a separate model to match the size. Is that right?
Hope to hear from you! THX:)
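A quick sanity check on the numbers in the error message may help explain the mismatch. The sketch below is only arithmetic on the two widths in the log: 202 input channels for the pretrained MLP (7 source views) and 46 channels actually fed in (a tuple of 2 images, i.e. 1 source view). It assumes the MLP input width grows linearly with the number of source views; that is an inference from these two numbers, not something taken from the repo, and the function name mlp_input_width is hypothetical.

```python
# Two (source_views, input_width) observations taken from the log above.
# Assumption: the width is a linear function of the number of source views.
OBSERVED_HERO = (7, 202)   # pretrained MLP: "Number of channels: [202, ...]"
OBSERVED_PAIR = (1, 46)    # tuple of 2 images -> 1 source view -> 46 channels

# Fit width = per_view * n_source_views + fixed through the two points.
per_view = (OBSERVED_HERO[1] - OBSERVED_PAIR[1]) // (OBSERVED_HERO[0] - OBSERVED_PAIR[0])
fixed = OBSERVED_PAIR[1] - per_view * OBSERVED_PAIR[0]


def mlp_input_width(n_source_views):
    """Input width implied by the linear fit (hypothetical helper)."""
    return per_view * n_source_views + fixed


for n in range(1, 8):
    width = mlp_input_width(n)
    status = "matches" if width == OBSERVED_HERO[1] else "does not match"
    print(f"{n} source view(s): width {width} {status} the 202-wide first layer")
```

Under that assumption, only 7 source views produce the 202-wide input the pretrained first layer expects, which is consistent with the replies in this thread: either pad the tuple back up to 8 views by duplicating source views, or retrain with the smaller view count.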