paulchhuang / bstro

The official code for BSTRO in paper: Capturing and Inferring Dense Full-Body Human-Scene Contact, CVPR2022

DEMO Error : can't run ref_vertices.expand(batch_size, -1, -1) #3

Closed oscarfossey closed 2 years ago

oscarfossey commented 2 years ago

Hi, I really liked your paper and wanted to try out the demo. I ran the exact line from DEMO.md and got the following error:

Traceback (most recent call last):
  File "/home/oscar/Workspace/bstro/bstro/./metro/tools/demo_bstro.py", line 302, in <module>
    main(args)
  File "/home/oscar/Workspace/bstro/bstro/./metro/tools/demo_bstro.py", line 296, in main
    run_inference(args, _bstro_network, smpl, mesh_sampler)
  File "/home/oscar/Workspace/bstro/bstro/./metro/tools/demo_bstro.py", line 88, in run_inference
    _, _, pred_contact = BSTRO_model(images, smpl, mesh_sampler)
  File "/home/oscar/anaconda3/envs/bstro2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/oscar/Workspace/bstro/bstro/metro/modeling/bert/modeling_bstro.py", line 203, in forward
    ref_vertices = ref_vertices.expand(batch_size, -1, -1)
RuntimeError: The expanded size of the tensor (1) must match the existing size (30) at non-singleton dimension 0.  Target sizes: [1, -1, -1].  Tensor sizes: [30, 431, 3]

My setup: Python 3.10.5 Pytorch 1.11.0 torchvision 0.12.0 cuda 11.3.1

paulchhuang commented 2 years ago

Hi,

Unfortunately I can't reproduce the error you're seeing. Before line 203, the tensor ref_vertices should have shape [1, 431, 3], which is expanded to [batch_size, 431, 3] by ref_vertices = ref_vertices.expand(batch_size, -1, -1).
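For reference, a minimal standalone sketch of how torch.expand behaves here (shapes taken from the traceback, dummy tensors rather than BSTRO code):

```python
import torch

batch_size = 1  # the demo supports only batch_size=1

# Expected case: a singleton dim 0 can be expanded freely
ref_vertices = torch.zeros(1, 431, 3)
out = ref_vertices.expand(batch_size, -1, -1)
print(out.shape)  # torch.Size([1, 431, 3])

# Reported case: dim 0 is already 30 (non-singleton), so expand raises
bad = torch.zeros(30, 431, 3)
try:
    bad.expand(batch_size, -1, -1)
except RuntimeError as err:
    print(type(err).__name__)  # RuntimeError, as in the traceback above
```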

Tensor sizes: [30, 431, 3] looks like batch_size=30, but the current demo code supports only batch_size=1, which is puzzling.

Aligning the dependencies' versions may help.

Thanks!

oscarfossey commented 2 years ago

Hi, thanks for answering. I managed to make it run by forcing some dimensions to 1. There is a pseudo batch_size = 30 all over the repository; when I set those to 1, everything worked well.

I have another question: for the demo we run BSTRO with those arguments:

It seems num_hidden_layers and num_attention_heads can go up to 12. Is it possible to change these arguments and still run the demo? Which architecture gave the best results in your benchmarks?

hosseinfeiz commented 2 years ago

Hi, I'm having the same issue here. I tried changing the batch_size parameters as you said, but I'm still receiving the same error. Can you share your working fork? Thanks

oscarfossey commented 2 years ago

Hi, for the "debatchification", look at the last two commits on my forked repo. The code is ugly but works for me. Also, I didn't manage to display the outputs side by side, so I render them directly with trimesh.

https://github.com/oscarfossey/bstro

paulchhuang commented 2 years ago

Hi, a few quick replies below:

  1. I got a chance to test the whole installation from scratch on another clean-state machine, following docs/INSTALL.md. The code still runs without hitting this pseudo batch_size = 30 error. My best guess now is that this is an environment-dependent issue. I'll try @oscarfossey's setup next time. Can @hosseinfeiz share their setup as well? Until we have a generic solution, I'll add a FAQ pointing to @oscarfossey's reply above. Is that OK?
  2. I didn't experiment with different --num_hidden_layers and --num_attention_heads; these parameters follow the setup in METRO.
oscarfossey commented 2 years ago

Hi,

OK for me, thanks for the precise answers.

paulchhuang commented 2 years ago

And one point I forgot:

  3. contact_vis.obj is the final visualization generated by the code. The image in the instructions was made by loading it in MeshLab, taking screenshots, and putting them side by side. I should make this clearer in the instructions.
qinb commented 2 years ago

I also ran into this problem. In my case, the cause is that the SMPL shapedirs dimension is 300 instead of 10.
So an easy fix is to change self.shapedirs.view(-1,10) to self.shapedirs[:, :, :10].view(-1,10) in https://github.com/paulchhuang/bstro/blob/main/metro/modeling/_smpl.py#L74
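To illustrate with a dummy tensor (a standalone sketch, not the actual SMPL data): a 300-component shapedirs reshaped with view(-1, 10) yields 30x as many rows as a 10-component one, which is where the phantom batch of 30 comes from. Slicing to the first 10 components before the view restores the expected shape:

```python
import torch

V = 6890  # SMPL vertex count
shapedirs_300 = torch.zeros(V, 3, 300)  # 300-component shape space

flat = shapedirs_300.view(-1, 10)
print(flat.shape[0] // (V * 3))  # 30 -- the spurious factor, 300 / 10

# The fix: keep only the first 10 shape components before the view
flat_fixed = shapedirs_300[:, :, :10].view(-1, 10)
print(flat_fixed.shape)  # torch.Size([20670, 10]), i.e. [V * 3, 10]
```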

paulchhuang commented 2 years ago

Hi, I confirm I can reproduce the reported error using a 300-shapedir SMPL model, and @qinb's workaround solves the issue. In a nutshell, 300 components after view(-1, 10) lead to a virtual batch_size of 30 = 300/10.

I followed the instructions in METRO and prepared the BSTRO instructions the same way; I didn't expect users to reuse their existing SMPL model files. If @hosseinfeiz and @oscarfossey can confirm this addresses their issue, I'll quickly push a hotfix. Big thanks to @qinb for the pointer!

qinb commented 2 years ago

@paulchhuang Hi, could you give some advice on your other repo? https://github.com/muelea/selfcontact/issues/9

  1. run_selfcontact_optimization.py runs very slowly; besides decreasing the maxiter parameter, is there another way to speed it up?
  2. ProHMR serves as a pose estimator; can the selfcontact repo be merged with ProHMR for joint training, for example with selfcontact serving as a loss supervisor?

Thanks again