KelianB / SPARK

Official implementation for the SIGGRAPH Asia 2024 paper SPARK: Self-supervised Personalized Real-time Monocular Face Capture

Expected 2D or 3D (batch mode) tensor with possibly 0 batch size and other non-zero dimensions for input, but got: [1, 3, 0] #7

Open alphazhugit opened 1 week ago

alphazhugit commented 1 week ago

When processing my custom data and running the last step (python make_comparison_video.py --config configs/example_smirk.txt), I encountered the following error:

File "SPARK/TrackerAdaptation/adapt/general_utils.py", line 94, in apply_featurewise_conv1d
    padded_signal = torch.nn.functional.pad(signal, (padding, padding), mode=pad_mode)
RuntimeError: Expected 2D or 3D (batch mode) tensor with possibly 0 batch size and other non-zero dimensions for input, but got: [1, 3, 0]

I don't currently have much of a clue what's going wrong, and I could use some help.

KelianB commented 1 week ago

That last dimension being zero (in "[1, 3, 0]") means that it's trying to process 0 frames. Have you checked the outputs of the previous steps, e.g. MultiFLARE training or even MonoFaceCompute outputs?
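
For reference, the error itself comes straight from PyTorch's padding op once the frame dimension is empty. A minimal sketch that reproduces it, assuming pad_mode is a non-constant mode such as "replicate" (the shapes here are only illustrative):

```python
import torch

# A (1, 3, 0) tensor: batch of 1, 3 channels, 0 frames along the last dimension.
signal = torch.zeros(1, 3, 0)
padding = 2

# Raises: RuntimeError: Expected 2D or 3D (batch mode) tensor with possibly
# 0 batch size and other non-zero dimensions for input, but got: [1, 3, 0]
torch.nn.functional.pad(signal, (padding, padding), mode="replicate")
```

So if zero frames are loaded anywhere upstream, the failure only surfaces at this padding call rather than at the loading step.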

alphazhugit commented 1 week ago

I carefully reviewed the previous steps and they all appear to be correct. This includes the MonoFaceCompute step, which successfully generates 6 folders of resources. Likewise, after completing all the training steps, I obtain resources such as "network_weights", "DECA_MultiFLARE", "SMIRK_MultiFLARE", etc. The issue only arises at the final make_comparison_video.py step. So this step is not that crucial, right? It's just a preview? Also, I can see the final generated .obj model, but I couldn't find the albedo texture. Is it generated? Or could you upload your test data?

KelianB commented 1 week ago

The make_comparison_video.py script generates a video with the results of multiple methods side-by-side, for qualitative evaluation. While not critical, it's still important as it's the main way to visualize the final tracking result. Would you mind sending your MultiFLARE and TrackerAdaptation configs? I haven't managed to replicate this issue on my end.

As for the albedo, I haven't taken the time to export it with the mesh for now. I think it shouldn't be too hard to implement:

  • Rasterize the mesh UVs at the desired resolution,
  • For each (u,v) point, interpolate the 3D position (x,y,z),
  • Query the material MLP at (x,y,z), returning albedo, roughness and specular intensity,
  • Write the output at position (u,v) in a texture map.
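
A rough sketch of that baking loop, assuming an nvdiffrast rasterizer and a material_mlp callable that maps canonical positions to (albedo, roughness, specular); the function and parameter names are illustrative, not the repo's actual API:

```python
import torch
import nvdiffrast.torch as dr

@torch.no_grad()
def bake_albedo(glctx, uvs, faces, verts, material_mlp, res=1024):
    # Assumed inputs (all CUDA tensors): uvs (V, 2) float32 in [0, 1],
    # faces (F, 3) int32, verts (V, 3) float32 canonical positions,
    # material_mlp: callable mapping (N, 3) positions -> (albedo, roughness, specular).

    # 1) Rasterize the mesh in UV space: lift each UV coordinate to clip space.
    uv_clip = torch.cat([uvs * 2.0 - 1.0,
                         torch.zeros_like(uvs[:, :1]),
                         torch.ones_like(uvs[:, :1])], dim=-1)[None]   # (1, V, 4)
    rast, _ = dr.rasterize(glctx, uv_clip, faces, resolution=[res, res])

    # 2) Interpolate the 3D position (x, y, z) at every covered texel.
    pos, _ = dr.interpolate(verts[None], rast, faces)                  # (1, H, W, 3)

    # 3) Query the material MLP at those positions.
    albedo, roughness, specular = material_mlp(pos.reshape(-1, 3))

    # 4) Write the albedo into the texture map; texels covered by no triangle
    #    (triangle id 0 in the rasterizer output) stay black.
    mask = (rast[..., 3:] > 0).float()
    return albedo.reshape(1, res, res, 3) * mask
```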

However, the albedo is going to look quite rough, since the aim of this method is geometry tracking, not detailed appearance. Also, to render the avatar you would need to export the lighting as an env map and implement the shading in the rendering software you're using. This would require a bit of work.

alphazhugit commented 5 days ago

Thank you very much for your response. I actually don't mind sending you my configuration files, but due to network restrictions I'm unable to upload any files from my computer to the internet. However, I can tell you that for the MultiFLARE and TrackerAdaptation configurations, I only modified the input_dir and output_dir in the example.txt file. The input_dir points to the output folder generated by MonoFaceCompute. For MonoFaceCompute, I used six 5-second clips of Tucker Carlson, and its output doesn't show any obvious errors. In the TrackerAdaptation config files, I only changed the multiflare parameter at the top to point to my MultiFLARE config file. If you need more information, I'd be happy to provide it.

alexrogozea commented 1 day ago

This happens when the --n_frames argument requests more frames than your test_dirs actually contain.
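
A hypothetical pre-flight check for anyone hitting the same error (directory layout, frame file pattern, and function name are illustrative, not from the repo), which fails with an explicit message instead of the opaque padding error:

```python
from pathlib import Path

def check_test_dirs(test_dirs, n_frames, pattern="*.png"):
    # Verify each test directory holds at least n_frames frames before
    # running make_comparison_video.py.
    for d in test_dirs:
        found = len(list(Path(d).glob(pattern)))
        if found < n_frames:
            raise ValueError(f"{d}: found {found} frames, but n_frames={n_frames}")
```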