andrewsonga / Total-Recon

[ICCV 2023] Total-Recon: Deformable Scene Reconstruction for Embodied View Synthesis
https://andrewsonga.github.io/totalrecon

When will you release the training script #5

Closed s0rrymaker77777 closed 1 year ago

s0rrymaker77777 commented 1 year ago

Hi, thanks for your great work. I want to know when you plan to release the training script and more details. Btw, when I try 'Render Meshes for Reconstructed Objects, Egocentric Camera (Blue), and 3rd-Person-Follow Camera (Yellow)', --render_cam does not seem to be a recognized flag.

andrewsonga commented 1 year ago

The paper was recently accepted to ICCV 2023, so I am planning to release the training script, full dataset, and full suite of pretrained models after the camera-ready deadline.

As for the commands in 'Render Meshes for Reconstructed Objects, Egocentric Camera (Blue), and 3rd-Person-Follow Camera (Yellow)', you are correct. --render_cam is not a flag - it should have been written as either --render_cam_inputview or --render_cam_stereoview. Thank you so much for pointing this out. It will be fixed immediately.

s0rrymaker77777 commented 1 year ago

Thanks for your reply. I have another question about rendering: when I try egocentric view synthesis, the results look strange. Please see the attached image.

andrewsonga commented 1 year ago

I'll get on this after the ICCV camera-ready deadline, but it seems like the camera is either looking at the surface of the dog or actually lies inside the dog. My guess is that the command line parameters in add_args have not been passed to the view synthesis script properly, but I don't have enough bandwidth to debug this right now.
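One generic way to sanity-check the "camera inside the dog" hypothesis (my own sketch, not Total-Recon code; all names here are hypothetical) is to test whether the rendered camera's position lies inside the reconstructed mesh, using ray-casting parity against the mesh triangles:

```python
import numpy as np

def ray_hits_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore ray/triangle intersection; True on a forward hit."""
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direction, e2)
    det = e1.dot(p)
    if abs(det) < eps:
        return False          # ray parallel to the triangle's plane
    inv = 1.0 / det
    s = origin - v0
    u = s.dot(p) * inv
    if u < 0 or u > 1:
        return False
    q = np.cross(s, e1)
    v = direction.dot(q) * inv
    if v < 0 or u + v > 1:
        return False
    return e2.dot(q) * inv > eps   # intersection strictly in front of origin

def point_inside_mesh(point, triangles):
    """Parity test: a point is inside a closed mesh iff a ray cast from it
    crosses the surface an odd number of times."""
    direction = np.array([1.0, 0.0, 0.0])
    hits = sum(ray_hits_triangle(point, direction, *tri) for tri in triangles)
    return hits % 2 == 1
```

If `point_inside_mesh(cam_position, fg_triangles)` returns True for the rendered egocentric camera, the camera placement (rather than the reconstruction itself) is the likely culprit.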

andrewsonga commented 1 year ago

@s0rrymaker77777 I've finally found the time to debug 1) Egocentric View Synthesis and 2) 'Render Meshes for Reconstructed Objects, Egocentric Camera (Blue), and 3rd-Person-Follow Camera (Yellow)'.

  1. For egocentric view synthesis, I ran the same commands as outlined in the README.md and was able to render the following:

[rendered output video]

The exact commands I ran were as follows:

gpu=0; seqname=humandog-stereo000-leftcam-jointft; add_args='--fg_obj_index 1 --asset_obj_index 1 --fg_normalbase_vertex_index 96800 --fg_downdir_vertex_index 1874 --asset_scale 0.003 --firstpersoncam_offset_z 0.05 --firstpersoncam_offsetabt_xaxis 15 --firstpersoncam_offsetabt_zaxis 0 --asset_offset_z -0.05 --scale_fps 0.50'

bash scripts/render_nvs_fgbg_fps.sh $gpu $seqname $add_args

Try running these commands again and let me know if you still have problems. Also, make sure you have already run bash extract_fgbg.sh $gpu $seqname, which extracts the object-level meshes and root-body poses from the trained model and is required for embodied view synthesis.
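For intuition, here is a hedged sketch of what flags like --fg_normalbase_vertex_index, --firstpersoncam_offset_z, and --firstpersoncam_offsetabt_xaxis appear to parameterize: anchoring a first-person camera at a mesh vertex, orienting it along the vertex normal, offsetting it along that normal, and tilting it about its own x-axis. The function and argument names below are my own illustration, not the repo's API:

```python
import numpy as np

def rot_x(deg):
    """Rotation matrix about the x-axis, in degrees."""
    r = np.radians(deg)
    c, s = np.cos(r), np.sin(r)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c,  -s ],
                     [0.0, s,   c ]])

def egocentric_cam_pose(vertex, normal, down, offset_z=0.05, tilt_deg=15.0):
    """Build a 4x4 camera-to-world pose: the camera sits offset_z along the
    vertex normal, looks along that normal, and is tilted about its x-axis.
    `down` fixes the camera roll (cf. the fg_downdir_vertex_index flag)."""
    z = normal / np.linalg.norm(normal)     # viewing direction
    y = down - down.dot(z) * z              # project "down" off the view axis
    y = y / np.linalg.norm(y)
    x = np.cross(y, z)
    R = np.stack([x, y, z], axis=1) @ rot_x(tilt_deg)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = vertex + offset_z * z
    return T
```

Under this reading, --firstpersoncam_offset_z 0.05 pushes the camera 5 cm off the surface along the normal, which is one way to avoid the camera ending up inside the mesh.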

  2. For rendering the meshes of reconstructed objects and embodied cameras, the README.md was missing a few important flags in add_args for the humandog sequence. I have updated add_args and was able to render the following video:

https://github.com/andrewsonga/Total-Recon/assets/20153928/63ca7265-c42d-46d7-a7d4-2638faa8a888

s0rrymaker77777 commented 1 year ago

Thanks for your reply. I am reading your code these days. I want to ask whether you use the losses in scene.py rather than the ones in banmo.py. Thanks.

andrewsonga commented 1 year ago

@s0rrymaker77777 The losses in banmo.py are used during the pretraining stage of the object fields, and the losses in scene.py are used during the joint-finetuning stage. Note that during joint finetuning we only use the color, depth, and flow reconstruction losses, as well as the per-object cycle-consistency losses (all other losses, such as the silhouette, entropy, projection, and feature losses, are not used even though they are implemented in scene.py).
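To make that division concrete, here is a minimal sketch (my own illustration, not the repo's code; weights and names are hypothetical) of a joint-finetuning objective as described: color, depth, and flow reconstruction terms plus per-object cycle-consistency penalties, with the silhouette/entropy/projection/feature terms omitted:

```python
import numpy as np

def l2(pred, gt):
    """Mean squared error between a prediction and its ground truth."""
    return float(np.mean((pred - gt) ** 2))

def joint_finetune_loss(pred, gt, cyc_errors,
                        w_rgb=1.0, w_depth=1.0, w_flow=1.0, w_cyc=1.0):
    """Weighted sum of the losses used at joint finetuning: color, depth,
    and optical-flow reconstruction, plus per-object 3D cycle-consistency
    residuals. All other losses are deliberately left out of this sum."""
    return (w_rgb   * l2(pred["rgb"],   gt["rgb"])
          + w_depth * l2(pred["depth"], gt["depth"])
          + w_flow  * l2(pred["flow"],  gt["flow"])
          + w_cyc   * sum(float(np.mean(e ** 2)) for e in cyc_errors))
```

Here `cyc_errors` would hold one residual array per object, matching the "per-object" qualifier above.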

All of this will be elaborated in the upcoming training-code release, which will happen sometime before next Monday.

andrewsonga commented 1 year ago

@s0rrymaker77777 The training code, the full dataset, and all pre-optimized models have been released.

s0rrymaker77777 commented 1 year ago

Thanks for your help