jmhb0 / view_neti

[ECCV 2024] Viewpoint Textual Inversion: Discovering Scene Representations and 3D View Control in 2D Diffusion Models

The training result is not as expected. #4

Open MnKnight1 opened 11 months ago

MnKnight1 commented 11 months ago

Your work is a very meaningful attempt at 3D generation! I trained on 88 scenes in the DTU dataset, using the training parameters you provide in input_configs/train_m3.yaml. I set the number of training iterations to 60000, but the results are not what I expected. The results at 60k iterations are as follows (I added `cam_idxs=[0, 10, 15, 26, 31, 40, 45, 19]` after line 80 to reduce the size of the output image):

https://github.com/jmhb0/view_neti/blob/8dd8e27efcc06d762deb9a56e8ab81e01a761eaf/training/validate.py#L79-L80

[image: validation-iter_60000-denoisesteps_30_objecttoken__tin__upsample_1_imgs_recon]
[image: validation-iter_60000-denoisesteps_30_objecttoken__skull__upsample_1_imgs_recon]

jmhb0 commented 9 months ago

Hey @MnKnight1, thanks for raising the issue.

I re-trained the model on the 88 DTU scenes with config input_configs/train_m3.yaml. I also updated the inference code to handle mode 3, and checked the results on the two scenes from your screenshot:

python scripts/inference.py --config_path input_configs/inference.yaml --input_dir results/issue4_train_m3_88scenes_aug5_   --iteration 60000   --eval_placeholder_object_tokens '["<scan65>", "<scan97>"]' --seeds '[0]'

The results are a lot more reasonable: [image] [image]

(If you want to double check, I put the models at https://drive.google.com/file/d/1N1vFKia6wSiHXK9GArEzBkulp9TGf48R/view?usp=sharing. Uncompress them into results/issue4_train_m3_88scenes_aug5, and the command above should work.)

It turns out when I first released input_configs/train_m3.yaml, I had data.augmentation_key=7 (which was the best setting for modes 4/5), but it should have been 5. The augmentation key 7 has stronger image augmentations (you can check them in training/dataset.py). Thanks for highlighting the problem.
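For anyone applying the same fix to an older checkout, here is a minimal sketch of the override, assuming the dotted name data.augmentation_key corresponds to a nested `data` → `augmentation_key` entry once the YAML is parsed. The `set_dotted` helper and the `released_cfg` dict are hypothetical illustrations, not part of the repository:

```python
from copy import deepcopy


def set_dotted(cfg: dict, dotted_key: str, value) -> dict:
    """Return a copy of cfg with the nested entry named by dotted_key set to value."""
    out = deepcopy(cfg)
    node = out
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        # Create intermediate sections if they are missing.
        node = node.setdefault(name, {})
    node[leaf] = value
    return out


# Hypothetical stand-in for the parsed contents of the originally released
# input_configs/train_m3.yaml (the nested layout is an assumption).
released_cfg = {"data": {"augmentation_key": 7}}

# Mode 3 should use augmentation key 5, per the comment above.
fixed_cfg = set_dotted(released_cfg, "data.augmentation_key", 5)
print(fixed_cfg["data"]["augmentation_key"])
```

The helper returns a copy rather than mutating its input, so the original setting stays available for comparison runs with key 7.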

BTW, I checked results using augmentation 7, and while the results were worse, they were not as bad as in your screenshot, so I’m not sure what’s going on there … But hopefully this update resolves your problem.