NikolaZubic / 2dimageto3dmodel

We evaluate our method on several datasets (including ShapeNet, CUB-200-2011, and Pascal3D+) and achieve state-of-the-art results, outperforming the other supervised and unsupervised methods and 3D representations in terms of performance, accuracy, and training time.

An Effective Loss Function for Generating 3D Models from Single 2D Image without Rendering

Papers with code | Paper

Nikola Zubić   Pietro Liò

University of Novi Sad   University of Cambridge

AIAI 2021

Citation

In addition to AIAI 2021, our paper appears in the Springer book entitled "Artificial Intelligence Applications and Innovations": link

Please cite our paper if you find this code useful for your research.

@InProceedings{zubic_aiai_2021,
author="Zubi{\'{c}}, Nikola
and Li{\`o}, Pietro",
title="An Effective Loss Function for Generating 3D Models from Single 2D Image Without Rendering",
booktitle="Artificial Intelligence Applications and Innovations (AIAI)",
year="2021",
publisher="Springer International Publishing",
pages="309--322",
}

Prerequisites

The results will be saved to the 2dimageto3dmodel/code/results/ directory.

Continue training

To continue the training process, run the following commands (without --save_results) from the code/ directory of the repository (2dimageto3dmodel/code/):

python main.py --dataset cub --batch_size 16 --weights pretrained_weights_cub

for the CUB Birds Dataset.

python main.py --dataset p3d --batch_size 16 --weights pretrained_weights_p3d

for the Pascal 3D+ Dataset.
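
The two commands above differ only in the dataset name and the matching weights. As a minimal sketch (not part of the repository), the helper below assembles the same argument list for either dataset. The function name `build_continue_command` and the `WEIGHTS` table are hypothetical; the flags and weight names are copied from the commands above.

```python
import shlex
import sys

# Weight names copied from the commands above; the table itself is a
# hypothetical convenience, not something the repository provides.
WEIGHTS = {"cub": "pretrained_weights_cub", "p3d": "pretrained_weights_p3d"}


def build_continue_command(dataset, batch_size=16):
    """Return the argv list that resumes training on `dataset` ('cub' or 'p3d')."""
    if dataset not in WEIGHTS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [
        sys.executable, "main.py",
        "--dataset", dataset,
        "--batch_size", str(batch_size),
        "--weights", WEIGHTS[dataset],
    ]


if __name__ == "__main__":
    for name in WEIGHTS:
        # Only prints the command line; nothing is executed here.
        print(shlex.join(build_continue_command(name)))
```

To actually launch a run, the returned list could be passed to `subprocess.run(cmd, cwd="2dimageto3dmodel/code", check=True)`, assuming the repository is checked out under that path.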

Generation of Pseudo-ground-truths

These reconstruction steps require a trained mesh estimation model: either use the pre-trained model (already provided) or train one from scratch. The pseudo-ground-truth data for CUB birds is generated as follows:

python run_reconstruction.py --name pretrained_reconstruction_cub --dataset cub --batch_size 10 --generate_pseudogt

For the Pascal 3D+ dataset:

python run_reconstruction.py --name pretrained_reconstruction_p3d --dataset p3d --optimize_z0 --batch_size 10 --generate_pseudogt

This replaces the cache directory, which contains pre-computed statistics for the evaluation of Fréchet Inception Distances, pose and image metadata, and the pseudo-ground-truths for each image.
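
For reference, the standard definition of the Fréchet Inception Distance is given below. It compares the mean and covariance of Inception features for real images, $(\mu_r, \Sigma_r)$, with those for generated images, $(\mu_g, \Sigma_g)$. We assume here that the pre-computed statistics mentioned above play the role of $(\mu_r, \Sigma_r)$, as is typical for FID implementations; the symbol names are ours.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one.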

Mesh generator training from scratch

Set up the Pseudo-ground-truth data as described in the section above, then execute the following command:

python main.py --name cub_512x512_class --conditional_class --dataset cub --gpu_ids 0,1,2,3 --batch_size 32 --epochs 1000 --tensorboard

Here, we train a CUB birds model, conditioned on class labels, for 1000 epochs. FID is evaluated every 20 epochs (this interval can be changed with --evaluate_freq). Using a different number of GPUs can produce slightly different results. With --tensorboard, the results are exported to the TensorBoard log directory tensorboard_gan.
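
To gauge how many FID evaluations a run triggers, the schedule can be sketched as follows. This is an illustration only: the function `evaluation_epochs` is hypothetical, and it assumes epochs are counted from 1 and that evaluation happens on every multiple of the interval, which may not match the exact indexing used in main.py.

```python
def evaluation_epochs(total_epochs, evaluate_freq=20):
    """List the epochs at which FID would be evaluated.

    Assumption: epochs are 1-indexed and evaluation runs whenever the
    epoch number is a multiple of `evaluate_freq`.
    """
    if evaluate_freq <= 0:
        raise ValueError("evaluate_freq must be positive")
    return [e for e in range(1, total_epochs + 1) if e % evaluate_freq == 0]


if __name__ == "__main__":
    epochs = evaluation_epochs(1000)
    print(len(epochs), "evaluations; first:", epochs[0], "last:", epochs[-1])
```

Under these assumptions, the 1000-epoch run above performs 50 FID evaluations; a larger interval reduces evaluation overhead at the cost of coarser checkpoint selection.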

After training, we can find the best model's checkpoint with the following command:

python main.py --name cub_512x512_class --conditional_class --dataset cub --gpu_ids 0,1,2,3 --batch_size 64 --evaluate --which_epoch best

Mesh estimation model training

Use the following two commands for training from scratch:

python run_reconstruction.py --name pretrained_reconstruction_cub --dataset cub --batch_size 50 --tensorboard
python run_reconstruction.py --name pretrained_reconstruction_p3d --dataset p3d --optimize_z0 --batch_size 50 --tensorboard

Tensorboard log files are saved in tensorboard_recon.

License

MIT

Acknowledgment

This work builds on the architecture of Insafutdinov & Dosovitskiy.
Poisson Surface Reconstruction is used to transform point clouds into 3D meshes.
The GAN architecture (used for texture mapping) combines Xian's TextureGAN and Li's GAN.