IM-NET-pytorch

PyTorch 1.2 implementation for paper "Learning Implicit Fields for Generative Shape Modeling", Zhiqin Chen, Hao (Richard) Zhang.

Improvements

In short, this repo is an implementation of IM-NET with the framework provided by BSP-NET-pytorch.

The improvements over the original implementation is the same as IM-NET (improved TensorFlow1 implementation):

Encoder:

In IM-AE (autoencoder), changed batch normalization to instance normalization.

Decoder (=generator):

Changed the first layer from 2048-1024 to 1024-1024-1024.
Changed latent code size from 128 to 256.
Removed all skip connections.
Changed the last activation function from sigmoid to clip ( max(min(h, 1), 0) ).

Training:

Trained one model on the 13 ShapeNet categories as most Single-View Reconstruction networks do.
For each category, sort the object names and use the first 80% as training set, the rest as testing set, same as AtlasNet.
Reduced the number of sampled points by half in the training set. Points were sampled on 256³ voxels.
Removed data augmentation (image crops), same as Occupancy Networks.
Added coarse-to-fine sampling for inference to speed up testing.
Added post-processing to make the output mesh smoother. To enable, find and uncomment all "self.optimize_mesh(vertices,model_z)".

Citation

If you find our work useful in your research, please consider citing:

@article{chen2018implicit_decoder,
  title={Learning Implicit Fields for Generative Shape Modeling},
  author={Chen, Zhiqin and Zhang, Hao},
  journal={Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2019}
}

Dependencies

Requirements:

Python 3.5 with numpy, scipy and h5py
PyTorch 1.2
PyMCubes (for marching cubes)

Our code has been tested on Ubuntu 16.04 and Windows 10.

Datasets and pre-trained weights

The original voxel models are from HSP.

The rendered views are from 3D-R2N2.

Since our network takes point-value pairs, the voxel models require further sampling.

For data preparation, please see directory point_sampling.

We provide the ready-to-use datasets in hdf5 format, together with our pre-trained network weights.

IM-NET-pytorch

Backup links:

IM-NET-pytorch (pwd: bqex)

Usage

Please use the provided scripts train_ae.sh, train_svr.sh, test_ae.sh, test_svr.sh to train the network on the training set and get output meshes for the testing set.

To train an autoencoder, use the following commands for progressive training.

python main.py --ae --train --epoch 200 --sample_dir samples/all_vox256_img0_16 --sample_vox_size 16
python main.py --ae --train --epoch 200 --sample_dir samples/all_vox256_img0_32 --sample_vox_size 32
python main.py --ae --train --epoch 200 --sample_dir samples/all_vox256_img0_64 --sample_vox_size 64

The above commands will train the AE model 200 epochs in 16³ resolution, then 200 epochs in 32³ resolution, and finally 200 epochs in 64³ resolution. Training on the 13 ShapeNet categories takes about 3 days on one GeForce RTX 2080 Ti GPU.

After training, you may visualize some results from the testing set.

python main.py --ae --sample_dir samples/im_ae_out --start 0 --end 16

You can specify the start and end indices of the shapes by --start and --end.

To train the network for single-view reconstruction, after training the autoencoder, use the following command to extract the latent codes:

python main.py --ae --getz

Then use the following commands to train the SVR model:

python main.py --svr --train --epoch 1000 --sample_dir samples/all_vox256_img1

After training, you may visualize some results from the testing set.

python main.py --svr --sample_dir samples/im_svr_out --start 0 --end 16

License

This project is licensed under the terms of the MIT license (see LICENSE for details).

czq142857 / IM-NET-pytorch

readme