Xharlie / DISN

(latest updates and bug fixes) DISN: Deep Implicit Surface Network for High-quality Single-view 3D Reconstruction

Training from Scratch #3

Open Mahsa13473 opened 4 years ago

Mahsa13473 commented 4 years ago

Hi there,

Thanks for releasing the code, it's amazing work! I tried to train the network from scratch and followed all the steps mentioned in the README, but I couldn't match the results of the pretrained model.

I was wondering which hyperparameters were used for the pretrained one. Are they the same as the defaults in train_sdf.py? How many epochs did you train to get the best accuracy? Also, which dataset was used for training: the old one, or the new one mentioned in the README?

no-materials commented 4 years ago

Hello, in addition to @Mahsa13473's questions, could you also provide the approximate training time?

Xharlie commented 4 years ago

Hi, by the time we submitted, we used the old dataset, which everyone else used as well. We used an ImageNet-pretrained VGG-16 (provided by the official TensorFlow release), as shown in the command in the README. We haven't tried training everything from scratch yet, since I suspect the dataset itself is not big enough for the encoder to learn 2D image features well.
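For context, the --restore_modelcnn flag loads only the image encoder from that checkpoint. A minimal TF1-style sketch of the idea (not the repo's exact code; the "vgg_16" variable scope name is an assumption):

```python
import tensorflow as tf

# ... build the full DISN graph first, with the image encoder living under
# a "vgg_16" variable scope (assumed name) ...
vgg_vars = [v for v in tf.global_variables() if v.name.startswith("vgg_16/")]
saver_cnn = tf.train.Saver(var_list=vgg_vars)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # random init everywhere
    # then overwrite only the encoder weights with ImageNet-pretrained ones
    saver_cnn.restore(sess, "./models/CNN/pretrained_model/vgg_16.ckpt")
```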

Xharlie commented 4 years ago

The training time can vary from 1 to 3 days depending on your GPU, but I'd say at most 3 days. The bottleneck is on the CPU, since we have to read the SDF ground truth and image h5 files on the fly. So if you have a better CPU, or an SSD for the sdf/img storage, you can train faster.
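To illustrate the loading bottleneck, here is a generic sketch (not DISN's actual loader; the "img"/"sdf" dataset names are hypothetical) of hiding h5 read latency behind a background reader thread:

```python
import queue
import threading

import h5py

def prefetch_batches(h5_paths, depth=8):
    """Yield (img, sdf) arrays while a background thread reads ahead."""
    q = queue.Queue(maxsize=depth)  # bounded, so the reader can't run ahead unchecked
    done = object()                 # sentinel marking end of data

    def reader():
        for path in h5_paths:
            with h5py.File(path, "r") as f:
                q.put((f["img"][:], f["sdf"][:]))  # hypothetical dataset keys
        q.put(done)

    threading.Thread(target=reader, daemon=True).start()
    while True:
        item = q.get()
        if item is done:
            return
        yield item
```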

asurada404 commented 4 years ago

Hi, I'm also training the network from scratch using the pre-trained VGG-16, but I can't get the same result. Did you use the pre-trained VGG-16? @Mahsa13473

Mahsa13473 commented 4 years ago

Hi. Yes, but I couldn't get the same result even with the pretrained VGG-16. I tried a few months ago, though, so I'm not sure how it works with the updated version of the code. @asurada404

JohnG0024 commented 4 years ago

Hello, does anyone know where the pretrained vgg_16.ckpt is?

```
python -u train/train_sdf.py --gpu 0 --img_feat_twostream --restore_modelcnn ./models/CNN/pretrained_model/vgg_16.ckpt --log_dir checkpoint/SDF_JG --category all --num_sample_points 2048 --batch_size 20 --learning_rate 0.0001 --cat_limit 36000
```

gets an error:

```
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./models/CNN/pretrained_model/vgg_16.ckpt
```

asurada404 commented 4 years ago

Download vgg_16.ckpt and save it to ./models/CNN/pretrained_model first. @JohnG0024
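If it helps, a small fetch script; the URL below is the commonly used TF-Slim VGG-16 release and is an assumption here, not something stated in this repo:

```python
import os
import tarfile
import urllib.request

# Assumed TF-Slim checkpoint URL; verify against the README if it 404s.
URL = "http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz"
DEST = "./models/CNN/pretrained_model"

os.makedirs(DEST, exist_ok=True)
archive = os.path.join(DEST, os.path.basename(URL))
urllib.request.urlretrieve(URL, archive)
with tarfile.open(archive, "r:gz") as tar:
    tar.extractall(DEST)  # unpacks vgg_16.ckpt into DEST
```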

JohnG0024 commented 4 years ago

@asurada404 Thanks!

JohnG0024 commented 4 years ago

@Xharlie In your opinion, what's missing in the dataset that makes it unable to understand 2D images perfectly?

asurada404 commented 4 years ago

The VGG is used as an encoder to extract image features. The pre-trained VGG was trained on the ImageNet dataset (more than 14 million images across more than 20,000 categories), which is much larger than ShapeNet. As a result, a VGG trained on ImageNet can extract image features better than one trained only on ShapeNet. @JohnG0024
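As a rough illustration of "VGG as encoder", a minimal TF1 + TF-Slim sketch of building VGG-16 and grabbing an intermediate feature map; the chosen end point is illustrative, not necessarily the one DISN uses:

```python
import tensorflow as tf
import tensorflow.contrib.slim as slim
from tensorflow.contrib.slim.nets import vgg

images = tf.placeholder(tf.float32, [None, 224, 224, 3])
with slim.arg_scope(vgg.vgg_arg_scope()):
    _, end_points = vgg.vgg_16(images, is_training=False,
                               spatial_squeeze=False)
features = end_points["vgg_16/conv5/conv5_3"]  # 14x14x512 conv feature map
```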

JohnG0024 commented 4 years ago

@asurada404 That makes sense. So the vgg_16.ckpt is from the full ImageNet dataset, not the 1k-category subset used in the ImageNet Challenge?

asurada404 commented 4 years ago

You can find more details in this paper @JohnG0024

AlexsaseXie commented 3 years ago

Has anyone successfully reproduced the results?

I trained the network with ground-truth camera parameters. No modifications were made to the code.

```
nohup python -u train/train_sdf.py --gpu 0 --img_feat_twostream --restore_modelcnn ./models/CNN/pretrained_model/vgg_16.ckpt --log_dir checkpoint/{your training checkpoint dir} --category all --num_sample_points 2048 --batch_size 20 --learning_rate 0.0001 --cat_limit 36000 &> log/DISN_train_all.log &
```

The train/test split is 3D-R2N2. I trained for about 3 days, approximately 23 epochs. The SDF loss stopped dropping, so I assumed the network had converged, but I only got bad visuals on the test-set models.