autonomousvision / occupancy_networks

This repository contains the code for the paper "Occupancy Networks - Learning 3D Reconstruction in Function Space"
https://avg.is.tuebingen.mpg.de/publications/occupancy-networks

How long do you train your model on ShapeNet? #78

Closed Wuziyi616 closed 4 years ago

Wuziyi616 commented 4 years ago

Hi! Thanks for your great work! I want to use the Occupancy Network on my own dataset (ModelNet40, which I think is very similar to ShapeNet), and the performance seems decent so far. However, there is still one question that concerns me, and I hope you can kindly reply.

The task I am performing is reconstructing meshes from point clouds. Although the validation IoU seems high now, the meshes it produces are not perceptually good (e.g. discontinuous surfaces, noisy outliers). So I guess I should train the network for a longer time for it to converge. I have now trained it for 200k iterations, and the validation IoU only improved by ~2% over the last 100k iterations.

Therefore, I just wonder how long you train your OccNet on the original ShapeNet dataset. I read your paper carefully, as well as the supplementary material, but couldn't find the answer. I hope you can answer my question!

Wuziyi616 commented 4 years ago

It seems from your provided pre-trained weights that you trained for 2,886,000 iterations? Wow, then my problem is definitely caused by a lack of training, I guess...
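For anyone who wants to check this themselves, here is a minimal sketch, assuming the released checkpoint is a plain PyTorch dict saved by the repo's `CheckpointIO` (i.e. the model state dict plus scalar counters such as `it` and `epoch_it`); the key names and the path below are assumptions, not verified against every release:

```python
# Minimal sketch: inspect a pretrained Occupancy Networks checkpoint.
# Assumption: the checkpoint is a dict containing the model state dict
# plus scalar bookkeeping entries such as 'it', 'epoch_it', 'loss_val_best';
# the path is only an example.
import torch

ckpt = torch.load("out/img/onet_pretrained/model_best.pt", map_location="cpu")

# Print only the scalar entries (skip the nested state dicts).
for key, value in ckpt.items():
    if not isinstance(value, dict):
        print(f"{key}: {value}")
```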

b7leung commented 4 years ago

Wow, is that really true? My current rate on a GTX 1080 is 2750 iterations per hour (~6 epochs per hour) using configs/img/onet.yaml on ShapeNet, which means 2,886,000 iterations would take about 43 days of training. Does anyone have tips on speeding this up? Is 2750 iterations per hour considered normal, or is it slow?
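As a quick sanity check on that estimate, a back-of-the-envelope calculation using only the numbers quoted in this thread:

```python
# Back-of-the-envelope training-time estimate from the rates quoted above.
total_iterations = 2_886_000   # iteration count of the released checkpoint
iters_per_hour = 2750          # measured rate on a GTX 1080 (see above)

hours = total_iterations / iters_per_hour
print(f"{hours:.0f} hours ≈ {hours / 24:.1f} days of training")
# -> roughly 1049 hours ≈ 43.7 days, matching the estimate above
```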

Wuziyi616 commented 4 years ago

2750 iter/h is a bit slow, I guess. I run on an RTX 2080 Ti and get roughly 10k iter/h. I disable visualization and set `validate_every` to 10000 iterations to speed up training. But it still needs a long time to converge, I guess... The loss fluctuates heavily, and the IoU on the validation set increases really, really slowly (~1% per 100k iterations). Indeed, I think it needs several days (or even weeks) to converge.
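A minimal sketch of that kind of speed-up, assuming PyYAML and the `training` keys shown in the config later in this thread (`validate_every`, `visualize_every`); the output filename is just an example:

```python
# Minimal sketch: derive a faster-iterating config from configs/img/onet.yaml
# by validating less often and effectively disabling visualization.
# Assumes the key layout shown in the config posted below in this thread.
import yaml

with open("configs/img/onet.yaml") as f:
    cfg = yaml.safe_load(f)

training = cfg.setdefault("training", {})
training["validate_every"] = 10000   # validate every 10k iterations
training["visualize_every"] = 10**9  # push visualization effectively out of reach

with open("configs/img/onet_fast.yaml", "w") as f:  # hypothetical output path
    yaml.safe_dump(cfg, f)
```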

aidendef commented 3 years ago

I am trying to train the onet model from scratch on the ShapeNet dataset.

```yaml
method: onet
data:
  path: data/ShapeNet
  img_folder: img_choy2016
  img_size: 224
  points_subsample: 2048
model:
  encoder_latent: null
  decoder: cbatchnorm
  encoder: resnet18
  c_dim: 256
  z_dim: 0
training:
  out_dir: out/img/onet
  batch_size: 64
  model_selection_metric: iou
  model_selection_mode: maximize
  visualize_every: 20000
  validate_every: 20000
test:
  threshold: 0.2
  eval_mesh: true
  eval_pointcloud: false
generation:
  batch_size: 100000
  refine: false
  n_x: 128
  n_z: 1
  resolution_0: 32
  upsampling_steps: 2
```

Given the above configuration, for how many epochs does training have to run to reach a level similar to that presented in the paper? Is it correct that the previously provided config (onet.yaml) gives the best performance?
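To translate the iteration counts discussed above into epochs for a config like this, here is a rough, non-authoritative estimate; the iterations-per-epoch figure is inferred from the rates quoted earlier in this thread (~2750 it/h ≈ 6 epochs/h at batch size 64) and will differ with your actual train-split size:

```python
# Rough epoch estimate from the numbers in this thread (not authoritative).
# ~2750 it/h at ~6 epochs/h implies roughly 2750 / 6 ≈ 460 iterations per epoch.
iters_per_epoch = 2750 / 6          # inferred; depends on train-set size and batch size
target_iterations = 2_886_000       # iteration count of the released checkpoint

print(f"~{target_iterations / iters_per_epoch:.0f} epochs "
      f"to match the pretrained model's training length")
# -> on the order of several thousand epochs
```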