chenhsuanlin / photometric-mesh-optim

Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction (CVPR 2019)
MIT License

Pretraining AtlasNet Fails in Chamfer Library #5

Closed: ghost closed this issue 4 years ago

ghost commented 5 years ago

Hi,

I followed the instructions to retrain AtlasNet with your new dataset (ShapeNet renderings + SUN360 backgrounds), but it seems to fail in the Chamfer library.

I set everything up as outlined in the README. My CUDA version is 9.0.

Here is the log.

=======================================================
main_pretrain.py (pretraining with AtlasNet reimplementation)

setting configurations...
H : 224
W : 224
aug_transl : None
avg_frame : False
batch_size : 32
batch_size_pmo : -1
category : 02691156
code : None
cpu : False
device : cuda:0
eval : False
from_epoch : 0
gpu : 0
group : 0
imagenet_enc : True
init_idx : 27
load : None
log_tb : False
log_visdom : False
lr_decay : 1.0
lr_pmo : 0.001
lr_pretrain : 0.0001
lr_step : 100
name : 02691156_pretrain_seed0
noise : None
num_meshgrid : 5
num_points : 100
num_points_all : 2500
num_prim : 25
num_workers : 8
pointcloud_path : data/customShapeNet
pretrained_dec : pretrained/ae_atlasnet_25.pth
rendering_path : data/rendering
scale : None
seed : 0
seq_path : data/sequences
sfm : False
size : 224x224
sphere : False
sphere_densify : 3
sun360_path : data/background
to_epoch : 500
to_it : 100
video : False
vis_port : 8097
vis_server : http://localhost

loading training data...
number of samples: 3235
loading test data...
number of samples: 809
building AtlasNet...
loading pretrained encoder...
loading pretrained decoder (pretrained/ae_atlasnet_25.pth)...
======= TRAINING START =======
error in nnd updateOutput: invalid device function
Traceback (most recent call last):
  File "main_pretrain.py", line 26, in <module>
    trainer.train_epoch(opt,ep)
  File "/task_runtime/photometric-mesh-optim/model_pretrain.py", line 71, in train_epoch
    loss = self.compute_loss(opt,var,ep=ep)
  File "/task_runtime/photometric-mesh-optim/model_pretrain.py", line 59, in compute_loss
    dist1,dist2 = atlasnet.ChamferDistance().apply(opt,var.points_GT,var.points_pred)
  File "/task_runtime/photometric-mesh-optim/atlasnet.py", line 211, in forward
    chamfer.nnd_forward_cuda(p1,p2,dist1,dist2,idx1,idx2)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/ffi/__init__.py", line 197, in safe_call
    result = torch._C._safe_call(*args, **kwargs)
torch.FatalError: aborting at /mnt/ilcompf6d1/user/chelin/adobe-scenemeshing/atlasnet-reimp/chamfer/src/my_lib_cuda.c:26
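For anyone hitting the same message: `invalid device function` usually indicates that the precompiled CUDA binary contains no code the current GPU can run, which would fit a prebuilt `chamfer.so` compiled on a different machine. Below is a minimal sketch of that compatibility rule. The helper names `binary_runs_on` and `report_device` are hypothetical and not part of this repository, and the list of architectures a given binary was built for has to be found separately (it is not reported in the log above).

```python
def binary_runs_on(device_cc, sass_archs=(), ptx_archs=()):
    """Check whether a CUDA binary can run on a GPU (hypothetical helper).

    device_cc:  (major, minor) compute capability of the GPU, e.g. (7, 0).
    sass_archs: capabilities with precompiled machine code (sm_XY targets).
    ptx_archs:  capabilities with embedded PTX (compute_XY targets).
    """
    dev_major, dev_minor = device_cc
    # Precompiled machine code only runs on the same major version,
    # with an equal or newer minor version.
    sass_ok = any(major == dev_major and minor <= dev_minor
                  for major, minor in sass_archs)
    # PTX can be JIT-compiled for any device that is at least as new.
    ptx_ok = any((major, minor) <= (dev_major, dev_minor)
                 for major, minor in ptx_archs)
    return sass_ok or ptx_ok


def report_device():
    """Return (CUDA version PyTorch was built with, GPU capability), or None."""
    try:
        import torch  # optional: only needed for querying the local GPU
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.version.cuda, torch.cuda.get_device_capability(0)


# Example: a binary built only for sm_61 cannot run on a (7, 0) device.
print(binary_runs_on((7, 0), sass_archs=[(6, 1)]))
print(report_device())
```

If the check fails for the local GPU, rebuilding the extension on the target machine is the likely fix.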

ghost commented 5 years ago

For now, I am able to get this running by building Chamfer from the AtlasNet repository and modifying the ChamferDistance class in atlasnet.py accordingly.
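A rebuilt extension can be sanity-checked against a slow brute-force reference on a few points. This is only a sketch: it assumes the extension returns squared nearest-neighbour distances and indices in both directions, which is what the `dist1, dist2, idx1, idx2` arguments in the traceback suggest but which the log does not confirm. `chamfer_reference` is a hypothetical helper written in plain Python, not the CUDA implementation from either repository.

```python
def chamfer_reference(p1, p2):
    """Brute-force nearest-neighbour search between two point sets.

    p1, p2: non-empty sequences of points (tuples of equal length).
    Returns (dist1, dist2, idx1, idx2), where dist1[i] is the squared
    distance from p1[i] to its nearest point in p2 and idx1[i] is that
    point's index; dist2 and idx2 are the same from p2 to p1.
    """
    def sq_dist(a, b):
        # Squared Euclidean distance (assumed convention, see lead-in).
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def nearest(src, dst):
        dists, idxs = [], []
        for a in src:
            # Index of the closest point in dst (first one on ties).
            j = min(range(len(dst)), key=lambda k: sq_dist(a, dst[k]))
            dists.append(sq_dist(a, dst[j]))
            idxs.append(j)
        return dists, idxs

    dist1, idx1 = nearest(p1, p2)
    dist2, idx2 = nearest(p2, p1)
    return dist1, dist2, idx1, idx2


# Tiny example with two small 3D point sets.
p1 = [(0, 0, 0), (1, 0, 0)]
p2 = [(0, 0, 0.5), (1, 0, 0), (5, 0, 0)]
print(chamfer_reference(p1, p2))
```

The cost is quadratic in the number of points, so this is only suitable for comparing a handful of points against the extension's output, not for training.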

It would be nice to have the source files needed to build chamfer.so, to avoid issues caused by machine/version dependencies.

chenhsuanlin commented 5 years ago

Thanks for reporting the issue. I'll leave this issue open for now.

chenhsuanlin commented 4 years ago

The source files are now included in the repo.