Closed matveymor closed 3 years ago
Thanks.
Can you please tell us how you calculated depth maps in your work?
I used pyrender to render the depthmaps given the 3D mesh.
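For readers who want to reproduce this step, here is a minimal sketch of rendering a depth map from a mesh with pyrender. It is not the authors' script: the function names, the camera-to-world pose input, and the OpenCV/COLMAP-style axis convention of that pose are assumptions made for illustration. pyrender follows the OpenGL camera convention, so a pose in the OpenCV convention needs its Y and Z axes flipped first.

```python
def opencv_to_opengl_pose(world_from_cam):
    """Flip the camera's Y and Z axes (columns 1 and 2 of a 4x4 pose).

    pyrender uses the OpenGL convention (camera looks along -Z, +Y is up),
    while OpenCV/COLMAP-style poses look along +Z with +Y pointing down.
    The input is assumed to be a camera-to-world matrix given as nested lists.
    """
    return [[row[0], -row[1], -row[2], row[3]] for row in world_from_cam]


def render_depth(mesh_path, world_from_cam, fx, fy, cx, cy, width, height):
    """Render one depth map of the mesh as seen from the given camera."""
    # Third-party imports are local so the helper above works without them.
    import numpy as np
    import pyrender
    import trimesh

    mesh = pyrender.Mesh.from_trimesh(trimesh.load(mesh_path, force="mesh"))
    scene = pyrender.Scene()
    scene.add(mesh)

    camera = pyrender.IntrinsicsCamera(fx=fx, fy=fy, cx=cx, cy=cy)
    pose = np.array(opencv_to_opengl_pose(world_from_cam), dtype=float)
    scene.add(camera, pose=pose)

    renderer = pyrender.OffscreenRenderer(viewport_width=width, viewport_height=height)
    try:
        # With DEPTH_ONLY the renderer returns just the depth buffer.
        return renderer.render(scene, flags=pyrender.RenderFlags.DEPTH_ONLY)
    finally:
        renderer.delete()
```

`IntrinsicsCamera` also accepts `znear` and `zfar`; the defaults may clip large scenes, so they probably need to be set to match the scale of the reconstruction.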
When I am running the training process on my own data, this error is raised: ...
What GPU are you using? The error is raised in https://github.com/intel-isl/StableViewSynthesis/blob/main/ext/mytorch/include/common_cuda.h#L169-L171 and it could be that the default kernel launch parameters are not suitable for your GPU. If that is the problem, you could try changing the value at https://github.com/intel-isl/StableViewSynthesis/blob/main/ext/mytorch/include/common_cuda.h#L109 to a smaller number.
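To illustrate what "kernel parameters" means here, the sketch below reproduces the usual 1D launch arithmetic (number of blocks = ceil(n / threads per block)) and the conditions under which CUDA rejects a launch with "invalid configuration argument". The function name and constants are written for illustration and are not taken from common_cuda.h; the limits are the ones documented for GPUs of compute capability 3.0 and newer.

```python
# Documented limits for compute capability 3.0 and newer.
MAX_THREADS_PER_BLOCK = 1024
MAX_GRID_DIM_X = 2**31 - 1


def launch_config(n_elements, threads_per_block):
    """Return (blocks, threads) for a 1D launch, or raise if CUDA would reject it."""
    if not 1 <= threads_per_block <= MAX_THREADS_PER_BLOCK:
        raise ValueError(f"threads per block out of range: {threads_per_block}")

    # Integer ceiling division: enough blocks to cover every element.
    blocks = (n_elements + threads_per_block - 1) // threads_per_block

    # A grid with zero blocks (empty input) is also an invalid configuration.
    if not 1 <= blocks <= MAX_GRID_DIM_X:
        raise ValueError(f"grid size out of range: {blocks}")
    return blocks, threads_per_block


if __name__ == "__main__":
    print(launch_config(576 * 992, 1024))
```

Note that an empty input (zero elements) produces a zero-sized grid, which fails the same way as an oversized block, so it can be worth checking tensor sizes as well as the thread count.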
Hey @griegler, I tried setting CUDA_NUM_THREADS to a smaller value and also tried changing the nvcc flags in setup.py. Neither helped. It would be great if you could suggest another fix for this issue.
@KaLiMaLi555 do you have more information, e.g., an error log? Can you also post the command that you executed?
I ran the command provided in the README:
python exp.py --net resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16 --cmd eval --iter last --eval-dsets tat-subseq
Library versions:
torch==1.6.0
torch-geometric==1.7.1
torch-scatter==2.0.5
torch-sparse==0.6.8
torchvision==0.7.0
I wasn't able to run the code with some of the library versions given in the README. The versions listed above seemed to work for me.
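When reporting a version mismatch like this, the installed versions can be collected programmatically. The following is a small sketch using only the standard library; the helper name is hypothetical and the package list simply mirrors the one above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    names = ["torch", "torch-geometric", "torch-scatter", "torch-sparse", "torchvision"]
    for name, version in installed_versions(names).items():
        print(f"{name}=={version}")
```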
Error log:
[2021-06-25/05:51/INFO/mytorch] Set seed to 42
[2021-06-25/05:51/INFO/mytorch] ================================================================================
[2021-06-25/05:51/INFO/mytorch] Start cmd "eval": tat-wo-val_bs1_nbs3_rpointdir_s0.25_resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16_vgg
[2021-06-25/05:51/INFO/mytorch] 2021-06-25 05:51:01
[2021-06-25/05:51/INFO/mytorch] host: ip-172-31-44-59
[2021-06-25/05:51/INFO/mytorch] --------------------------------------------------------------------------------
[2021-06-25/05:51/INFO/mytorch] worker env:
experiments_root: experiments
experiment_name: tat-wo-val_bs1_nbs3_rpointdir_s0.25_resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16_vgg
n_train_iters: -65536
seed: 42
train_batch_size: 1
train_batch_acc_steps: 1
eval_batch_size: 1
num_workers: 6
save_frequency: <co.mytorch.Frequency object at 0x7fd6472f4a50>
eval_frequency: <co.mytorch.Frequency object at 0x7fd64f6a8910>
train_device: cuda:0
eval_device: cuda:0
clip_gradient_value: None
clip_gradient_norm: None
empty_cache_per_batch: False
log_debug: []
train_iter_messages: []
stopwatch:
train_dsets: ['tat-wo-val']
eval_dsets: ['tat-subseq']
train_n_nbs: 3
train_src_mode: image
train_nbs_mode: argmax
train_scale: 0.25
eval_scale: 0.5
invalid_depth: 1000000000.0
point_aux_data: ['dirs']
point_edges_mode: penone
eval_n_max_sources: 5
train_rank_mode: pointdir
eval_rank_mode: pointdir
train_loss: VGGPerceptualLoss(
(vgg): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace=True)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(6): ReLU(inplace=True)
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): ReLU(inplace=True)
(9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(13): ReLU(inplace=True)
(14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): ReLU(inplace=True)
(16): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(17): ReLU(inplace=True)
(18): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(19): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(20): ReLU(inplace=True)
(21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(22): ReLU(inplace=True)
(23): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(24): ReLU(inplace=True)
(25): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(26): ReLU(inplace=True)
(27): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(29): ReLU(inplace=True)
(30): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(31): ReLU(inplace=True)
(32): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(33): ReLU(inplace=True)
(34): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(35): ReLU(inplace=True)
(36): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
)
eval_loss: L1Loss()
exp_out_root: experiments/tat-wo-val_bs1_nbs3_rpointdir_s0.25_resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16_vgg
db_path: experiments/tat-wo-val_bs1_nbs3_rpointdir_s0.25_resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16_vgg/exp.ip-172-31-44-59.db
db_logger: <co.sqlite.Logger object at 0x7fd640347590>
[2021-06-25/05:51/INFO/mytorch] ================================================================================
[2021-06-25/05:51/INFO/exp] Create eval datasets
[2021-06-25/05:51/INFO/exp] create dataset for tat_subseq_training_Truck
[2021-06-25/05:51/INFO/dataset] #tgt_im_paths=25, #tgt_counts=(25, 226), tgt_im=(3, 576, 992), tgt_dm=(576, 992), train=False
[2021-06-25/05:51/INFO/exp] create dataset for tat_subseq_intermediate_M60
[2021-06-25/05:51/INFO/dataset] #tgt_im_paths=36, #tgt_counts=(36, 277), tgt_im=(3, 576, 1088), tgt_dm=(576, 1088), train=False
[2021-06-25/05:51/INFO/exp] create dataset for tat_subseq_intermediate_Playground
[2021-06-25/05:51/INFO/dataset] #tgt_im_paths=32, #tgt_counts=(32, 275), tgt_im=(3, 576, 1024), tgt_dm=(576, 1024), train=False
[2021-06-25/05:51/INFO/exp] create dataset for tat_subseq_intermediate_Train
[2021-06-25/05:51/INFO/dataset] #tgt_im_paths=43, #tgt_counts=(43, 258), tgt_im=(3, 576, 992), tgt_dm=(576, 992), train=False
[2021-06-25/05:51/INFO/modules] [NET][EncNet] resunet3.16
[2021-06-25/05:51/INFO/modules] [NET][RefNet] point_edges_mode=penone
[2021-06-25/05:51/INFO/modules] [NET][RefNet] point_aux_data=dirs
[2021-06-25/05:51/INFO/modules] [NET][RefNet] point_avg_mode=avg
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Seq 9 nets, nets_residual=True
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Single gnn
[2021-06-25/05:51/INFO/modules] [NET][RefNet] MLPDir(in_channels=16, hidden_channels=64, n_mods=3, out_channels=16, aggr=mean)
[2021-06-25/05:51/INFO/modules] [NET][RefNet] out_conv(16, 3)
[2021-06-25/05:51/INFO/mytorch] [EVAL] loading net for iter last: experiments/tat-wo-val_bs1_nbs3_rpointdir_s0.25_resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16_vgg/net_0000000000000000.params
[2021-06-25/05:51/INFO/mytorch]
[2021-06-25/05:51/INFO/mytorch] ================================================================================
[2021-06-25/05:51/INFO/mytorch] Evaluating set tat_subseq_training_Truck
[2021-06-25/05:51/INFO/exp] --------------------------------------------------------------------------------
[2021-06-25/05:51/INFO/mytorch] 2021-06-25 05:51:04
[2021-06-25/05:51/INFO/exp] Eval iter 0
[2021-06-25/05:51/INFO/exp] preprocess all source images
[2021-06-25/05:51/INFO/exp] feat tmp dir: experiments/tmp_srcfeat_tat-wo-val_bs1_nbs3_rpointdir_s0.25_resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16_vgg_tat_subseq_training_Truck
[2021-06-25/05:51/INFO/exp] create target images
invalid device function in /home/ubuntu/PreImage/StableViewSynthesis/ext/mytorch/include/common_cuda.h at 171
[1] 31933 segmentation fault (core dumped) python exp.py --net --cmd eval --iter last --eval-dsets tat-subseq
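An "invalid device function" error generally means that the requested kernel does not exist in the compiled binary or was not compiled for the GPU's architecture, which is a different cause from a bad launch configuration. A minimal sketch, assuming the GPU's compute capability is known (for example from `torch.cuda.get_device_capability()`, which returns a `(major, minor)` tuple), builds the matching nvcc flag. The helper name is hypothetical, and whether the extension's setup.py already passes such a flag is not confirmed here.

```python
def gencode_flag(capability):
    """Build the nvcc -gencode flag for a (major, minor) compute capability."""
    major, minor = capability
    arch = f"{major}{minor}"
    return f"-gencode=arch=compute_{arch},code=sm_{arch}"


if __name__ == "__main__":
    # Example for a GPU with compute capability 7.5.
    print(gencode_flag((7, 5)))
```

After changing the architecture flags, the extension has to be rebuilt from a clean state, otherwise the old binaries may still be loaded.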
@MatveyMor
Did you solve the customized dataset issue?
I am facing the same problem.
There is no script for generating delaunay_photometric.ply in create_data_own.py.
Thank you very much for publishing "Stable View Synthesis"; it seems to be a significant photorealistic approach for novel view synthesis! Could you please add to your GitHub page https://github.com/intel-isl/StableViewSynthesis detailed instructions on how to build a customized dataset?
Besides, I am interested in the following questions:
invalid configuration argument in /notebook/SVS/StableViewSynthesis/ext/mytorch/include/common_cuda.h at 171
What might be the reason for this? Thank you in advance!