Update: this bug was solved by installing the correct versions of CUDA and torch. The project must be run with exactly the versions stated by the author; my previous versions were too old.
When I entered the command, the message from the terminal was as follows:
Namespace(L1_weight=0.01, N_voxel_final=262144000, N_voxel_init=262144, TV_weight_app=0.0, TV_weight_density=0.0, add_frames_every=100, alpha_mask_thre=0.0001, batch_size=4096, ckpt=None, config=None, data_dim_color=27, datadir='/data/liuchen/localrf/data/katwijk', density_shift=-5, device='cuda:0', distance_scale=25, downsampling=-1, fea2denseAct='softplus', fea_pe=0, featureC=128, fov=66.0, frame_step=1, logdir='/data/liuchen/localrf/log/katwijk', loss_depth_weight_inital=0.1, loss_flow_weight_inital=1, lr_R_init=0.005, lr_basis=0.001, lr_decay_target_ratio=0.1, lr_exposure_init=0.001, lr_i_init=0, lr_init=0.02, lr_t_init=0.0005, lr_upsample_reset=1, max_drift=1, model_name='TensorVMSplit', nSamples=1000000.0, n_init_frames=5, n_iters_per_frame=600, n_iters_reg=100, n_lamb_sh=[24, 24, 24], n_lamb_sigma=[8, 8, 8], n_max_frames=100, n_overlap=30, pos_pe=0, prog_speedup_factor=1.0, progress_refresh_rate=200, refinement_speedup_factor=1.0, render_from_file='', render_only=0, render_path=1, render_test=1, rm_weight_mask_thre=0.001, shadingMode='MLP_Fea_late_view', skip_TB_images=False, skip_saving_video=False, step_ratio=0.5, subsequence=[0, -1], test_frame_every=10, update_AlphaMask_list=[100, 200, 300], upsamp_list=[100, 150, 200, 250, 300], view_pe=0, vis_every=10000, with_preprocessed_poses=0)
lc_min: tensor([-2., -2., -2.], device='cuda:0')
lc_max: tensor([2., 2., 2.], device='cuda:0')
n_novels: 262144
xyz_max: tensor([2., 2., 2.], device='cuda:0')
xyz_min: tensor([-2., -2., -2.], device='cuda:0')
n_voxels: 262144
Traceback (most recent call last):
  File "localTensoRF/train.py", line 661, in <module>
    reconstruction(args)
  File "localTensoRF/train.py", line 265, in reconstruction
    reso_cur = N_to_reso(args.N_voxel_init, aabb)
  File "/data/liuchen/localrf/localTensoRF/utils/utils.py", line 205, in N_to_reso
    voxel_size = ((xyz_max - xyz_min).prod() / n_voxels).pow(1 / 3)
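For reference, the arithmetic on the crashing line is well-defined; the failure is not in the math but in the CUDA kernel that `.prod()` JIT-compiles on the GPU. A CPU-only sketch using the values printed in the log above (xyz_min = [-2, -2, -2], xyz_max = [2, 2, 2], n_voxels = 262144) confirms the result is an ordinary finite number:

```python
# Reproduce voxel_size = ((xyz_max - xyz_min).prod() / n_voxels).pow(1/3)
# on the CPU, with the tensor values taken from the log above.
side_lengths = [2.0 - (-2.0)] * 3            # xyz_max - xyz_min, per axis

volume = 1.0
for s in side_lengths:                       # (xyz_max - xyz_min).prod()
    volume *= s                              # -> 64.0

n_voxels = 262144
voxel_size = (volume / n_voxels) ** (1 / 3)  # -> 0.0625
print(voxel_size)
```

Since the same expression succeeds on the CPU, the error must come from compiling the GPU reduction kernel, not from the inputs.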
This traceback is followed by a very lengthy RuntimeError message in the terminal, whose last part reads:
extern "C" __launch_bounds__(512, 4)
__global__ void reduction_prod_kernel(ReduceJitOp r){
  r.run();
}
nvrtc: error: invalid value for --gpu-architecture (-arch)
I printed the values of xyz_max and xyz_min to see whether they were the problem, but they are perfectly computable, so I cannot find what is wrong. Can you explain what is happening here, please? BTW, my environment is PyTorch 1.12.0 + CUDA 11.3, and my GPU is an NVIDIA RTX 4090. Any advice would be appreciated!
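For anyone hitting the same error: the RTX 4090 is an Ada Lovelace GPU with compute capability 8.9 (sm_89), while the nvrtc shipped with CUDA 11.3 only accepts architectures up to sm_86 (Ampere); support for sm_89 arrived in CUDA 11.8. So when PyTorch's JIT asks nvrtc to compile the reduction kernel for this GPU, nvrtc rejects the --gpu-architecture flag. A minimal sketch of that compatibility check (the `MAX_SM` table is abridged from NVIDIA's release notes; `jit_will_compile` is an illustrative helper, not a real PyTorch or CUDA API):

```python
# Highest compute capability each CUDA toolkit's nvrtc accepts (abridged).
MAX_SM = {
    "11.3": (8, 6),  # up to Ampere (sm_86)
    "11.8": (9, 0),  # adds Ada (sm_89) and Hopper (sm_90)
}

def jit_will_compile(cuda_version: str, device_capability: tuple) -> bool:
    """Return True if nvrtc from `cuda_version` accepts the device arch."""
    return device_capability <= MAX_SM[cuda_version]

rtx_4090 = (8, 9)  # sm_89, Ada Lovelace
print(jit_will_compile("11.3", rtx_4090))  # False: nvrtc rejects -arch for sm_89
print(jit_will_compile("11.8", rtx_4090))  # True: CUDA 11.8 supports Ada
```

On a real system you can compare `torch.cuda.get_device_capability()` against `torch.cuda.get_arch_list()` to see which architectures your PyTorch build was compiled for.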