Closed Jing1Ling closed 5 months ago
I ran a simple test of the difference between 'cam_dirs' and 'directions' on the garden scene. For each variant I trained two models with nerfstudio, and the PSNR difference on the validation set was within 0.1.
Running
ns-train zipnerf --data /SSD_DISK/datasets/360_v2/bicycle/
leads to AssertionError: Colmap path /SSD_DISK/datasets/360_v2/bicycle/colmap/sparse/0 does not exist. Maybe an additional argument needs to be added for that?
Hi, when I start training, I get an index-out-of-bounds error, which comes from here. I found that ray_indices[:,1].max()==image_height, which is wrong: it should be at most image_height-1, and the same goes for the width. I'm not sure how to fix this.
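One way to confirm this diagnosis is to compare the sampled pixel indices with the image size reported by the camera metadata. The sketch below uses plain Python tuples in place of tensors. `find_out_of_bounds` and the sample values are hypothetical, and the (image, row, column) layout is assumed from the `ray_indices[:,1]` row access mentioned above.

```python
def find_out_of_bounds(ray_indices, height, width):
    """Return the (image, row, col) triples that fall outside a height x width image."""
    return [
        (i, y, x)
        for i, y, x in ray_indices
        if not (0 <= y < height and 0 <= x < width)
    ]


# Hypothetical case: pixels were sampled from an 822 x 1237 image on disk,
# while the camera metadata claims 821 x 1236 (sizes rounded down).
sampled = [(0, 10, 20), (0, 821, 5), (1, 400, 1236)]
bad = find_out_of_bounds(sampled, height=821, width=1236)
print(bad)
```

If the list is empty when the on-disk size is passed instead, the mismatch is between the metadata and the images rather than in the sampler.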
I mentioned this situation in README.md:
*Nerfstudio's ColmapDataParser rounds down the image size when downscaling, which differs from the 360_v2 dataset. You can use nerfstudio to reprocess the data, or modify the downscale logic in the library as discussed in https://github.com/nerfstudio-project/nerfstudio/issues/1438.
Fastest solution: change the two lines here to:
self.height = torch.floor(0.5 + (self.height * scaling_factor)).to(torch.int64)
self.width = torch.floor(0.5 + (self.width * scaling_factor)).to(torch.int64)
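To see why the two lines above add 0.5 before taking the floor, here is a minimal sketch in plain Python (standing in for the torch expressions). The full-resolution size and the helper names are illustrative assumptions, not values taken from this thread.

```python
import math


def downscale_floor(size, factor):
    # Rounds down, which is the behaviour the README note describes.
    return math.floor(size / factor)


def downscale_round_half_up(size, factor):
    # Same arithmetic as floor(0.5 + size * scaling_factor) in the fix above.
    return math.floor(0.5 + size / factor)


width, height, factor = 4946, 3286, 4  # illustrative full-resolution size
print(downscale_floor(width, factor), downscale_floor(height, factor))
print(downscale_round_half_up(width, factor), downscale_round_half_up(height, factor))
# Python's built-in round() sends ties to the nearest even integer.
print(round(width / factor), round(height / factor))
```

Both quotients here end in .5, so rounding down loses one pixel in each dimension. A tie-to-even round (Python's `round`, and also `torch.round`) would still give 1236 for the width, which is why the explicit floor(0.5 + x) form is the safer way to round half up.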
Excuse me, I followed the method above, but I ran into a problem I cannot solve when executing
ns-train zipnerf --data bicycle colmap --colmap-path sparse/0
RuntimeError: CUDA error: device-side assert triggered
Is there a good way to solve it?
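A device-side assert is reported asynchronously, so the Python traceback often points at an unrelated line. A common debugging step (general CUDA/PyTorch practice, not something specific to this repository) is to force synchronous kernel launches before any CUDA work starts. A minimal sketch, assuming it is placed at the very top of the script that launches training:

```python
import os

# Must be set before CUDA is initialised, i.e. before the first CUDA call.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

The variable can equally be set in the shell before calling ns-train. With synchronous launches the traceback should stop at the operation that actually failed, which for this error is frequently an out-of-range index.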
Hi @Pioneer6gun9! Someone reminded me that the rounding strategy of mipnerf360 is not ceil but round. I've updated the code above. By the way, I've submitted a pull request to nerfstudio for this issue. I'm not sure whether this is the cause of your error, so feel free to contact me if you still have questions.
Thank you for your reply. I will try again; the problem may be on my side.
------------------ Original message ------------------ From: "Ling @.>; Sent: Monday, April 8, 2024, 4:53 PM; To: @.>; Cc: @.>; @.>; Subject: Re: [SuLvXiangXin/zipnerf-pytorch] add support for nerfstudio (PR #98)
I modified the code here, which resolved the problem:
dataparser=ColmapDataParserConfig(downscale_factor=4, orientation_method="up", center_method="poses", colmap_path="sparse/0"),
to
dataparser=ColmapDataParserConfig(downscale_factor=4, orientation_method="up", center_method="poses", colmap_path="sparse/0", downscale_rounding_mode="round"),
Hi @unanan, you're right. Now that the PR submitted to nerfstudio about the rounding mode has been merged, I will submit a PR to update this repo's README later.
You can also specify the rounding mode on the command line when launching training:
ns-train zipnerf --data path/to/data colmap --downscale_rounding_mode round
@Jing1Ling Hello mate, do you have a solution for this issue?
NameError: name 'segment_coo' is not defined
The full error output is:
(nerfstudio) E:\zipnerf-pytorch>ns-train zipnerf --data ./data/flowers colmap --colmap-path sparse/0
E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\tyro\_fields.py:343: UserWarning: The field colmap_path is annotated with type <class 'pathlib.Path'>, but the default value sparse/0 has type <class 'str'>. We'll try to handle this gracefully, but it may cause unexpected behavior.
  warnings.warn(
[03:08:49] Using --data alias for --data.pipeline.datamanager.data    train.py:230
──────────────────────────────── Config ────────────────────────────────
TrainerConfig(
    _target=<class 'nerfstudio.engine.trainer.Trainer'>,
    output_dir=WindowsPath('outputs'),
    method_name='zipnerf',
    experiment_name=None,
    project_name='nerfstudio-project',
    timestamp='2024-05-28_030849',
    machine=MachineConfig(seed=42, num_devices=1, num_machines=1, machine_rank=0, dist_url='auto', device_type='cuda'),
    logging=LoggingConfig(
        relative_log_dir=WindowsPath('.'),
        steps_per_log=10,
        max_buffer_size=20,
        local_writer=LocalWriterConfig(
            _target=<class 'nerfstudio.utils.writer.LocalWriter'>,
            enable=True,
            stats_to_track=(
                <EventName.ITER_TRAIN_TIME: 'Train Iter (time)'>,
                <EventName.TRAIN_RAYS_PER_SEC: 'Train Rays / Sec'>,
                <EventName.CURR_TEST_PSNR: 'Test PSNR'>,
                <EventName.VIS_RAYS_PER_SEC: 'Vis Rays / Sec'>,
                <EventName.TEST_RAYS_PER_SEC: 'Test Rays / Sec'>,
                <EventName.ETA: 'ETA (time)'>
            ),
            max_log_size=10
        ),
        profiler='basic'
    ),
    viewer=ViewerConfig(
        relative_log_filename='viewer_log_filename.txt',
        websocket_port=None,
        websocket_port_default=7007,
        websocket_host='0.0.0.0',
        num_rays_per_chunk=32768,
        max_num_display_images=512,
        quit_on_train_completion=False,
        image_format='jpeg',
        jpeg_quality=75,
        make_share_url=False,
        camera_frustum_scale=0.1,
        default_composite_depth=True
    ),
    pipeline=ZipNerfPipelineConfig(
        _target=<class 'zipnerf_ns.zipnerf_pipeline.ZipNerfPipeline'>,
        datamanager=ZipNerfDataManagerConfig(
            _target=<class 'zipnerf_ns.zipnerf_datamanager.ZipNerfDataManager'>,
            data=WindowsPath('data/flowers'),
            masks_on_gpu=False,
            images_on_gpu=False,
            dataparser=ColmapDataParserConfig(
                _target=<class 'nerfstudio.data.dataparsers.colmap_dataparser.ColmapDataParser'>,
                data=WindowsPath('.'),
                scale_factor=1.0,
                downscale_factor=4,
                downscale_rounding_mode='round',
                scene_scale=1.0,
                orientation_method='up',
                center_method='poses',
                auto_scale_poses=True,
                assume_colmap_world_coordinate_convention=True,
                eval_mode='interval',
                train_split_fraction=0.9,
                eval_interval=8,
                depth_unit_scale_factor=0.001,
                images_path=WindowsPath('images'),
                masks_path=None,
                depths_path=None,
                colmap_path=WindowsPath('sparse/0'),
                load_3D_points=True,
                max_2D_matches_per_3D_point=0
            ),
            train_num_rays_per_batch=8192,
            train_num_images_to_sample_from=-1,
            train_num_times_to_repeat_images=-1,
            eval_num_rays_per_batch=8192,
            eval_num_images_to_sample_from=-1,
            eval_num_times_to_repeat_images=-1,
            eval_image_indices=(0,),
            collate_fn=<function nerfstudio_collate at 0x000001E2B32A3C10>,
            camera_res_scale_factor=1.0,
            patch_size=1,
            camera_optimizer=None,
            pixel_sampler=PixelSamplerConfig(
                _target=<class 'nerfstudio.data.pixel_samplers.PixelSampler'>,
                num_rays_per_batch=4096,
                keep_full_image=False,
                is_equirectangular=False,
                ignore_mask=False,
                fisheye_crop_radius=None,
                rejection_sample_mask=True,
                max_num_iterations=100
            )
        ),
        model=ZipNerfModelConfig(
            _target=<class 'zipnerf_ns.zipnerf_model.ZipNerfModel'>,
            enable_collider=True,
            collider_params={'near_plane': 2.0, 'far_plane': 6.0},
            loss_coefficients={'rgb_loss_coarse': 1.0, 'rgb_loss_fine': 1.0},
            eval_num_rays_per_chunk=32768,
            prompt=None,
            gin_file=['configs/360.gin'],
            compute_extras=True,
            proposal_weights_anneal_max_num_iters=1000,
            rand=True,
            zero_glo=False
        )
    ),
    optimizers={
        'model': {
            'optimizer': AdamOptimizerConfig(
                _target=<class 'torch.optim.adam.Adam'>,
                lr=0.008,
                eps=1e-15,
                max_norm=None,
                weight_decay=0
            ),
            'scheduler': ExponentialDecaySchedulerConfig(
                _target=<class 'nerfstudio.engine.schedulers.ExponentialDecayScheduler'>,
                lr_pre_warmup=1e-08,
                lr_final=0.001,
                warmup_steps=1000,
                max_steps=25000,
                ramp='cosine'
            )
        }
    },
    vis='viewer',
    data=WindowsPath('data/flowers'),
    prompt=None,
    relative_model_dir=WindowsPath('nerfstudio_models'),
    load_scheduler=True,
    steps_per_save=5000,
    steps_per_eval_batch=1000,
    steps_per_eval_image=5000,
    steps_per_eval_all_images=25000,
    max_num_iterations=25000,
    mixed_precision=True,
    use_grad_scaler=False,
    save_only_latest_checkpoint=True,
    load_dir=None,
    load_step=None,
    load_config=None,
    load_checkpoint=None,
    log_gradients=False,
    gradient_accumulation_steps={}
)
────────────────────────────────────────────────────────────────────────
Saving config to: outputs\flowers\zipnerf\2024-05-28_030849\config.yml    experiment_config.py:136
Saving checkpoints to: outputs\flowers\zipnerf\2024-05-28_030849\nerfstudio_models    trainer.py:137
Setting up training dataset...
Caching all 151 images.
Setting up evaluation dataset...
Caching all 22 images.
E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\torchmetrics\utilities\prints.py:62: FutureWarning: Importing
PeakSignalNoiseRatio from torchmetrics was deprecated and will be removed in 2.0. Import PeakSignalNoiseRatio from torchmetrics.image instead.
  _future_warning(
╭─────────────── viser ───────────────╮
│             ╷                       │
│   HTTP      │ http://0.0.0.0:7007   │
│   Websocket │ ws://0.0.0.0:7007     │
│             ╵                       │
╰─────────────────────────────────────╯
[NOTE] Not running eval iterations since only viewer is enabled. Use --vis {wandb, tensorboard, viewer+wandb, viewer+tensorboard} to run with eval.
No Nerfstudio checkpoint to load, so training from scratch.
Disabled comet/tensorboard/wandb event writers
Printing profiling stats, from longest to shortest duration in seconds
VanillaPipeline.get_train_loss_dict: 0.2297
Trainer.train_iteration: 0.2297
Traceback (most recent call last):
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "E:\Programming\Anaconda\envs\nerfstudio\Scripts\ns-train.exe\__main__.py", line 7, in <module>
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\nerfstudio\scripts\train.py", line 262, in entrypoint
    main(
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\nerfstudio\scripts\train.py", line 247, in main
    launch(
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\nerfstudio\scripts\train.py", line 189, in launch
    main_func(local_rank=0, world_size=world_size, config=config)
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\nerfstudio\scripts\train.py", line 100, in train_loop
    trainer.train()
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\nerfstudio\engine\trainer.py", line 261, in train
    loss, loss_dict, metrics_dict = self.train_iteration(step)
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\nerfstudio\utils\profiler.py", line 112, in inner
    out = func(*args, **kwargs)
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\nerfstudio\engine\trainer.py", line 496, in train_iteration
    _, loss_dict, metrics_dict = self.pipeline.get_train_loss_dict(step=step)
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\nerfstudio\utils\profiler.py", line 112, in inner
    out = func(*args, **kwargs)
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\nerfstudio\pipelines\base_pipeline.py", line 301, in get_train_loss_dict
    model_outputs = self._model(ray_bundle)  # train distributed data parallel model if world_size > 1
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\nerfstudio\models\base_model.py", line 143, in forward
    return self.get_outputs(ray_bundle)
  File "E:\zipnerf-pytorch\zipnerf_ns\zipnerf_model.py", line 94, in get_outputs
    renderings, ray_history = self.zipnerf(
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\Programming\Anaconda\envs\nerfstudio\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\zipnerf-pytorch\internal\models.py", line 307, in forward
    loss_hash_decay = segment_coo(param ** 2,
NameError: name 'segment_coo' is not defined
Hi @s1eeveW! 'segment_coo' is a function in the pytorch_scatter package. You can install pytorch_scatter in your Python environment. Alternatively, you can simply comment out these lines and use this line to calculate 'loss_hash_decay'. The two differ only slightly, and I don't think the replacement will have much effect.
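For intuition about how small the difference is, here is a plain-Python sketch of a per-segment mean (the kind of reduction segment_coo performs with reduce='mean') next to a plain mean over all entries. The helper `segment_mean` and the numbers are hypothetical, and the sketch assumes the replacement line is a global mean, which this thread does not spell out.

```python
def segment_mean(values, segment_ids):
    """Mean of `values` within each segment; ids are assumed sorted, as segment_coo expects."""
    sums, counts = {}, {}
    for v, s in zip(values, segment_ids):
        sums[s] = sums.get(s, 0.0) + v
        counts[s] = counts.get(s, 0) + 1
    return [sums[s] / counts[s] for s in sorted(sums)]


# Hypothetical squared parameters: level 0 has 2 entries, level 1 has 4.
sq_params = [4.0, 2.0, 1.0, 1.0, 1.0, 1.0]
level_ids = [0, 0, 1, 1, 1, 1]

per_level = segment_mean(sq_params, level_ids)
mean_of_level_means = sum(per_level) / len(per_level)
global_mean = sum(sq_params) / len(sq_params)
print(mean_of_level_means, global_mean)
```

The mean of per-level means weights every level equally, while the global mean weights every parameter equally, so the two only diverge when levels have different sizes and different magnitudes.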
Following the template provided by nerfstudio, several related files have been added:
• 'zipnerf_config.py': parameter configuration.
• 'zipnerf_model.py': a model wrapper that reuses the Model class in 'internal/models.py'.
Replace 'cam_dirs' with 'directions', as discussed in this issue. With this patch you can use tools provided by nerfstudio (e.g. the viewer). Apart from the modification of cast_ray(), the original code is not affected. The replacement is needed because camera directions are not provided in nerfstudio's input data structure, RayBundle.