nerfstudio-project / nerfstudio

A collaboration friendly studio for NeRFs
https://docs.nerf.studio
Apache License 2.0

Can't train instant-ngp with custom data #812

Closed ciglenecki closed 2 years ago

ciglenecki commented 2 years ago

I created my own data and tried to train instant-ngp with the following commands:

ns-process-data images --data input_dir/ --output-dir out_dir/
ns-train instant-ngp --data out_dir/

input_dir:

├── 2022-10-19-19-14-46-2373.jpg
├── 2022-10-19-19-14-48-2374.jpg
├── 2022-10-19-19-14-50-2375.jpg
└── 20221019_191451.jpg

The error I get:

[WARNING] Not running eval iterations since only viewer is enabled. Use `--vis wandb` or `--vis tensorboard` to run with
eval instead.
disabled tensorboard/wandb event writers
[13:22:53] Auto image downscale factor of 1                                                 nerfstudio_dataparser.py:197
           Skipping 0 files in dataset split train.                                          nerfstudio_dataparser.py:94
           Auto image downscale factor of 1                                                 nerfstudio_dataparser.py:197
           Skipping 0 files in dataset split val.                                            nerfstudio_dataparser.py:94
Loading data batch ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
None
/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
  warnings.warn(
/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing `weights=AlexNet_Weights.IMAGENET1K_V1`. You can also use `weights=AlexNet_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Printing profiling stats, from longest to shortest duration in seconds
Traceback (most recent call last):
  File "/home/matej/anaconda3/envs/nerfstudio/bin/ns-train", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/matej/projects/nerfstudio/scripts/train.py", line 248, in entrypoint
    main(
  File "/home/matej/projects/nerfstudio/scripts/train.py", line 234, in main
    launch(
  File "/home/matej/projects/nerfstudio/scripts/train.py", line 173, in launch
    main_func(local_rank=0, world_size=world_size, config=config)
  File "/home/matej/projects/nerfstudio/scripts/train.py", line 87, in train_loop
    trainer.setup()
  File "/home/matej/projects/nerfstudio/nerfstudio/engine/trainer.py", line 111, in setup
    self.pipeline = self.config.pipeline.setup(
  File "/home/matej/projects/nerfstudio/nerfstudio/configs/base_config.py", line 63, in setup
    return self._target(self, **kwargs)
  File "/home/matej/projects/nerfstudio/nerfstudio/pipelines/dynamic_batch.py", line 62, in __init__
    self._update_pixel_samplers()
  File "/home/matej/projects/nerfstudio/nerfstudio/pipelines/dynamic_batch.py", line 67, in _update_pixel_samplers
    self.datamanager.eval_pixel_sampler.set_num_rays_per_batch(self.dynamic_num_rays_per_batch)
  File "/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1207, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'VanillaDataManager' object has no attribute 'eval_pixel_sampler'

Config context:

Config(
    output_dir=PosixPath('outputs'),
    method_name='instant-ngp',
    experiment_name=None,
    timestamp='2022-10-20_132428',
    machine=MachineConfig(seed=42, num_gpus=1, num_machines=1, machine_rank=0, dist_url='auto'),
    logging=LoggingConfig(
        relative_log_dir=PosixPath('.'),
        steps_per_log=10,
        max_buffer_size=20,
        local_writer=LocalWriterConfig(
            _target=<class 'nerfstudio.utils.writer.LocalWriter'>,
            enable=True,
            stats_to_track=(
                <EventName.ITER_TRAIN_TIME: 'Train Iter (time)'>,
                <EventName.TRAIN_RAYS_PER_SEC: 'Train Rays / Sec'>,
                <EventName.CURR_TEST_PSNR: 'Test PSNR'>,
                <EventName.VIS_RAYS_PER_SEC: 'Vis Rays / Sec'>,
                <EventName.TEST_RAYS_PER_SEC: 'Test Rays / Sec'>
            ),
            max_log_size=10
        ),
        enable_profiler=True
    ),
    viewer=ViewerConfig(
        relative_log_filename='viewer_log_filename.txt',
        start_train=True,
        zmq_port=None,
        launch_bridge_server=True,
        websocket_port=7007,
        ip_address='127.0.0.1',
        num_rays_per_chunk=64000
    ),
    trainer=TrainerConfig(
        steps_per_save=2000,
        steps_per_eval_batch=500,
        steps_per_eval_image=500,
        steps_per_eval_all_images=25000,
        max_num_iterations=30000,
        mixed_precision=True,
        relative_model_dir=PosixPath('nerfstudio_models'),
        save_only_latest_checkpoint=True,
        load_dir=None,
        load_step=None,
        load_config=None
    ),
    pipeline=DynamicBatchPipelineConfig(
        _target=<class 'nerfstudio.pipelines.dynamic_batch.DynamicBatchPipeline'>,
        datamanager=VanillaDataManagerConfig(
            _target=<class 'nerfstudio.data.datamanagers.VanillaDataManager'>,
            dataparser=NerfstudioDataParserConfig(
                _target=<class 'nerfstudio.data.dataparsers.nerfstudio_dataparser.Nerfstudio'>,
                data=PosixPath('out_dir'),
                scale_factor=1.0,
                downscale_factor=None,
                scene_scale=1.0,
                orientation_method='up',
                center_poses=True,
                auto_scale_poses=True,
                train_split_percentage=0.9
            ),
            train_num_rays_per_batch=8192,
            train_num_images_to_sample_from=-1,
            eval_num_rays_per_batch=1024,
            eval_num_images_to_sample_from=-1,
            eval_image_indices=(0,),
            camera_optimizer=CameraOptimizerConfig(
                _target=<class 'nerfstudio.cameras.camera_optimizers.CameraOptimizer'>,
                mode='off',
                position_noise_std=0.0,
                orientation_noise_std=0.0,
                optimizer=AdamOptimizerConfig(
                    _target=<class 'torch.optim.adam.Adam'>,
                    lr=0.0006,
                    eps=1e-15,
                    weight_decay=0
                ),
                scheduler=SchedulerConfig(
                    _target=<class 'nerfstudio.engine.schedulers.ExponentialDecaySchedule'>,
                    lr_final=5e-06,
                    max_steps=10000
                ),
                param_group='camera_opt'
            )
        ),
        model=InstantNGPModelConfig(
            _target=<class 'nerfstudio.models.instant_ngp.NGPModel'>,
            enable_collider=False,
            collider_params=None,
            loss_coefficients={'rgb_loss_coarse': 1.0, 'rgb_loss_fine': 1.0},
            eval_num_rays_per_chunk=8192,
            max_num_samples_per_ray=24,
            grid_resolution=128,
            contraction_type=<ContractionType.UN_BOUNDED_SPHERE: 2>,
            cone_angle=0.004,
            render_step_size=0.01,
            near_plane=0.05,
            far_plane=1000.0,
            use_appearance_embedding=False,
            randomize_background=True
        ),
        target_num_samples=262144,
        max_num_samples_per_ray=1024
    ),
    optimizers={
        'fields': {
            'optimizer': AdamOptimizerConfig(
                _target=<class 'torch.optim.adam.Adam'>,
                lr=0.01,
                eps=1e-15,
                weight_decay=0
            ),
            'scheduler': None
        }
    },
    vis='viewer',
    data=PosixPath('out_dir')
)
tancik commented 2 years ago

In your out_dir/ there should be a transforms.json. If you take a look at that file, it should list all the images. How many do you see?
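One way to answer that without reading the file by eye is a small script. This is a minimal sketch, assuming only the layout shown in this thread (a top-level "frames" list whose entries carry a "file_path" relative to the folder holding transforms.json); the helper name `count_frames` is made up for illustration and is not part of nerfstudio.

```python
import json
from pathlib import Path


def count_frames(transforms_path):
    """Return (number of frames, file_path entries whose image is missing on disk).

    Hypothetical helper, not a nerfstudio API. Assumes each frame's
    "file_path" is relative to the directory containing transforms.json.
    """
    transforms_path = Path(transforms_path)
    with transforms_path.open() as f:
        meta = json.load(f)

    frames = meta.get("frames", [])
    root = transforms_path.parent
    missing = [
        frame["file_path"]
        for frame in frames
        if not (root / frame["file_path"]).exists()
    ]
    return len(frames), missing
```

Calling `count_frames("out_dir/transforms.json")` returns the frame count together with any listed images that are not actually on disk, which can then be compared against the number of input images.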

ciglenecki commented 2 years ago

There are 10 elements (frames). There were 10 images in the input_dir so I'm guessing the transforms.json is fine.

transforms.json:

{
    "fl_x": 1576.078390249128,
    "fl_y": 1573.7989096363758,
    "cx": 981.3623899513723,
    "cy": 770.0756596109569,
    "w": 2000,
    "h": 1500,
    "camera_model": "OPENCV",
    "k1": 0.02908676582308789,
    "k2": -0.047658955250196165,
    "p1": 0.0010081556988369774,
    "p2": 0.0011319306469034032,
    "frames": [
        {
            "file_path": "images/frame_00010.jpg",
            "transform_matrix": [
                [
                    -0.8166114393288079,
                    0.41892523514049446,
                    -0.39704836546674577,
                    -2.4375610219165464
                ],
                [
                    -0.14091166558603932,
                    -0.8117852115958311,
                    -0.5666998083077851,
                    -4.779443261927901
                ],
                [
                    -0.5597228418236129,
                    -0.40682479963344625,
                    0.7219445427067824,
                    -0.28845631170363795
                ],
                [
                    0.0,
                    0.0,
                    0.0,
                    1.0
                ]
            ]
        },
    ... # 9 more elements here
    ]
}
tancik commented 2 years ago

Which version of nerfstudio are you running? Can you try running with the latest version?

ciglenecki commented 2 years ago

I just pulled the new version (0.1.6) and ran the same commands. I got a different error: AttributeError: 'NoneType' object has no attribute 'ContractionType'

edit: please ignore the issue for now, it seems that I didn't properly install the CUDA toolkit ("NerfAcc: No CUDA toolkit found. NerfAcc will be disabled")
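For anyone hitting the same message, a quick way to see whether a CUDA toolkit is visible from the current environment is sketched below. This is an assumption-laden heuristic (it looks at the `CUDA_HOME` / `CUDA_PATH` environment variables and at `nvcc` on the PATH), not necessarily the check NerfAcc itself performs; `find_cuda_toolkit` is a hypothetical helper name.

```python
import os
import shutil


def find_cuda_toolkit(env=None, which=shutil.which):
    """Best-effort guess at the CUDA toolkit root, or None if nothing is found.

    Hypothetical helper: a heuristic only, not NerfAcc's own detection logic.
    `env` and `which` are injectable so the function can be exercised
    without a real CUDA install.
    """
    env = os.environ if env is None else env

    # Common environment variables pointing at the toolkit root.
    for var in ("CUDA_HOME", "CUDA_PATH"):
        path = env.get(var)
        if path:
            return path

    # Fall back to locating nvcc, which normally lives in <toolkit>/bin/nvcc.
    nvcc = which("nvcc")
    if nvcc:
        return os.path.dirname(os.path.dirname(nvcc))

    return None
```

If `find_cuda_toolkit()` returns None inside the nerfstudio conda environment, the toolkit (including `nvcc`) is probably not installed or not on the PATH, which would be consistent with the "No CUDA toolkit found" message.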

log:

```
(nerfstudio) matej@doom:~/projects/nerfstudio$ ns-process-data images --data input_dir/ --output-dir out_dir
[01:25:14] 🎉 Done copying images.                                                          process_data.py:283
           🎉 Done downscaling images.                                                      process_data.py:322
[01:25:15] 🎉 Done extracting COLMAP features.                                              process_data.py:363
           🎉 Done matching COLMAP features.                                                process_data.py:377
[01:25:19] 🎉 Done COLMAP bundle adjustment.                                                process_data.py:399
           🎉 Done refining intrinsics.                                                     process_data.py:408
────────────────────────────────────────────── 🎉 🎉 🎉 All DONE 🎉 🎉 🎉 ──────────────────────────────────────────────
Starting with 10 images
We downsampled the images by 2x, 4x and 8x
Colmap matched 10 images
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
(nerfstudio) matej@doom:~/projects/nerfstudio$ ns-train instant-ngp --data out_dir/
[01:25:25] Using --data alias for --data.pipeline.datamanager.dataparser.data                          train.py:223
──────────────────────────────────────────────────────── Config ────────────────────────────────────────────────────────
Config(
    output_dir=PosixPath('outputs'),
    method_name='instant-ngp',
    experiment_name=None,
    timestamp='2022-10-22_012525',
    machine=MachineConfig(seed=42, num_gpus=1, num_machines=1, machine_rank=0, dist_url='auto'),
    logging=LoggingConfig(
        relative_log_dir=PosixPath('.'),
        steps_per_log=10,
        max_buffer_size=20,
        local_writer=LocalWriterConfig(
            _target=,
            enable=True,
            stats_to_track=( , , , , ),
            max_log_size=10
        ),
        enable_profiler=True
    ),
    viewer=ViewerConfig(
        relative_log_filename='viewer_log_filename.txt',
        start_train=True,
        zmq_port=None,
        launch_bridge_server=True,
        websocket_port=7007,
        ip_address='127.0.0.1',
        num_rays_per_chunk=64000
    ),
    trainer=TrainerConfig(
        steps_per_save=2000,
        steps_per_eval_batch=500,
        steps_per_eval_image=500,
        steps_per_eval_all_images=25000,
        max_num_iterations=30000,
        mixed_precision=True,
        relative_model_dir=PosixPath('nerfstudio_models'),
        save_only_latest_checkpoint=True,
        load_dir=None,
        load_step=None,
        load_config=None
    ),
    pipeline=DynamicBatchPipelineConfig(
        _target=,
        datamanager=VanillaDataManagerConfig(
            _target=,
            dataparser=NerfstudioDataParserConfig(
                _target=,
                data=PosixPath('out_dir'),
                scale_factor=1.0,
                downscale_factor=None,
                scene_scale=1.0,
                orientation_method='up',
                center_poses=True,
                auto_scale_poses=True,
                train_split_percentage=0.9
            ),
            train_num_rays_per_batch=8192,
            train_num_images_to_sample_from=-1,
            eval_num_rays_per_batch=1024,
            eval_num_images_to_sample_from=-1,
            eval_image_indices=(0,),
            camera_optimizer=CameraOptimizerConfig(
                _target=,
                mode='off',
                position_noise_std=0.0,
                orientation_noise_std=0.0,
                optimizer=AdamOptimizerConfig(
                    _target=,
                    lr=0.0006,
                    eps=1e-15,
                    weight_decay=0
                ),
                scheduler=SchedulerConfig(
                    _target=,
                    lr_final=5e-06,
                    max_steps=10000
                ),
                param_group='camera_opt'
            )
        ),
        model=InstantNGPModelConfig(
            _target=,
            enable_collider=False,
            collider_params=None,
            loss_coefficients={'rgb_loss_coarse': 1.0, 'rgb_loss_fine': 1.0},
            eval_num_rays_per_chunk=8192,
            max_num_samples_per_ray=24,
            grid_resolution=128,
            contraction_type=,
            cone_angle=0.004,
            render_step_size=0.01,
            near_plane=0.05,
            far_plane=1000.0,
            use_appearance_embedding=False,
            randomize_background=True
        ),
        target_num_samples=262144,
        max_num_samples_per_ray=1024
    ),
    optimizers={
        'fields': {
            'optimizer': AdamOptimizerConfig(
                _target=,
                lr=0.01,
                eps=1e-15,
                weight_decay=0
            ),
            'scheduler': None
        }
    },
    vis='viewer',
    data=PosixPath('out_dir')
)
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[01:25:25] Saving config to: outputs/out_dir/instant-ngp/2022-10-22_012525/config.yml                 base_config.py:266
[01:25:25] Saving checkpoints to: outputs/out_dir/instant-ngp/2022-10-22_012525/nerfstudio_models     trainer.py:89
Using ZMQ port: 52861
========================================================================================================================
[Public] Open the viewer at https://viewer.nerf.studio/versions/22-10-13-0/?websocket_url=ws://localhost:7007
========================================================================================================================
Sending ping to the viewer Bridge Server...
Successfully connected.
Sending ping to the viewer Bridge Server...
Successfully connected.
[WARNING] Not running eval iterations since only viewer is enabled. Use `--vis wandb` or `--vis tensorboard` to run with eval instead.
disabled tensorboard/wandb event writers
[01:25:25] Auto image downscale factor of 2                                                 nerfstudio_dataparser.py:197
           Skipping 0 files in dataset split train.                                          nerfstudio_dataparser.py:94
           Auto image downscale factor of 2                                                 nerfstudio_dataparser.py:197
           Skipping 0 files in dataset split val.                                            nerfstudio_dataparser.py:94
Loading data batch ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
Loading data batch ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
None
/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
  warnings.warn(
/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing `weights=AlexNet_Weights.IMAGENET1K_V1`. You can also use `weights=AlexNet_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
No checkpoints to load, training from scratch
NerfAcc: No CUDA toolkit found. NerfAcc will be disabled.
Printing profiling stats, from longest to shortest duration in seconds
Traceback (most recent call last):
  File "/home/matej/anaconda3/envs/nerfstudio/bin/ns-train", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/matej/projects/nerfstudio/scripts/train.py", line 248, in entrypoint
    main(
  File "/home/matej/projects/nerfstudio/scripts/train.py", line 234, in main
    launch(
  File "/home/matej/projects/nerfstudio/scripts/train.py", line 173, in launch
    main_func(local_rank=0, world_size=world_size, config=config)
  File "/home/matej/projects/nerfstudio/scripts/train.py", line 88, in train_loop
    trainer.train()
  File "/home/matej/projects/nerfstudio/nerfstudio/engine/trainer.py", line 141, in train
    callback.run_callback_at_location(
  File "/home/matej/projects/nerfstudio/nerfstudio/engine/callbacks.py", line 106, in run_callback_at_location
    self.run_callback(step=step)
  File "/home/matej/projects/nerfstudio/nerfstudio/engine/callbacks.py", line 93, in run_callback
    self.func(*self.args, **self.kwargs, step=step)
  File "/home/matej/projects/nerfstudio/nerfstudio/models/instant_ngp.py", line 144, in update_occupancy_grid
    self.occupancy_grid.every_n_step(
  File "/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/nerfacc/grid.py", line 271, in every_n_step
    self._update(
  File "/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/nerfacc/grid.py", line 224, in _update
    x = contract_inv(
  File "/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/nerfacc/contraction.py", line 101, in contract_inv
    ctype = type.to_cpp_version()
  File "/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/nerfacc/contraction.py", line 62, in to_cpp_version
    return _C.ContractionTypeGetter(self.value)
  File "/home/matej/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/nerfacc/cuda/__init__.py", line 13, in call_cuda
    return getattr(_C, name)(*args, **kwargs)
AttributeError: 'NoneType' object has no attribute 'ContractionType'
```