Hello,
I installed Kaolin-wisp a few days ago on Ubuntu 20.04 (compilation crashes on Windows). I followed the Quickstart installation exactly, copy-pasting the commands. I ran into several problems, which I managed to solve, but I have now been stuck for a day on an intermittent error: it almost never occurs at the same iteration, but it is always raised by the same line of code. The error only occurs when I launch the app with the GUI; without the GUI everything works (although slowly, taking 20-30 s to validate one FHD image). Note that even when the GUI launches successfully, each refresh takes more than a second and nothing in the GUI responds (I can click buttons or change values in fields, but nothing happens afterwards). Here is the console log:
$ __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia python app/nerf/main_nerf.py --dataset-path training_data/fox --config app/nerf/configs/nerf_hash.yaml
apex import failed. apex optimizer will not be available
blas
constructor: OctreeAS.make_dense
level: 7
grid
constructor: HashGrid.from_geometric
feature_dim: 2
num_lods: 16
multiscale_type: cat
feature_std: 1e-09
feature_bias: 0.0
codebook_bitwidth: 19
min_grid_res: 16
max_grid_res: 1024
nef
constructor: NeuralRadianceField
pos_embedder: none
view_embedder: positional
pos_multires: 10
view_multires: 4
position_input: False
activation_type: relu
layer_type: linear
hidden_dim: 64
num_layers: 1
bias: True
prune_density_decay: 0.95
prune_min_density: 2.956033378250884
tracer
constructor: PackedRFTracer
raymarch_type: uniform
num_steps: 512
step_size: 1.0
bg_color: (0.0, 0.0, 0.0)
dataset
constructor: NeRFSyntheticDataset
dataset_path: training_data/fox
split: train
bg_color: (0.0, 0.0, 0.0)
mip: 0
dataset_num_workers: -1
transform: None
dataset_transform
constructor: SampleRays
num_samples: 4096
trainer
optimizer
constructor: Adam
lr: 0.001
betas: (0.9, 0.999)
eps: 1e-16
weight_decay: 1e-06
dataloader
batch_size: 1
num_workers: 0
exp_name: nerf-hash
mode: train
max_epochs: 10
save_every: -1
save_as_new: False
model_format: full
render_every: -1
valid_every: -1
valid_split: test
enable_amp: True
profile_nvtx: False
grid_lr_weight: 500.0
scheduler: True
scheduler_milestones: (0.5, 0.75, 0.9)
scheduler_gamma: 0.333
valid_metrics: ('psnr',)
start_prune: 1000
prune_every: 100
random_lod: False
rgb_lambda: 1.0
opacity_loss: 0.0
rgb_loss_type: huber
rgb_loss_denom: rays
target_sample_size: 262144
save_valid_imgs: False
tracker
tensorboard
constructor: _Tensorboard
log_dir: _results/logs/runs
exp_name: None
log_fname: None
wandb
constructor: _WandB
entity: None
project: wisp-nerf
group: None
run_name: None
job_type: train
sync_tensorboard: True
visualizer
constructor: OfflineRenderer
render_res: (1024, 1024)
render_batch: 10000
shading_mode: rb
matcap_path: ./data/matcap/Pearl.png
shadow: False
ao: False
perf: False
vis_camera
camera_origin: (-3.0, 0.65, -3.0)
camera_lookat: (0.0, 0.0, 0.0)
camera_fov: 30.0
camera_clamp: (0.0, 10.0)
viz360_num_angles: 20
viz360_radius: 3.0
viz360_render_all_lods: False
enable_tensorboard: True
enable_wandb: False
log_dir: _results/logs/runs
log_level: 20
pretrained: None
device: cuda
interactive: True
loading data: 100%|████████████████████████████| 33/33 [00:00<00:00, 272.54it/s]
2023-09-25 16:51:45,977| INFO| WARNING: The dataset expects distortion correction, but the current implementation does not handle this.
/home/greau-hamard/anaconda3/envs/wisp/lib/python3.9/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2894.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
2023-09-25 16:51:48,247| INFO| Using NVIDIA RTX A4000 Laptop GPU with CUDA v11.3
2023-09-25 16:51:48,247| INFO| Total number of parameters: 11431941
[i] Using PYGLFW_IMGUI (GL 3.3)
2023-09-25 16:51:48,638| INFO| [i] Using PYGLFW_IMGUI (GL 3.3)
[i] Running at 60 frames/second
2023-09-25 16:51:48,654| INFO| [i] Running at 60 frames/second
Traceback (most recent call last):
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/app/nerf/main_nerf.py", line 133, in <module>
    app.run() # Run in interactive mode
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/renderer/app/wisp_app.py", line 267, in run
    app.run() # App clock should always run as frequently as possible (background tasks should not be limited)
  File "/home/greau-hamard/anaconda3/envs/wisp/lib/python3.9/site-packages/glumpy/app/__init__.py", line 362, in run
    run(duration, framecount)
  File "/home/greau-hamard/anaconda3/envs/wisp/lib/python3.9/site-packages/glumpy/app/__init__.py", line 344, in run
    count = __backend__.process(dt)
  File "/home/greau-hamard/anaconda3/envs/wisp/lib/python3.9/site-packages/glumpy/app/window/backends/backend_glfw_imgui.py", line 448, in process
    window.dispatch_event('on_draw', dt)
  File "/home/greau-hamard/anaconda3/envs/wisp/lib/python3.9/site-packages/glumpy/app/window/event.py", line 396, in dispatch_event
    if getattr(self, event_type)(*args):
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/renderer/app/wisp_app.py", line 557, in on_draw
    self.render() # Render objects uploaded to GPU
  File "/home/greau-hamard/anaconda3/envs/wisp/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/renderer/app/wisp_app.py", line 36, in _enable_amp
    return func(self, *args, **kwargs)
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/renderer/app/wisp_app.py", line 525, in render
    img, depth_img = self.render_canvas(self.render_core, dt, self.canvas_dirty)
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/renderer/app/wisp_app.py", line 414, in render_canvas
    renderbuffer = render_core.render(time_delta, force_render)
  File "/home/greau-hamard/anaconda3/envs/wisp/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/renderer/core/render_core.py", line 31, in _enable_amp
    return func(self, *args, **kwargs)
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/renderer/core/render_core.py", line 223, in render
    rb = self._render_payload(payload, force_render)
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/renderer/core/render_core.py", line 342, in _render_payload
    rb = renderer.render(in_rays)
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/renderer/core/renderers/radiance_pipeline_renderer.py", line 71, in render
    rb += self.tracer(self.nef,
  File "/home/greau-hamard/anaconda3/envs/wisp/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/tracers/base_tracer.py", line 161, in forward
    rb = self.trace(nef, rays, requested_channels, requested_extra_channels, **input_args)
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/tracers/packed_rf_tracer.py", line 117, in trace
    raymarch_results = nef.grid.raymarch(rays,
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/models/grids/hash_grid.py", line 236, in raymarch
    return self.blas.raymarch(rays, raymarch_type=raymarch_type, num_samples=num_samples, level=self.blas.max_level)
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/accelstructs/octree_as.py", line 427, in raymarch
    raymarch_results = self._raymarch_uniform(rays=rays, num_samples=num_samples, level=level)
  File "/home/greau-hamard/Téléchargements/kaolin-wisp/wisp/accelstructs/octree_as.py", line 356, in _raymarch_uniform
    results = wisp_C.ops.uniform_sample_cuda(scale, filtered_ridx.contiguous(), filtered_depth, insum)
RuntimeError: CUDA error: invalid configuration argument
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
The function where the crash occurs, wisp_C.ops.uniform_sample_cuda, is implemented in a compiled shared object library, so I cannot see how it works internally. I checked the data fed into it: I see no difference between runs with and without the GUI, and the shapes match what is expected. I also checked that the GPU is not running out of memory or capacity at the moment of the crash. Given how generic the error message is, I have no other ideas about what to look at. Could you help me find where the problem comes from?
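For context, CUDA generally reports "invalid configuration argument" when a kernel is launched with an illegal grid or block size, for example a grid of 0 blocks or a block of more than 1024 threads. Below is a minimal sketch of that rule in plain Python. It assumes, without knowledge of the actual kernel, that uniform_sample_cuda derives a 1D grid size from the number of input elements; the helper name check_launch_config and the default of 256 threads per block are hypothetical and not taken from Kaolin-wisp.

```python
import math

# Launch limits for CUDA devices of compute capability 3.0 and newer.
MAX_THREADS_PER_BLOCK = 1024
MAX_GRID_DIM_X = 2**31 - 1


def check_launch_config(num_elements, threads_per_block=256):
    """Return (ok, reason) for a 1D launch covering num_elements items.

    Hypothetical helper: it mirrors the common pattern
    num_blocks = ceil(num_elements / threads_per_block),
    which is an assumption about how the real kernel is launched.
    """
    if not 1 <= threads_per_block <= MAX_THREADS_PER_BLOCK:
        return False, f"block size {threads_per_block} is outside 1..{MAX_THREADS_PER_BLOCK}"
    num_blocks = math.ceil(num_elements / threads_per_block)
    if num_blocks < 1:
        # An empty input yields a grid of 0 blocks, which CUDA rejects.
        return False, "grid size is 0 (empty input)"
    if num_blocks > MAX_GRID_DIM_X:
        return False, f"grid size {num_blocks} exceeds {MAX_GRID_DIM_X}"
    return True, f"{num_blocks} blocks x {threads_per_block} threads"


if __name__ == "__main__":
    for n in (4096, 0):
        print(n, check_launch_config(n))
```

If that assumption holds, an input that is occasionally empty (for instance, a frame in which no rays hit the octree) would explain an error that appears at a different iteration each time. Logging filtered_ridx.numel() just before the call, together with running under CUDA_LAUNCH_BLOCKING=1 as the error message suggests, may confirm or rule this out.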