Haven-Lau opened this issue 2 weeks ago
@Haven-Lau, I have seen a similar phenomenon in my early experiments. What I found was that the viewer can crash if you move the camera in the viewer so that it does not see the Gaussian scene properly (like turning 180 degrees and looking at nothing). I wonder if this is the same issue for you.
@maturk Thanks for the quick reply!
Yes, it does sound similar. I find it crashes regardless of where I'm looking (eventually, if I have the viewer running), but it is definitely more likely to crash when I pan around a lot very quickly or stare at nothing. I wonder if it's a race condition between `ns-train dn-splatter` and the viewer. However, today I had my first crash without the viewer running. Is there a way to save checkpoints throughout the training process instead of only at 100%?
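For reference on the checkpoint question, nerfstudio's trainer seems to expose checkpoint-interval options; a sketch, treating the exact flag names and defaults as assumptions to verify against `ns-train dn-splatter --help`:

```shell
# Sketch (flag names assumed from nerfstudio's TrainerConfig): save a
# checkpoint every 2000 steps and keep all of them rather than only the
# most recent one. DATA_DIR is a placeholder for your dataset path.
ns-train dn-splatter --data DATA_DIR \
    --steps-per-save 2000 \
    --save-only-latest-checkpoint False
```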
Just to make sure, you are using nerfstudio v1.1.3 and gsplat v1.0.0?
Correct
```
# Name                    Version   Build    Channel
nerfstudio                1.1.3     pypi_0   pypi
gsplat                    1.0.0     pypi_0   pypi
```
I'm running Windows; hopefully that's not the cause. I can try to spin up an Ubuntu environment at some point, since I couldn't get the download scripts running on Windows anyway (due to OS-specific CLI commands, I think).
Have you tried any other dataset to see if it occurs there too? I am wondering if there are some issues with the optimization (densification/culling) due to the depth supervision. Pictures of the scene at or near the crash might help me debug as well. Maybe try turning off the depth loss and see if the crash still happens in that scenario.
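An ablation along these lines could isolate the depth term (the flag is the same one used elsewhere in this thread; DATA_DIR is a placeholder):

```shell
# Ablation sketch: same training run, but with depth supervision
# explicitly disabled, to see whether the CUDA crash still occurs.
# DATA_DIR is a placeholder for the dataset path.
ns-train dn-splatter --data DATA_DIR \
    --pipeline.model.use-depth-loss False
```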
For my own scene I turned on `--pipeline.model.use-normal-loss True --pipeline.model.use-normal-tv-loss True`, and that caused it to crash at 51% (15400 steps). Without the normal losses it trains to 100% without crashing; I'm not loading any normal maps.
This is what it looked like ~1000 steps before the crash (this time, with the viewer on, it crashed at 12xxx steps instead of 15xxx):
This is one of the training inputs:
I'll try training again with one of the mushroom dataset and report back
I tried the MuSHRoom honka Kinect short raw dataset, processed the raw camera and depth MKVs with `process_sai.py`, and then ran:

```shell
ns-train dn-splatter --data data\honka_processed
--pipeline.model.use-depth-loss True
--pipeline.model.depth-lambda 0.2
--pipeline.model.use-normal-loss True
--pipeline.model.use-normal-tv-loss True
--pipeline.model.normal-supervision depth
normal-nerfstudio --load-normals False
```
This time I was able to use normal-loss and normal-tv-loss without crashing.

However, this time I saw a degradation issue similar to https://github.com/maturk/dn-splatter/issues/68, where towards the end of the training process a bunch of big splats got introduced and some surfaces now have holes. Could you see if you can reproduce similar issues with the MuSHRoom dataset using the same steps?

Eventually I want to figure out how to process my own raw Kinect data using the same steps as demonstrated in the MuSHRoom paper, since the output from that seems to be very good. There seems to be quite a gap between processing raw Kinect data with the `process_sai` tool and using the preprocessed Kinect data provided by the MuSHRoom dataset. Or perhaps do you think it is my `ns-train` params?
Hi @Haven-Lau, may I ask which camera poses you used for the MuSHRoom dataset? The MuSHRoom dataparsers in dn-splatter also support the Kinect sequence; the command can look like:
```shell
ns-train dn-splatter --data mushroom_sequence
--pipeline.model.use-depth-loss True
--pipeline.model.depth-lambda 0.2
--pipeline.model.use-normal-loss True
--pipeline.model.use-normal-tv-loss True
--pipeline.model.normal-supervision depth
mushroom --load-normals False --mode kinect
```
Hi, first of all I just wanted to thank you for this amazing project! I've wanted to leverage a depth camera as a prior for training Gaussian splats for a while, and I can't believe it took me this long to stumble upon this project.
I'm currently facing this issue when training with nerfstudio viewer on:
My data was captured with an Azure Kinect sensor using SAI. At first I used the included `process_sai.py` to preprocess the recorded data, but the resulting `transforms.json` gave camera intrinsics that nerfstudio's undistort function didn't like (k4 wasn't 0), so I copied the camera intrinsic values from `sai-cli process` (which gave k1 = k2 = p1 = p2 = 0; I'm not sure the intrinsic values matter much on the Kinect), and training starts properly now.

When I run `ns-train` with the nerfstudio web viewer on, it throws a CUDA illegal memory access error at around 5000-7000 steps; without the web viewer running, it trains without complaining. I've tried running `ns-train` multiple times with and without the web viewer, and it only fails when the web viewer is running. Has anyone seen similar behavior?

System info:
Thanks again
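As a footnote on the intrinsics workaround above: zeroing the distortion coefficients can be scripted. A minimal sketch, assuming nerfstudio-style top-level `k1`/`k2`/`k3`/`k4`/`p1`/`p2` keys in `transforms.json` (your file's layout may differ); the toy file is written first purely for illustration:

```shell
# Write a toy transforms.json with a nonzero k4, purely for illustration;
# in practice you would operate on the file process_sai.py produced.
printf '{"fl_x": 601.9, "fl_y": 601.9, "k1": 0.08, "k2": -0.04, "k4": 0.37}' > transforms.json

# Zero every distortion coefficient present in the file, so nerfstudio's
# undistortion path accepts the intrinsics. This is a workaround, not a
# calibration fix: it simply discards the distortion model.
python3 - <<'EOF'
import json

with open("transforms.json") as f:
    meta = json.load(f)

for key in ("k1", "k2", "k3", "k4", "p1", "p2"):
    if key in meta:
        meta[key] = 0.0

with open("transforms.json", "w") as f:
    json.dump(meta, f, indent=2)
EOF
```

Whether discarding distortion is acceptable depends on the sensor; it matches what `sai-cli process` produced in my case (all-zero coefficients).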