georghess / neurad-studio

[CVPR2024] NeuRAD: Neural Rendering for Autonomous Driving
https://research.zenseact.com/publications/neurad/
Apache License 2.0

Weird output on nuscenes #30

Closed geekJZY closed 3 weeks ago

geekJZY commented 3 months ago

Hi,

Describe the bug Thanks for contributing this great work. I tried it out on nuScenes, but I get some weird, low-quality outputs, so I want to double-check with you.

To Reproduce

I train the model on Nuscenes with

CUDA_VISIBLE_DEVICES=0 python nerfstudio/scripts/train.py neurader \
--pipeline.datamanager.num_processes=0 --vis tensorboard nuscenes-data 

Then I generated the video with

CUDA_VISIBLE_DEVICES=0 python nerfstudio/scripts/render.py interpolate \
--output_path ./renders/neurader.mp4 \
--load_config /net/acadia14a/data/zjiang/projects/Diffusion/code/neurad-studio/outputs/unnamed/neurader/2024-06-03_112521/config.yml

Expected behavior I expect the model to generate images with the quality shown in the paper.

Screenshots Here are the screenshots of the results I got.

image

image

https://github.com/georghess/neurad-studio/assets/31909301/f6809578-277a-468f-be01-0462dc90243b

atonderski commented 3 months ago

There seem to be multiple weird things going on here.

First, the initial rendering quality is atrocious; we need to investigate what is going on here. Since nuScenes lacks full pose information (the z-component is missing), and the initial part is slightly downhill, I suspect this sequence requires pose optimization. Please try training with --pipeline.model.camera_optimizer.mode=SO3xR3
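For example, your original training command with pose optimization enabled would look roughly like this (untested sketch, everything else unchanged):

CUDA_VISIBLE_DEVICES=0 python nerfstudio/scripts/train.py neurader \
--pipeline.model.camera_optimizer.mode=SO3xR3 \
--pipeline.datamanager.num_processes=0 --vis tensorboard nuscenes-data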

Secondly, when the static scene starts to look more reasonable (the second half of the video, where the road has flattened out), the actors look really bad. This seems to be due to time being frozen, which is controlled by the rendering script. I will look into this.

atonderski commented 3 months ago

Actors stopping was due to a poorly chosen default value in the rendering script. This was fixed here, so if you rerun on the latest master, the second half of the sequence should look much better :)

geekJZY commented 3 months ago

Thanks for your reply! I have rerun on the latest master. I also added the camera optimizer by using the neurad-scaleopt config.

Specifically, I trained the model on nuScenes with

CUDA_VISIBLE_DEVICES=0 python nerfstudio/scripts/train.py neurad-scaleopt \
--pipeline.datamanager.num_processes=0 --vis tensorboard nuscenes-data 

Then I generated the video with

CUDA_VISIBLE_DEVICES=0 python nerfstudio/scripts/render.py interpolate \
--output_path ./renders/neurader.mp4 \
--load_config outputs/nuscene/neurad-scaleopt/2024-06-11_110714/config.yml

This is the video I get:

https://github.com/georghess/neurad-studio/assets/31909301/faceebcd-a704-43aa-b380-785f1a11503a

It is better than the previous version, but there are still some artifacts. Please let me know if there are other ways to improve the quality. Thanks!

Andyshen555 commented 2 months ago

@atonderski When I tried to reconstruct nuScenes videos, I found that the output is blurry and flickering. Also, the vehicles shake left and right. My training configuration is below:

--max-num-iterations 60001 \
--machine.num-devices 2 \
--pipeline.model.eval_num_rays_per_chunk 32768 \
--pipeline.datamanager.num_processes 8 \
--pipeline.datamanager.train-num-lidar-rays-per-batch 16384 \
--pipeline.datamanager.eval-num-lidar-rays-per-batch 8192 \
--pipeline.datamanager.train-num-rays-per-batch 49152 \
--pipeline.datamanager.eval-num-rays-per-batch 49152 \
--pipeline.datamanager.train-num-images-to-sample-from 128 \
--pipeline.datamanager.train-num-times-to-repeat-images 256 \
--pipeline.model.sampling.proposal-field-1.grid.static.log2-hashmap-size 22 \
--pipeline.model.camera-optimizer.mode SO3xR3 \
nuscenes-data \
--data=/mnt/ssd/NIO/data/nuScene-mini \
--sequence=scene-0061 \
--dataset_end_fraction=1.0 \
--version=v1.0-mini \
--cameras=all

I used interpolate for the render, and it looks like this:

https://github.com/user-attachments/assets/b715e68f-fb7f-4fce-a0f8-1aa35474abba

Can you please provide some insight into what I am doing wrong here?

Also, it would be helpful if you could provide the training config you used for nuScenes in your paper. Thank you.

atonderski commented 2 months ago

nuScenes is finicky due to a sometimes extremely wrong z-component in the poses. You can use neurad-scaleopt for a version that specifically targets the height component during optimization; we have found slightly better results with that.
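For example, keeping the rest of your flags unchanged and only swapping the method name (rough sketch):

python nerfstudio/scripts/train.py neurad-scaleopt \
--pipeline.model.camera-optimizer.mode SO3xR3 \
... \
nuscenes-data --data=/mnt/ssd/NIO/data/nuScene-mini --sequence=scene-0061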

As for how to reproduce the results in the paper: we exactly followed the evaluation protocol of S-NeRF, which selects some nicely behaved sequences and trains only on the middle part of those sequences.
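If you want to mimic that protocol, you can restrict training to a slice of the sequence via the dataparser fractions. A rough sketch with made-up fractions, and assuming a --dataset_start_fraction option exists alongside the --dataset_end_fraction you already use (check the nuscenes dataparser for the exact name):

nuscenes-data \
--data=/mnt/ssd/NIO/data/nuScene-mini \
--sequence=scene-0061 \
--dataset_start_fraction=0.25 \
--dataset_end_fraction=0.75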

On your sequence, neurad with default settings and no camera optimization should get roughly the following metrics: psnr: 25.479, lpips: 0.3411, 0.7622
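To check your own numbers against these, you can evaluate a trained model on the held-out views, assuming this fork keeps nerfstudio's standard eval entrypoint (sketch, paths are placeholders):

python nerfstudio/scripts/eval.py \
--load_config outputs/<experiment>/<method>/<timestamp>/config.yml \
--output_path ./metrics.json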

Finally, we don't enforce smooth vehicle motion, which could explain the shakiness. It should be relatively straightforward to regularize the motion of actors to be smooth; please open a pull request if you implement that :)
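For anyone picking this up, here is a minimal sketch of one possible regularizer (not part of neurad-studio). It assumes you can obtain per-actor translations over time as a (num_actors, num_frames, 3) tensor, e.g. from the optimized actor trajectories, and it penalizes second-order differences (accelerations); where to hook it into the existing losses is left open.

import torch

def actor_smoothness_loss(actor_positions: torch.Tensor, weight: float = 0.01) -> torch.Tensor:
    # actor_positions: (num_actors, num_frames, 3) translations over time (assumed available).
    # Velocities between consecutive frames: (num_actors, num_frames - 1, 3).
    vel = actor_positions[:, 1:] - actor_positions[:, :-1]
    # Accelerations (second-order differences): large values indicate jerky motion.
    acc = vel[:, 1:] - vel[:, :-1]
    # Scalar penalty that can be added to the existing training losses.
    return weight * acc.pow(2).mean()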