creiser / kilonerf

Code for KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs

Resolution in pretrain_occupancy cfg file #9

Closed by DavidParamo 2 years ago

DavidParamo commented 2 years ago

Hi! First of all, I would like to thank you for your incredible work speeding up NeRF.

I'm trying to train a new model, but I'm struggling with the configuration. I've read the paper, but I can't work out what 'resolution' in the cfg files stands for, so I'm not sure which values I should use for my model. If you could guide me a little with this, I would be immensely grateful.

Thank you.

creiser commented 2 years ago

Thank you for your interest in our work.

The values defined in cfgs/paper/pretrain_occupancy/SCENE.yaml specify the resolution of the occupancy grid that is extracted from the pretrained NeRF model. The actual network resolution (i.e., how many tiny MLPs are used) is specified in cfgs/paper/distill/SCENE.yaml under the config key "fixed_resolution". As you can see from the config files, the resolution of the occupancy grid is always 16 times the network resolution.
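To make the relationship concrete, here is a minimal Python sketch of the arithmetic. It assumes "fixed_resolution" is a per-axis list of three integers and that one tiny MLP is assigned to each grid cell. The helper names and the example value [16, 16, 16] are hypothetical and are not taken from the repository or its config files.

```python
from typing import List, Sequence

# Ratio between occupancy grid and network resolution, as stated in the reply above.
OCCUPANCY_FACTOR = 16


def occupancy_resolution(fixed_resolution: Sequence[int],
                         factor: int = OCCUPANCY_FACTOR) -> List[int]:
    """Return the per-axis occupancy grid resolution for a given network resolution."""
    return [factor * r for r in fixed_resolution]


def num_tiny_mlps(fixed_resolution: Sequence[int]) -> int:
    """Number of grid cells, assuming one tiny MLP per cell (an assumption of this sketch)."""
    count = 1
    for r in fixed_resolution:
        count *= r
    return count


# Illustrative value only; pick the value that matches your own scene.
network_res = [16, 16, 16]
print(occupancy_resolution(network_res))
print(num_tiny_mlps(network_res))
```

So when choosing values for a new scene, it should be enough to pick the network resolution first and then derive the value for the pretrain_occupancy config from it.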

DavidParamo commented 2 years ago

I'm sorry for coming back to this now, but I've been working on other parts of the project because this has been my biggest issue so far. I now understand that "fixed_resolution" is set based on the dimensions of the scene, but I don't understand how you get those dimensions for a real scene. I have observed that all the scenes you have worked with are scanned or synthetic, so I was wondering if this is even possible without knowing the real dimensions of the scene (in my case, I'm working with human heads).

Also, I want to adapt this for dynamic NeRF scenes (e.g., talking faces), but it has been overwhelming so far. I would love to hear your thoughts on whether this is possible and which parts you think I should try to change. I'm a little lost inside all that CUDA code.

Thank you for your time and your awesome work.