quan-meng / gnerf

[ ICCV 2021 Oral ] Our method can jointly estimate camera poses and neural radiance fields when the cameras are initialized at random poses in complex scenarios (outside-in scenes, even with little texture or intense noise)
MIT License

Pose Distribution Prior #2

Open hmdolatabadi opened 2 years ago

hmdolatabadi commented 2 years ago

Dear Authors,

Thanks for the interesting work and for releasing the code. I was wondering about the advice in the README on training with our own data. Assuming I only have an image dataset, how can I 1) find a suitable prior distribution, and 2) train your model with it?

Thanks for your help in advance.

quan-meng commented 2 years ago

Really good questions!

Many scenes can be categorized into three cases: forward-facing, inward-facing, and randomly distributed scenes.

For forward-facing scenes, the camera poses can be optimized directly via gradient descent, so we don't have to find a prior distribution.
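To illustrate the idea, here is a minimal sketch (not the authors' code) of refining a camera pose by gradient descent. A toy quadratic loss stands in for the real render-and-compare photometric loss, and the 6-vector pose parameterization is an assumption for illustration:

```python
import numpy as np

# Toy stand-in for rendering the scene at `pose` and comparing against
# the observed image; the real loss would come from a NeRF renderer.
def toy_photometric_loss(pose, target):
    return float(np.sum((pose - target) ** 2))

def refine_pose(pose, target, lr=0.1, steps=200):
    # Plain gradient descent on the pose parameters.
    for _ in range(steps):
        grad = 2.0 * (pose - target)  # analytic gradient of the toy loss
        pose = pose - lr * grad
    return pose

target = np.array([0.1, -0.2, 0.05, 1.0, 0.0, 0.5])  # hypothetical "true" pose
pose = refine_pose(np.zeros(6), target)
```

In practice the pose would be a learnable rotation/translation optimized jointly with the radiance field, but the principle is the same: the photometric loss is differentiable in the pose, so no sampling prior is needed.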

For inward-facing scenes, the prior distribution of many scenes can be roughly represented by camera positions distributed uniformly on a sphere surface or within a spherical shell, with the look-at point near the origin. We then have to tune the parameters of this region: sphere radius, azimuth, elevation, and look-at point.
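A minimal sketch of such a prior (parameter names are illustrative, not GNeRF's API): sample a camera position inside a spherical shell bounded by azimuth, elevation, and radius ranges, then build a camera-to-world matrix that looks at the origin:

```python
import numpy as np

def sample_pose(rng, azim_range=(0.0, 360.0), elev_range=(0.0, 80.0),
                radius_range=(3.5, 4.5)):
    # Draw spherical coordinates uniformly from the configured ranges.
    azim = np.deg2rad(rng.uniform(*azim_range))
    elev = np.deg2rad(rng.uniform(*elev_range))
    r = rng.uniform(*radius_range)
    # Spherical -> Cartesian camera position (z is up).
    eye = r * np.array([np.cos(elev) * np.cos(azim),
                        np.cos(elev) * np.sin(azim),
                        np.sin(elev)])
    # Look-at frame: forward points from the camera toward the origin.
    forward = -eye / np.linalg.norm(eye)
    right = np.cross(forward, np.array([0.0, 0.0, 1.0]))
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)
    # Assemble a 4x4 camera-to-world matrix (OpenGL-style, -z forward).
    c2w = np.eye(4)
    c2w[:3, :3] = np.stack([right, up, -forward], axis=1)
    c2w[:3, 3] = eye
    return c2w
```

Tuning the prior then amounts to narrowing or widening these ranges until sampled views plausibly cover the viewpoints of the real images.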

For randomly distributed camera poses, as in a room, it is hard to define the distribution with a few parameters; we may need other pose-estimation tools to determine a rough camera region for each image, and can then set the prior distribution to the union of all regions.
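The union-of-regions idea can be sketched as a mixture: pick one per-image region at random, then sample within its bounds. The region format below (azimuth/elevation/radius bounds per image) is a hypothetical example, not a format GNeRF defines:

```python
import numpy as np

# Rough per-image camera regions, e.g. obtained from an external
# pose-estimation tool (values here are made up for illustration).
regions = [
    {"azim": (0.0, 45.0),   "elev": (0.0, 20.0),  "radius": (2.0, 3.0)},
    {"azim": (90.0, 180.0), "elev": (10.0, 40.0), "radius": (2.5, 3.5)},
]

def sample_from_union(rng, regions):
    # Pick a region uniformly, then sample each parameter within its bounds.
    region = regions[rng.integers(len(regions))]
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in region.items()}
```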

e4s2022 commented 2 years ago

Hi, I wonder whether it is possible to train a GNeRF model from a raw video, say a TV news video or a speech video that shows little motion.

[image: example video frame]

I think the pose changes in a small range, so I set the pose distribution as:

```yaml
data: obama
img_wh: [ 450, 450 ]
data_dir: ~
azim_range: [ 0., 90. ]   # the range of azimuth
elev_range: [ 0., 30. ]   # the range of elevation
radius: [ 4.0, 4.0 ]      # the range of radius
near: 2.0
far: 6.0
white_back: True
ndc: False
look_at_origin: True
pose_mode: '3d'
```

But I found that the generator could not produce valid images. At first the generated pixels look like noise, then they quickly converge to a solid red color.

[image: generated samples]

And the loss curves look like this:

[image: loss curves]

Any ideas on how to tune the pose range? Thanks in advance.