vincentfung13 / MINE

Code and models for our ICCV 2021 paper "MINE: Towards Continuous Depth MPI with NeRF for Novel View Synthesis"
MIT License
408 stars 43 forks

How to prepare my dataset? #15

Open ccsvd opened 2 years ago

ccsvd commented 2 years ago

Hi, thanks for your great work! If I want to train on my own data, how should I process it? I see the LLFF data has cameras.bin, images.bin, points3D.bin... How do I generate these? Could you share the code for that? Thanks.

vincentfung13 commented 2 years ago

You will need multi-view images to train MINE. This includes camera parameters for each image in the scene, as well as a sparse point cloud of the scene for scale calibration in case the camera parameters are estimated with structure-from-motion (SfM).

Sometimes some of the parameters are provided. For example, RealEstate10K provides the camera intrinsics and extrinsics, but since they are estimated with SfM, you will need to run a triangulation to generate the point clouds. Take a look at the point_triangulator interface of COLMAP: https://colmap.github.io/cli.html

If you are starting from scratch, you can automatically estimate all of the parameters with automatic_reconstructor, which will generate all of the .bin files in your question.
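The two COLMAP workflows above can be sketched as shell commands (a minimal sketch: the dataset paths below are placeholders for your own layout, and it assumes COLMAP is installed):

```shell
#!/bin/sh
# Placeholder path to your scene; adjust to your own dataset layout.
DATASET=/path/to/my_scene

if command -v colmap >/dev/null 2>&1; then
    # Option 1: starting from scratch with images only.
    # Writes cameras.bin, images.bin and points3D.bin under $DATASET/sparse/0.
    colmap automatic_reconstructor \
        --workspace_path "$DATASET" \
        --image_path "$DATASET/images"

    # Option 2: known intrinsics/extrinsics (e.g. RealEstate10K).
    # Only triangulates the sparse point cloud against the given poses;
    # the input model with fixed cameras must already exist.
    colmap point_triangulator \
        --database_path "$DATASET/database.db" \
        --image_path "$DATASET/images" \
        --input_path "$DATASET/sparse_known_poses" \
        --output_path "$DATASET/sparse/0"
fi
```

For option 2 you first need a COLMAP database with extracted and matched features (`colmap feature_extractor` and `colmap exhaustive_matcher`), plus a text or binary model containing the known camera poses; see the CLI docs linked above for the exact input format.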

Hope this helps.

Zijian

ccsvd commented 2 years ago

[Automatic vacation reply from QQ Mail.] Hello, I am currently on vacation and cannot reply to your email in person. I will get back to you as soon as possible after the vacation ends.

ccsvd commented 2 years ago

OK, I will check and try! I also have some new questions:

1. Could you share the training log file from the LLFF model training, so I can check whether my own training is going well?
2. What does the parameter img_pre_downsample_ratio mean? Is it fixed for every dataset?
3. The LLFF image size is 512x284. Does my data have to match the model input size, or can it be any size in my new dataset?

Thanks for the reply!

anuraguppuluri commented 2 years ago

> You will need multi-view images to train MINE. This includes camera parameters for each image in the scene, as well as a sparse point cloud of the scene for scale calibration in case the camera parameters are estimated with structure-from-motion (SfM).
>
> Sometimes some of the parameters are provided. For example, RealEstate10K provides the camera intrinsics and extrinsics, but since they are estimated with SfM, you will need to run a triangulation to generate the point clouds. Take a look at the point_triangulator interface of COLMAP: https://colmap.github.io/cli.html
>
> If you are starting from scratch, you can automatically estimate all of the parameters with automatic_reconstructor, which will generate all of the .bin files in your question.
>
> Hope this helps.
>
> Zijian

Are two views per scene sufficient for training MINE on thousands of scenes at a time?


tedyhabtegebrial commented 1 year ago

@vincentfung13 Do you have any suggestions on training without SfM, in case we already have a good estimate of the scene's near and far extent?
