Some questions when testing on other dataset.

dcharatan / pixelsplat

[CVPR 2024 Oral, Best Paper Runner-Up] Code for "pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction" by David Charatan, Sizhe Lester Li, Andrea Tagliasacchi, and Vincent Sitzmann

http://davidcharatan.com/pixelsplat/

MIT License


Some questions when testing on other dataset. #33

Closed by TQTQliu 7 months ago

TQTQliu commented 7 months ago

Thanks for your excellent open-source work. I recently applied your method to the DTU dataset and ran into some confusing results.

First, I organized the camera parameters in the format described in the README and confirmed them by visualizing epipolar lines (images: example_00, 00_000000).

I then took the two views closest to the target view as the source views (context views) and rendered four novel views with your provided pretrained model (re10k.ckpt). The first novel view is poor, and the last three have some artifacts. For the first novel view, I attached its source views (context views) and the ground truth (GT) for comparison.

I don't know if this is reasonable, considering that the model was not trained on DTU. Or, because your method focuses on wide-baseline image pairs, it may not be suitable for DTU. Looking forward to your reply. Best wishes!

dcharatan commented 7 months ago

I wouldn't expect the model to work on DTU out of the box, since DTU is an object-centric dataset with images that don't really resemble the ones in Real Estate 10k or ACID. Here are some things that might be worth trying:

- Rescaling the input views to 256x256 to match the resolution the model was trained on
- Fine-tuning on DTU or training on a similar object-centric dataset like CO3D (user FantasticOven2 in issue #25 seems to have set up CO3D training; maybe they would be willing to share their code for it)

Hope this at least helps somewhat!
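One detail worth checking when rescaling to 256x256: the intrinsics have to follow the image. The sketch below is not from the repository. It assumes pixel-space pinhole intrinsics, a resize that maps the shorter side to the target size followed by a center crop, and a normalized-intrinsics convention (focal length and principal point divided by image width and height) that you should confirm against the README. The function name is hypothetical, and rounding of the resized image size is ignored.

```python
def rescale_intrinsics(fx, fy, cx, cy, width, height, target=256):
    """Update pinhole intrinsics for a resize + center crop to target x target.

    Assumptions (not taken from the pixelSplat code): the image is resized so
    its shorter side equals `target`, then center-cropped to a square.
    """
    # Uniform scale that maps the shorter side to `target` pixels.
    scale = target / min(width, height)
    resized_w, resized_h = width * scale, height * scale

    # Resizing scales focal lengths and the principal point alike.
    fx_s, fy_s = fx * scale, fy * scale
    cx_s, cy_s = cx * scale, cy * scale

    # A center crop only shifts the principal point by the crop offset.
    cx_c = cx_s - (resized_w - target) / 2
    cy_c = cy_s - (resized_h - target) / 2

    pixel = (fx_s, fy_s, cx_c, cy_c)
    # Normalized form: divide by the final image width and height.
    normalized = (fx_s / target, fy_s / target, cx_c / target, cy_c / target)
    return pixel, normalized


if __name__ == "__main__":
    # Example values chosen for illustration only.
    pixel, normalized = rescale_intrinsics(500.0, 500.0, 320.0, 256.0, 640, 512)
    print(pixel)
    print(normalized)
```

If the images are squashed to 256x256 without cropping instead, the x and y axes scale independently, and normalized intrinsics stay unchanged.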

Shahid1Malik commented 1 month ago

@TQTQliu Can you explain how you used those images? I have some images with camera parameters in COLMAP format as well as in the JSON format that Gaussian Splatting and NeRF code accepts. How should I convert them to the format that this model uses?

TQTQliu commented 1 month ago

@Shahid1Malik Hello, since COLMAP and pixelSplat's code use the same coordinate convention (OpenCV style), you can simply modify the intrinsic and extrinsic parameter matrices in the dataset file.

MVSGaussian provides an example for the COLMAP data format that you can refer to. I hope it is helpful.
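As a rough illustration of the extrinsics part: COLMAP stores each image pose as a world-to-camera quaternion (QW, QX, QY, QZ) plus a translation. The sketch below builds the 4x4 world-to-camera matrix and inverts it to get camera-to-world. Whether the dataset file wants world-to-camera or camera-to-world is an assumption to verify against the README and the loader. The function names are hypothetical and not part of pixelSplat or MVSGaussian.

```python
import numpy as np


def qvec_to_rotmat(qvec):
    """Convert a COLMAP quaternion (qw, qx, qy, qz) to a 3x3 rotation matrix."""
    qw, qx, qy, qz = np.asarray(qvec, dtype=np.float64) / np.linalg.norm(qvec)
    return np.array([
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qw * qz), 2 * (qx * qz + qw * qy)],
        [2 * (qx * qy + qw * qz), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qw * qx)],
        [2 * (qx * qz - qw * qy), 2 * (qy * qz + qw * qx), 1 - 2 * (qx * qx + qy * qy)],
    ])


def colmap_pose_to_matrices(qvec, tvec):
    """Return (w2c, c2w) 4x4 matrices for one COLMAP image pose."""
    w2c = np.eye(4)
    w2c[:3, :3] = qvec_to_rotmat(qvec)
    w2c[:3, 3] = np.asarray(tvec, dtype=np.float64)
    # Camera-to-world is the inverse of the world-to-camera transform.
    c2w = np.linalg.inv(w2c)
    return w2c, c2w


if __name__ == "__main__":
    # Illustrative pose: 90 degree rotation about z, translation along x.
    q = (np.sqrt(0.5), 0.0, 0.0, np.sqrt(0.5))
    w2c, c2w = colmap_pose_to_matrices(q, (1.0, 0.0, 0.0))
    print(np.round(c2w, 6))
```

Because both conventions are OpenCV style (x right, y down, z forward), no axis flip is applied here. Data coming from an OpenGL-style pipeline would need one.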

Shahid1Malik commented 1 month ago

@TQTQliu I have an images folder and the corresponding KRTs saved in txt files. Could you share the code to convert them to the required (MVSplat/pixelSplat) format here?

Shahid1Malik commented 1 month ago

@dcharatan @TQTQliu I also have another point of confusion that I would like your help with. Do I need to convert my dataset into .torch files? I don't have any URLs like the RealEstate10K dataset does. I have a simple image folder and KRTs in txt format. How do I use this dataset?
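For readers in the same position, here is a minimal sketch of one possible first step: turning a K/R/t text file into a RealEstate10K-style camera vector. It rests on several assumptions. The KRT layout (nine values of K followed by twelve values of a world-to-camera [R|t]) is hypothetical, since the thread does not specify it. The 18-value layout (normalized fx, fy, cx, cy, two unused distortion slots, then the row-major 3x4 world-to-camera pose) follows the public RealEstate10K camera format. That the pixelSplat loader consumes exactly this layout should be verified in the repository's dataset code. Packing images and cameras into .torch chunks is not shown.

```python
import numpy as np


def parse_krt(text):
    """Parse a hypothetical KRT file: 9 values of K, then 12 values of [R|t]."""
    values = np.array(text.split(), dtype=np.float64)
    if values.size != 21:
        raise ValueError(f"expected 21 numbers (K + [R|t]), got {values.size}")
    K = values[:9].reshape(3, 3)    # pixel-space intrinsics
    Rt = values[9:].reshape(3, 4)   # assumed world-to-camera, OpenCV style
    return K, Rt


def to_re10k_camera_vector(K, Rt, width, height):
    """Flatten K and [R|t] into an 18-value RealEstate10K-style camera vector."""
    fx, fy = K[0, 0] / width, K[1, 1] / height   # normalized focal lengths
    cx, cy = K[0, 2] / width, K[1, 2] / height   # normalized principal point
    head = np.array([fx, fy, cx, cy, 0.0, 0.0])  # two unused distortion slots
    return np.concatenate([head, Rt.reshape(-1)]).astype(np.float32)


if __name__ == "__main__":
    # Illustrative file contents, not real data.
    sample = """
    500 0 320
    0 500 256
    0 0 1
    1 0 0 1
    0 1 0 2
    0 0 1 3
    """
    K, Rt = parse_krt(sample)
    print(to_re10k_camera_vector(K, Rt, width=640, height=512))
```

If your txt files store camera-to-world poses instead, invert them before flattening.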