Open SuzyZhao0417 opened 4 months ago
Hi @SuzyZhao0417 , currently our method mainly targets human-centered scenes. It needs ground-truth (GT) depth for training, and it is not straightforward to synthesize training data with GT depth for background regions. We have tried training GPS-Gaussian without depth supervision, which enables more general scenarios (with background) such as the LLFF and ENeRF outdoor datasets. With some modifications and additional regularization terms, it works: thanks to the fully differentiable framework, our method achieves competitive rendering quality even with a degenerate geometry proxy rebuilt from unsupervised depth estimation. However, this degrades the quality of the live demo, and I think more effort is needed to improve the performance.
If I want to test on LLFF data, what format do I need to convert the data to?
You could format the LLFF dataset like our validation set in the rendering data. Select two horizontal source views as 0.jpg (left) and 1.jpg (right) to form the two-view stereo pair. For the novel views, name the desired camera parameters like our example data (e.g. 2_extrinsic.npy, 2_intrinsic.npy, 3_extrinsic.npy, 3_intrinsic.npy, ...). The extrinsics are in W2C (world-to-camera) format.
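A minimal sketch of laying out the camera parameters for one novel view in that naming scheme. The file names (2_intrinsic.npy, 2_extrinsic.npy) come from the example above; the matrix shapes (3x3 intrinsics, 3x4 world-to-camera extrinsics) and the placeholder values are my assumptions, so check the repo's example data for the exact shapes:

```python
import os
import tempfile

import numpy as np

# Output folder standing in for the repo's validation-data directory.
out_dir = tempfile.mkdtemp()

# The two horizontal source views would be copied in as 0.jpg (left)
# and 1.jpg (right); here we only write the camera parameters for a
# novel view with index 2.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])  # pinhole intrinsics (fx, fy, cx, cy) -- placeholder values

# World-to-camera extrinsic [R | t]; identity rotation, zero translation
# as a placeholder.
E = np.hstack([np.eye(3), np.zeros((3, 1))])

np.save(os.path.join(out_dir, "2_intrinsic.npy"), K)
np.save(os.path.join(out_dir, "2_extrinsic.npy"), E)
```

Repeat the same pattern (3_intrinsic.npy, 3_extrinsic.npy, ...) for each additional novel view.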
@ShunyuanZheng did you understand the format of the extrinsic params? W2C?
Yes, the provided extrinsics are in world-to-camera format, in OpenCV coordinates.
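Since LLFF stores camera-to-world poses, converting them to the world-to-camera format mentioned above means inverting each pose. A minimal sketch of that inversion; note that any axis reshuffling needed to reach the OpenCV convention (x right, y down, z forward) is dataset-specific and not handled here:

```python
import numpy as np


def c2w_to_w2c(c2w: np.ndarray) -> np.ndarray:
    """Invert a 4x4 camera-to-world pose into a world-to-camera pose.

    For a rigid transform [R | t], the inverse is [R^T | -R^T t],
    which is cheaper and more numerically stable than a general
    matrix inverse.
    """
    R, t = c2w[:3, :3], c2w[:3, 3]
    w2c = np.eye(4)
    w2c[:3, :3] = R.T          # inverse rotation
    w2c[:3, 3] = -R.T @ t      # inverse translation
    return w2c
```

The resulting 4x4 matrix (or its top 3x4 block, depending on what the repo's example data uses) would then be saved as N_extrinsic.npy.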
Hello! Thanks for your excellent work! But I have some questions about the dataset. Does the method only apply to humans captured in ring-camera scenes, or can it also be used on LLFF data? As mentioned in the paper, the proposed method needs to select the left and right cameras as source views and relies on the results of binocular depth estimation. Looking forward to your reply! Best wishes!