YuelangX / Gaussian-Head-Avatar

[CVPR 2024] Official repository for "Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians"

Gaussian Head training loading images with padding/offset #28

Closed: CscotfordMV closed this issue 3 weeks ago

CscotfordMV commented 1 month ago

I've managed to train the MeshHead on some custom data and it worked really well; the results looked great. However, when I then try to train the GaussianHead model, all of the saved results (and I presume the images being loaded in) have a large amount of black padding along the top and left side, so the image content is shifted significantly. I believe I have edited the configs correctly (very little was needed, as I started with the same resolution as the mini_demo_dataset: 2080x2080). I should add that the code works perfectly well with the mini_demo_dataset you provide, so the issue isn't caused by any changes of mine. Where is this padding coming from?

[Screenshot attached: "Screenshot 2024-05-26 at 10 26 03"]
YuelangX commented 1 month ago

Since the code of the Gaussian renderer is copied from 3D Gaussian Splatting, I cropped the image and modified the intrinsics in lib.dataset.Dataset.GaussianDataset so that the camera principal point lies at the center of the image. I plan to remove this unnecessary operation in the next few days.
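For readers hitting the same offset: below is a minimal sketch (not the repository's actual code; the function name and crop policy are illustrative) of the kind of crop-and-shift the dataset loader performs, showing why the intrinsics must be updated whenever the image is cropped around the principal point.

```python
import numpy as np

def crop_to_center_principal_point(image, K):
    """Crop an image so the principal point (cx, cy) lands at the exact
    center of the crop, and shift the intrinsics to match.

    image: H x W x C numpy array
    K:     3 x 3 intrinsic matrix with (cx, cy) in the last column
    """
    H, W = image.shape[:2]
    cx, cy = K[0, 2], K[1, 2]

    # The largest centered crop keeps min(cx, W - cx) pixels on each
    # side of cx horizontally, and likewise vertically around cy.
    half_w = int(min(cx, W - cx))
    half_h = int(min(cy, H - cy))
    x0, x1 = int(cx) - half_w, int(cx) + half_w
    y0, y1 = int(cy) - half_h, int(cy) + half_h
    cropped = image[y0:y1, x0:x1]

    # Shift the principal point by the crop offset; it now sits at the
    # center of the cropped image.
    K_new = K.copy()
    K_new[0, 2] = cx - x0
    K_new[1, 2] = cy - y0
    return cropped, K_new
```

If the principal point of custom data sits far from the image center, this kind of operation discards (or, depending on the implementation, pads) a correspondingly large border, which would explain the black region in the screenshot above.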

NikoBele1 commented 1 month ago

@CscotfordMV How did you extract the camera parameters for your custom data? When I train on my custom data, with camera parameters extracted via RealityCapture, the train_meshhead step fails with the error: `Calculated padded input size per channel: (0). Kernel size: (1). Kernel size can't be greater than actual input size`. All my input sizes are the same as in the demo dataset, so I suspect the camera parameters are the problem and cause issues downstream. When I plot my vertices from subject/params/*/vertices.npy, their orientation looks wrong compared to the demo data.
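For anyone wanting to reproduce this orientation check, here is a minimal sketch of a side-by-side vertex plot (the file paths are hypothetical; point them at one frame of your data and one frame of the mini demo dataset):

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical paths: substitute one frame from your subject and one
# from the mini demo dataset.
mine = np.load('mysubject/params/0000/vertices.npy').reshape(-1, 3)
demo = np.load('mini_demo_dataset/subject/params/0000/vertices.npy').reshape(-1, 3)

fig = plt.figure(figsize=(10, 5))
for i, (verts, title) in enumerate([(mine, 'custom data'), (demo, 'demo data')]):
    ax = fig.add_subplot(1, 2, i + 1, projection='3d')
    ax.scatter(verts[:, 0], verts[:, 1], verts[:, 2], s=0.5)
    ax.set_title(title)
    ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
plt.show()
```

If the two point clouds differ by a consistent axis flip or rotation, that points to a coordinate-convention mismatch rather than bad calibration.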

YuelangX commented 1 month ago

@NikoBele1 I use the camera parameters provided by the NeRSemble dataset. I guess they are calibrated with a checkerboard using OpenCV functions. If the camera parameters look wrong, the cause may be inconsistent coordinate systems.
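One common source of such inconsistency is that tools differ in camera-axis convention and in whether they export camera-to-world or world-to-camera transforms. A minimal sketch of the kind of conversion that is often needed (the function names are illustrative; check which convention each tool actually uses):

```python
import numpy as np

# Flips the camera's y and z axes, converting an OpenGL-style camera
# frame (x right, y up, z backward) to the OpenCV-style frame
# (x right, y down, z forward) that checkerboard calibration typically uses.
FLIP_YZ = np.diag([1.0, -1.0, -1.0, 1.0])

def opengl_to_opencv_pose(c2w_gl):
    """c2w_gl: 4x4 camera-to-world pose; returns the OpenCV-convention pose."""
    return c2w_gl @ FLIP_YZ

def world_to_camera(c2w):
    """Invert a camera-to-world pose into the world-to-camera extrinsic [R|t]."""
    w2c = np.linalg.inv(c2w)
    return w2c[:3, :3], w2c[:3, 3]
```

Verifying these two choices against the demo dataset's parameters is usually enough to track down an orientation mismatch like the one described above.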