VamosC / CapHuman

[CVPR2024] CapHuman: Capture Your Moments in Parallel Universes
https://caphuman.github.io

About full body generation and side-view face generation #5

Open silence-tang opened 5 months ago

silence-tang commented 5 months ago

Hi, can CapHuman generate full-body images (from head to feet) with the help of an OpenPose ControlNet? Another question: the face of the generated human always seems to look straight ahead, so can CapHuman generate side-view faces? If it can't, is there any way to achieve this?

VamosC commented 1 week ago
  1. CapHuman can generate full-body images with the help of an OpenPose ControlNet, but at the 512x512 resolution limit the face occupies only a small part of the frame, so facial details may be unsatisfactory. To address this, it is recommended to train at larger, taller resolutions (e.g., 512x768 or 768x1024) or to use a two-stage process: low-resolution generation followed by high-resolution refinement.

  2. For side-view faces, CapHuman's current bias toward frontal views is likely due to the distribution of the CelebA dataset and to the coupling of ID extraction with the generation targets during training. To generate side-view faces, one can expand the dataset with multi-angle images and decouple ID extraction from generation. Alternatively, fine-tuning CapHuman on such enhanced data could be explored.
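
To make the resolution point in item 1 concrete, here is a rough back-of-the-envelope sketch of how many pixels the face gets in a full-body image. It is not part of CapHuman: the function name `face_height_px` and both default values (the body spanning about 90% of the image height, and the head being about 1/7.5 of the body height) are illustrative assumptions.

```python
# Rough estimate of the face height (in pixels) in a full-body image.
# ASSUMPTIONS (illustrative, not from CapHuman):
#   - the body spans ~90% of the image height (body_fraction)
#   - the head is ~1/7.5 of the body height (heads_per_body)

def face_height_px(image_height: int,
                   body_fraction: float = 0.9,
                   heads_per_body: float = 7.5) -> float:
    """Return the approximate head height in pixels for a full-body shot."""
    return image_height * body_fraction / heads_per_body


if __name__ == "__main__":
    # The resolutions mentioned in the reply above (width x height).
    for width, height in [(512, 512), (512, 768), (768, 1024)]:
        print(f"{width}x{height}: face ~{face_height_px(height):.0f} px tall")
```

Under these assumptions the face is only about 61 px tall at 512x512, compared with about 123 px at 768x1024, which is consistent with why taller resolutions or a second refinement stage would help facial detail.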