Open silence-tang opened 5 months ago
CapHuman can generate full-body images with the help of an OpenPose ControlNet, but at 512x512 resolution the face occupies only a small share of the frame, so facial details may be unsatisfactory. To address this, consider training at higher resolutions with portrait aspect ratios (e.g., 512x768 or 768x1024), or using a two-stage process: low-resolution generation followed by high-resolution refinement.
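To make the resolution argument concrete, here is a rough arithmetic sketch in Python. It is not CapHuman code. The face fraction of 1/8 of the image height is an illustrative assumption for a head-to-feet shot, the helper names `face_height_px` and `refine_size` are hypothetical, and snapping sizes to a multiple of 8 assumes a Stable Diffusion style backbone.

```python
from typing import Tuple


def face_height_px(image_height: int, face_fraction: float) -> int:
    """Approximate face height in pixels for a given image height."""
    return round(image_height * face_fraction)


def refine_size(width: int, height: int, scale: float, multiple: int = 8) -> Tuple[int, int]:
    """Scale (width, height) for a second refinement pass and snap each
    side down to a multiple of `multiple` (assumed backbone constraint)."""
    def snap(value: int) -> int:
        return max(multiple, int(value * scale) // multiple * multiple)

    return snap(width), snap(height)


# Assumption: in a full-body image the face spans about 1/8 of the height.
FACE_FRACTION = 0.125

for w, h in [(512, 512), (512, 768), (768, 1024)]:
    print(f"{w}x{h}: face is roughly {face_height_px(h, FACE_FRACTION)} px tall")

# Target size for a 2x refinement pass over a 512x768 first-stage image.
print(refine_size(512, 768, 2.0))
```

Under that assumption the face is about 64 px tall at 512x512 and about 128 px tall at 768x1024, which is why either a taller training resolution or a second refinement pass helps recover facial detail.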
For side-view faces, CapHuman's current bias toward frontal views is likely due to the pose distribution of the CelebA dataset and to the coupling of ID extraction with the generation target during training. To generate side-view faces, one can expand the dataset with multi-angle images and decouple ID extraction from generation. Alternatively, fine-tuning CapHuman on such enhanced data could be explored.
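One way to check whether a training set is skewed toward frontal views is to bucket per-image head yaw angles. The sketch below is only an illustration of that idea. It assumes yaw values in degrees have already been produced by a head-pose estimator of your choice, the 15 and 45 degree thresholds are arbitrary, the sample values are made up, and `yaw_bucket` and `view_distribution` are hypothetical names.

```python
from collections import Counter
from typing import Dict, Sequence

VIEW_NAMES = ("frontal", "three-quarter", "profile")


def yaw_bucket(yaw_deg: float) -> str:
    """Map a head yaw angle in degrees to a coarse view label.
    The 15 and 45 degree cut-offs are illustrative assumptions."""
    magnitude = abs(yaw_deg)
    if magnitude < 15:
        return "frontal"
    if magnitude < 45:
        return "three-quarter"
    return "profile"


def view_distribution(yaws: Sequence[float]) -> Dict[str, float]:
    """Return the fraction of images that fall into each view bucket."""
    counts = Counter(yaw_bucket(y) for y in yaws)
    total = len(yaws) or 1  # avoid division by zero on an empty list
    return {name: counts.get(name, 0) / total for name in VIEW_NAMES}


# Made-up yaw angles standing in for estimator output on a dataset.
sample_yaws = [0, -5, 8, 12, -3, 30, -50, 2]
print(view_distribution(sample_yaws))
```

If the profile share is very small, adding multi-angle images until the buckets are more balanced is one plausible way to prepare data for fine-tuning.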
Hi, may I ask whether CapHuman can generate full-body (head-to-feet) images with the help of an OpenPose ControlNet? Another question: the face of the generated human always seems to look straight ahead, so can CapHuman generate side-view faces? If it can't, is there any way to achieve this?