Closed by AmitMY 5 months ago
The command used to generate the video outputs there is:

```bash
pose_to_video --type=controlnet --model=sign/sd-controlnet-mediapipe \
  --pose=assets/testing-reduced.pose \
  --video=assets/outputs/controlnet-animatediff.mp4 \
  --processors animatediff
```
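The `--pose` input is a binary `.pose` file; a minimal sketch of inspecting one with the `pose-format` library (assuming that format, as used across the sign-language-processing projects):

```python
# Hedged sketch: inspect the .pose input passed via --pose, using the
# pose-format library (an assumption that this is the file format in use).
from pose_format import Pose

with open("assets/testing-reduced.pose", "rb") as f:
    pose = Pose.read(f.read())

print(pose.header.components[0].name)  # e.g. a MediaPipe Holistic component
print(pose.body.data.shape)            # (frames, people, points, dims)
```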
Yes, it is based on MediaPipe Holistic.
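A minimal sketch of how those holistic keypoints could be rendered into skeleton frames with the `pose-format` library (whether `pose_to_video` conditions the diffusion model on exactly these renderings is an assumption):

```python
# Hedged sketch: render MediaPipe Holistic keypoints from a .pose file into
# skeleton frames with pose-format's visualizer. That pose_to_video uses
# precisely this rendering as the ControlNet condition is an assumption.
from pose_format import Pose
from pose_format.pose_visualizer import PoseVisualizer

with open("assets/testing-reduced.pose", "rb") as f:
    pose = Pose.read(f.read())

visualizer = PoseVisualizer(pose)
# draw() yields one rendered frame per pose frame; save_video writes them out.
visualizer.save_video("skeleton.mp4", visualizer.draw())
```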
> If I may ask, which scripts in pose-to-video do you use for the diffusion models? Do they take the holistic bones as input? And how do you know what the expected output of the diffusion model should be?
Originally posted by @florianbaer in https://github.com/sign-language-processing/spoken-to-signed-translation/issues/26#issuecomment-1999039371
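For anyone following along: a minimal sketch of how a ControlNet pipeline typically takes such skeleton renderings as conditioning and produces the RGB frame, using the standard `diffusers` API (the base model, prompt, and input file here are assumptions, not necessarily what `pose_to_video` does internally):

```python
# Hedged sketch: a ControlNet diffusion pipeline consuming a rendered
# skeleton frame (standard diffusers API; the base model, prompt, and
# file names are assumptions).
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from PIL import Image

controlnet = ControlNetModel.from_pretrained(
    "sign/sd-controlnet-mediapipe",  # model id from the command above
    torch_dtype=torch.float16,
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed base model
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# The rendered holistic skeleton is the conditioning image; the expected
# output is the corresponding photorealistic RGB frame.
skeleton = Image.open("skeleton_frame.png")  # hypothetical rendered frame
frame = pipe("a person signing", image=skeleton,
             num_inference_steps=20).images[0]
frame.save("generated_frame.png")
```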