ChenFengYe / motion-latent-diffusion

[CVPR 2023] Executing your Commands via Motion Diffusion in Latent Space, a fast and high-quality motion diffusion model
https://chenxin.tech/mld/
MIT License
586 stars 55 forks

Demo.py Frame Rate and Number of Generated Samples #8

Open mmdrahmani opened 1 year ago

mmdrahmani commented 1 year ago

Hi I have two questions.

1- Is it possible to know the default frame rate used for generating the actions? I can see that the frame rate is defined in multiple places, but when I visualize the generated movements using OpenCV, the actions seem slower than normal. I have tried multiple frame rates for rendering in OpenCV, and it seems that 35 Hz gives a normal action speed.

2- Is it possible to define the number of samples to be generated per action? For example, I'd like to generate 100 samples of 'a person kneels.' Currently I have created an action_list.txt file with 100 lines of the text 'a person kneels.', and demo.py generates 100 separate .npy files. Instead, I would like to generate a single tensor of size [num_sample, num_frames, num_joints, xyz]. Is this possible?

Thank you for your support. Mohammad

52PengUin commented 1 year ago

Hi @mmdrahmani

  1. By default, the rendered video frame rate is 20.0, controlled by cfg.RENDER.FPS in configs/render_mld.yaml. We think the default setting is more suitable for inspecting the motion quality in detail. But if you find the rendered videos too slow, you can modify the config below. https://github.com/ChenFengYe/motion-latent-diffusion/blob/30708c0d077f2232482b4579a5931a2024b021c8/configs/render_mld.yaml#L15
  2. Thanks for your advice. We recently added 2 new flags for the demo. --replication makes it possible to generate motions for the same input texts multiple times. --allinone allows you to store all generated motions in a single npy file with the shape [num_samples, num_replication, num_frames, num_joints, xyz]. You can use the flags like this:
    python demo.py --cfg ./configs/config_mld_humanml3d.yaml --cfg_assets ./configs/assets.yaml --example ./demo/example.txt --replication 100 --allinone
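As a rough illustration of how such a combined file could be consumed, the sketch below builds a placeholder array with the shape described above and slices out all replications of one prompt. The file name all_motions.npy, the joint count of 22, and the other dimension sizes are assumptions for illustration only, not values taken from demo.py; check the actual output path and shape in your results folder.

```python
import os
import tempfile

import numpy as np

# Assumed sizes, for illustration only: 1 text prompt, 100 replications,
# 196 frames, 22 joints, 3 coordinates (x, y, z).
num_samples, num_replication, num_frames, num_joints = 1, 100, 196, 22

# Stand-in for the file written with --allinone (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "all_motions.npy")
placeholder = np.zeros(
    (num_samples, num_replication, num_frames, num_joints, 3), dtype=np.float32
)
np.save(path, placeholder)

# Load the combined array and index it along the documented axes.
motions = np.load(path)
kneel = motions[0]   # all replications of the first prompt
single = kneel[0]    # one generated motion: [num_frames, num_joints, xyz]

print(kneel.shape, single.shape)
```

Indexing the first axis selects a text prompt and drops that axis, which leaves the [num_sample, num_frames, num_joints, xyz] layout asked for in the original question.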

mmdrahmani commented 1 year ago

Thank you for the reply. Excellent! The new flags are very useful. Regarding the frame rate, my question was about the frame rate of the .npy files that are returned by running demo.py (npy files saved in the results folder). I defined 196 frames (MAX_LEN), but I can't figure out the appropriate frame rate at which these 196 frames are sampled. Thanks for the support. Mohammad

mmdrahmani commented 1 year ago

Hi @52PengUin Any ideas about the duration in seconds of the generated 196 frames saved in .npy file? Is it 10 seconds? In other words, is the frame-rate of 196 frames saved in .npy file ~20Hz (196/10)?

52PengUin commented 1 year ago

Hi @mmdrahmani

Since npy is just a format for storing data, the .npy file itself carries no frame-rate information. The motion saved in it is supposed to have the same frame rate as the motion data that was sampled and processed.

According to the official repo of the HumanML3D dataset, the frame rate is 20.0 frames per second.

Each motion clip in HumanML3D comes with 3-4 single sentence descriptions annotated on Amazon Mechanical Turk. Motions are downsampled into 20 fps, with each clip lasting from 2 to 10 seconds.

Therefore, the duration in seconds of the generated 196 frames is 9.8s (196/20).
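
The same arithmetic applies to any clip length, and it also gives the per-frame delay needed to replay the joints at real-time speed in a viewer such as OpenCV. A minimal sketch, assuming the 20 fps figure quoted above applies to the generated motion; the helper names are made up for illustration:

```python
HUMANML3D_FPS = 20.0  # frame rate quoted from the HumanML3D description


def clip_duration_seconds(num_frames, fps=HUMANML3D_FPS):
    """Duration of a clip with num_frames frames sampled at fps."""
    return num_frames / fps


def frame_delay_ms(fps=HUMANML3D_FPS):
    """Delay between frames for real-time playback, in whole milliseconds.

    This is the kind of value one would pass to cv2.waitKey in a display loop.
    """
    return round(1000.0 / fps)


print(clip_duration_seconds(196))  # 9.8
print(frame_delay_ms())            # 50
```

A playback loop that waits noticeably longer than 50 ms per frame, including drawing time, would make the motion look slower than intended.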