mingyuan-zhang / MotionDiffuse

MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
https://mingyuan-zhang.github.io/projects/MotionDiffuse.html

Question on number of frames #17

Open n12iki opened 1 year ago

n12iki commented 1 year ago

Hi, I noticed that generating an animation is limited to 196 frames. Is this just so that it returns a result quickly? If it is a limitation of the current model, would it be possible to train a model that handles more frames? I had a quick look but couldn't find a limit anywhere in the training code.

Also, am I correct in understanding that the dim_pose variable is the number of unique poses in the dataset? Thanks.

mingyuan-zhang commented 1 year ago

dim_pose is the dimension of each frame, which includes the 3D coordinates, velocity, and rotation vector of each joint. The length limit of 196 is hard-coded here.

Currently, most sequences in HumanML3D are shorter than 196 frames. If you want to train the model on your own dataset, you can change this limit in both tool/train.py and datasets/dataset.py.
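To make the two quantities above concrete, here is a minimal sketch of how a motion clip could be forced to a fixed number of frames while the per-frame dimension stays unchanged. It is not the repository's code: `pad_or_crop`, `MAX_MOTION_LENGTH`, and `DIM_POSE` are hypothetical names, and the value 263 for the per-frame size is an assumption that is not stated in this thread.

```python
MAX_MOTION_LENGTH = 196  # frame limit quoted in this thread
DIM_POSE = 263  # assumption: per-frame feature size, not stated in this thread


def pad_or_crop(motion, max_len=MAX_MOTION_LENGTH):
    """Force a motion (a list of frames) to exactly max_len frames.

    Each frame is a list of dim_pose floats. Longer clips are cropped,
    shorter clips are zero-padded. Returns the fixed-length motion and
    the number of real (unpadded) frames.
    """
    if not motion:
        raise ValueError("motion must contain at least one frame")
    dim_pose = len(motion[0])
    length = min(len(motion), max_len)
    # Copy the frames that fit within the limit.
    fixed = [list(frame) for frame in motion[:length]]
    # Fill the remainder with all-zero frames of the same dimension.
    fixed += [[0.0] * dim_pose for _ in range(max_len - length)]
    return fixed, length


short_clip = [[1.0] * DIM_POSE for _ in range(120)]
long_clip = [[1.0] * DIM_POSE for _ in range(300)]

short_fixed, short_len = pad_or_crop(short_clip)
long_fixed, long_len = pad_or_crop(long_clip)
```

Under these assumptions, raising the limit for a custom dataset would only change `max_len`; the per-frame dimension depends on the pose representation, not on the clip length.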

If you just want to generate a longer sequence with the HumanML3D dataset, some tricks from video diffusion models can support that. I will update it in a few days.

zhuangzhuang000 commented 1 year ago

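dim_pose is the dimension of each frame, which includes the 3D coordinates, velocity, and rotation vector of each joint. The length limit of 196 is hard-coded here.

Currently, most sequences in HumanML3D are shorter than 196 frames. If you want to train the model on your own dataset, you can change this limit in tool/train.py and datasets/dataset.py.

If you just want to generate a longer sequence with the HumanML3D dataset, some tricks from video diffusion models can support that. I will update it in a few days.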

Hello, your work is really excellent! Could you provide me with the code in the section on "Time-varied Controlling" in your paper?