jihoonerd / Conditional-Motion-In-Betweening

🕹️ Official Implementation of Conditional Motion In-betweening (CMIB) 🏃
https://jihoonerd.github.io/Conditional-Motion-In-Betweening/

Shaking near start and target when training on my own dataset? #49

Closed miaoYuanyuan closed 2 years ago

miaoYuanyuan commented 2 years ago

When I trained on my own data for 175 epochs, I found that the joints in the result sequence suddenly shake near the start and target poses. I want to know how I can reduce this phenomenon.

holyhao commented 2 years ago

@miaoYuanyuan Hi, I trained the model on LAFAN and it takes about 8 min/epoch, so training the full 5000 epochs would take about a month. I wonder if there is something wrong with my settings. Additionally, PyTorch 1.8 is used.

jihoonerd commented 2 years ago

@miaoYuanyuan TL;DR: the easy remedy is further training; the more fundamental one is to use a 6D rotation representation.

I also experienced the sudden flip during training. In most cases, simply training further solved the issue. I believe it comes from discontinuity in the quaternion representation, since the phenomenon was greatly suppressed in the 6D rotation setting.
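
For reference, here is a minimal sketch of the 6D rotation representation (Zhou et al., 2019) mentioned above, written in plain NumPy rather than with the repo's own conversion utilities: a rotation is encoded as the first two columns of its rotation matrix and recovered via Gram-Schmidt orthogonalization. Unlike quaternions, where q and -q encode the same rotation (the double cover that can cause sudden flips), this mapping is continuous.

```python
# Minimal 6D rotation representation sketch (Zhou et al., 2019).
# Assumes unit quaternions in (w, x, y, z) order; adjust if your data differs.
import numpy as np

def quat_to_rotmat(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def rotmat_to_6d(R):
    """The 6D representation is simply the first two columns of R."""
    return R[:, :2].reshape(-1)

def six_d_to_rotmat(d6):
    """Recover a valid rotation matrix from 6D via Gram-Schmidt."""
    a1, a2 = d6[:3], d6[3:]
    b1 = a1 / np.linalg.norm(a1)
    a2 = a2 - np.dot(b1, a2) * b1      # remove the component along b1
    b2 = a2 / np.linalg.norm(a2)
    b3 = np.cross(b1, b2)              # third column from the cross product
    return np.stack([b1, b2, b3], axis=1)
```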

jihoonerd commented 2 years ago

@holyhao You don't need to train the entire 5,000 epochs. Around 1,000 epochs was usually sufficient in my case. I used distributed training to reduce training time (it took 3~4 days). If you want to use the model without training, please consider using the pretrained weights in README.md.

holyhao commented 2 years ago

@jihoonerd Thanks for your reply. I set the epoch count to 1000 and the batch size to 1280 to reduce training time. The lr is set to 0.0001*40. The position loss only decreases from 2.66 to 2.60, and the rotation loss only decreases from 0.75 to 0.63. The condition loss looks normal, going from 0.99 to about zero. Are the position and rotation losses correct?

jihoonerd commented 2 years ago

@holyhao Changing the batch size and lr may affect training. The condition loss getting to nearly zero at convergence is fine. The scale depends on the number of frames you are using, but the position and condition losses tend to have far lower values than the rotation loss. However, it is difficult to guarantee end quality from loss values alone. I would recommend visualizing the results.

miaoYuanyuan commented 2 years ago

Thank you for your reply. I will train for a longer time. Another question: I used your provided pretrained model to test an animation that does not belong to the LAFAN data, and I find the speed around the start and target is not coherent, while results on the LAFAN dataset perform well. So I guess the pretrained model is overfit to the LAFAN dataset? Or is the reason that speed is not considered in this work?

jihoonerd commented 2 years ago

@miaoYuanyuan The training pipeline is targeted at LAFAN in terms of scale. This model takes only positional and rotational values in the global coordinate system, so it does not consider joint velocity. However, LAFAN's scale matches real-world scale. This means that if the starting and target poses are set to be feasible, the pretrained model should work.

miaoYuanyuan commented 2 years ago

Yes, I retargeted my own data to the LAFAN skeleton and chose arbitrary start and target poses, so I think scale is not the problem. Regarding joint velocity, how should I judge whether the start and target poses are feasible or not? I think if I give them enough distance for the in-between, the model should learn how to adjust the gait and speed.

jihoonerd commented 2 years ago

@miaoYuanyuan Figuring out feasible poses is difficult, and that's the reason why I picked poses from the test dataset, which are guaranteed to be feasible. I used the Unity engine to manually find a proper distance and poses. And that's a good point. I guess the 80-frame in-betweening setting will have trouble if given a very short start-target distance, and the 30-frame setting will likewise struggle with a long start-target distance. So, trying different in-betweening lengths might help.

miaoYuanyuan commented 2 years ago

@jihoonerd The suitable distance may be related to the training data distribution, as well as the real-world situation. I will try it. Thank you very much!

holyhao commented 2 years ago

@jihoonerd Hi, another question: the results of run_cmib.py are JSON. How can I convert them to BVH?

jihoonerd commented 2 years ago

@holyhao The output JSONs contain local quaternion lists, so you can convert them to BVH manually. I think replacing the motion part of a LAFAN1 BVH file with the predicted values will work. I visualized the output in README.md with Unity, so I don't have a BVH converter for those JSONs.
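
A hedged sketch of that idea follows: read the predicted local quaternions from the JSON, convert them to Euler angles in BVH channel order, and splice them into the MOTION block of an existing LAFAN1 BVH. The JSON key names ("local_quaternion", "root_position") and file names here are assumptions for illustration, not the script's documented schema; adjust them to your actual output.

```python
# Sketch: splice predicted rotations into a LAFAN1 BVH template.
# Assumes quaternions in (w, x, y, z) order and Zrotation/Yrotation/Xrotation
# BVH channels; both are assumptions to verify against your data.
import json
import numpy as np
from scipy.spatial.transform import Rotation as R

def quats_to_bvh_eulers(quats_wxyz):
    """Per-joint quaternions (w, x, y, z) -> Euler degrees in Z, Y, X order."""
    xyzw = np.roll(np.asarray(quats_wxyz), -1, axis=-1)  # SciPy wants (x, y, z, w)
    return R.from_quat(xyzw.reshape(-1, 4)).as_euler("ZYX", degrees=True)

with open("cmib_output.json") as f:            # hypothetical output file
    data = json.load(f)

frames = []
for quats, root_pos in zip(data["local_quaternion"], data["root_position"]):
    eulers = quats_to_bvh_eulers(quats)         # shape: (num_joints, 3)
    # One BVH motion line: root translation, then all joint rotations.
    frames.append(" ".join(f"{v:.6f}" for v in [*root_pos, *eulers.ravel()]))

# Reuse the HIERARCHY section from any LAFAN1 BVH and replace its MOTION block.
with open("lafan1_template.bvh") as f:          # hypothetical template path
    header = f.read().split("MOTION")[0]

with open("predicted.bvh", "w") as f:
    f.write(header + "MOTION\n")
    f.write(f"Frames: {len(frames)}\nFrame Time: 0.033333\n")  # 30 fps, as in LAFAN1
    f.write("\n".join(frames) + "\n")
```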

icedwater commented 1 month ago

> @jihoonerd Hi, another question: the results of run_cmib.py are JSON. How can I convert them to BVH?

Hi @holyhao, have you found a method to do this? I came here to ask a different question, but I saw this. I have something that kind of works, but I'll need to clean it up before sharing it.