Can style features be used to replace text features for style control?

Garfield-kh / TM2D

[ICCV 2023] TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration


Can style features be used to replace text features for style control? #18

Open mengW6 opened 1 month ago

mengW6 commented 1 month ago

Excellent work. I have a few questions:

Which script is used to compute the evaluation metrics after training train_atm_transformr_v5?
To get style-controlled music-to-dance generation, is it enough to replace the text features with extracted style features? Are any other modifications needed, and what should I watch out for when making them? I look forward to your guidance.

mengW6 commented 1 month ago

Hello, which script was used to generate the metric results marked in the above image?

Garfield-kh commented 2 weeks ago

Hi, please download tm2d_60fps.zip from [google drive]. It contains \tm2d_60fps.zip\tm2d_60fps\eval4bailando\evaluate_music2dance_aistpp.py, which computes the metrics.

Garfield-kh commented 2 weeks ago

For "style control", I think the signal should be imposed globally on the dance features, whereas in this model the signal is added in locally.
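One possible reading of the global/local distinction, as a minimal sketch: a global signal is applied to every frame of the dance features, while a local one only touches a limited span of frames. The function names, the additive conditioning, and the toy features below are assumptions for illustration. They are not taken from the TM2D code.

```python
from typing import List

Vector = List[float]
Sequence = List[Vector]  # one feature vector per frame


def condition_globally(dance: Sequence, style: Vector) -> Sequence:
    """Add the same style vector to every frame of the dance features."""
    return [[d + s for d, s in zip(frame, style)] for frame in dance]


def condition_locally(dance: Sequence, style: Vector, start: int, end: int) -> Sequence:
    """Add the style vector only to frames with index in [start, end)."""
    return [
        [d + s for d, s in zip(frame, style)] if start <= t < end else list(frame)
        for t, frame in enumerate(dance)
    ]


# Toy data (hypothetical): 3 frames with 2 feature dimensions each.
toy_dance = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
toy_style = [10.0, 20.0]

global_out = condition_globally(toy_dance, toy_style)
local_out = condition_locally(toy_dance, toy_style, start=1, end=2)
```

In the global variant every frame carries the style signal, so it can shape the whole sequence. In the local variant the frames outside the span are unchanged.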

mengW6 commented 2 weeks ago

> Hi, please download tm2d_60fps.zip from [google drive]. It contains \tm2d_60fps.zip\tm2d_60fps\eval4bailando\evaluate_music2dance_aistpp.py, which computes the metrics.

I followed the steps and code you provided for testing, but I couldn't reproduce the numbers reported in the paper. Which epoch was used for the results reported in the paper?

Also, the evaluation code you provided uses epoch 36. Why was epoch 36 chosen for evaluating music-to-dance generation? Specifically, what criteria were used to select epoch 36 as the reported model?

I evaluated the epoch 36 model with \tm2d_60fps.zip\tm2d_60fps\eval4bailando\evaluate_music2dance_aistpp.py, but the results were different each time. What should I do in this situation? Can you give me some guidance? Thanks!

Garfield-kh commented 2 weeks ago

The FID test is not very stable (sometimes FIDk is high, FIDg is low), and it doesn’t align well with the visualized performance (see Fig. 5 of the EDGE paper for reference). Therefore, we chose an epoch that looked acceptable in both aspects.
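As a rough illustration of picking an epoch that is acceptable on both metrics when single runs are noisy, the sketch below averages repeated FIDk/FIDg evaluations per epoch and picks the epoch whose worse rank across the two metrics is lowest. This is not the selection procedure used for the paper, which also relied on visual inspection. The function names and all scores are hypothetical placeholders.

```python
from statistics import mean
from typing import Dict, List, Tuple

# epoch -> list of (FID_k, FID_g) pairs, one pair per repeated evaluation run.
Runs = Dict[int, List[Tuple[float, float]]]


def summarize(runs: Runs) -> Dict[int, Tuple[float, float]]:
    """Average FID_k and FID_g over the repeated evaluations of each epoch."""
    return {
        epoch: (mean(k for k, _ in scores), mean(g for _, g in scores))
        for epoch, scores in runs.items()
    }


def pick_balanced_epoch(runs: Runs) -> int:
    """Pick the epoch whose worse rank across the two metrics is lowest."""
    means = summarize(runs)
    epochs = sorted(means)
    # Rank 0 is the best (lowest) mean score for that metric.
    rank_k = {e: r for r, e in enumerate(sorted(epochs, key=lambda e: means[e][0]))}
    rank_g = {e: r for r, e in enumerate(sorted(epochs, key=lambda e: means[e][1]))}
    # Ties on the worse rank are broken by the rank sum, then by epoch number.
    return min(
        epochs,
        key=lambda e: (max(rank_k[e], rank_g[e]), rank_k[e] + rank_g[e], e),
    )


# Placeholder numbers for illustration only, NOT results from the paper.
example_runs: Runs = {
    10: [(30.0, 8.0), (32.0, 8.4)],
    20: [(24.0, 11.0), (26.0, 11.4)],
    30: [(20.0, 15.0), (22.0, 15.6)],
}

chosen = pick_balanced_epoch(example_runs)
```

With these placeholder scores, epoch 30 is best on FIDk and epoch 10 is best on FIDg, so the rule settles on epoch 20 as the compromise.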

I provided a checkpoint, but it’s from a training run conducted on an external computer, so its metrics are only approximately similar to those in the paper. The specific checkpoint used in the paper is on an internal network and cannot be exported.

As a further suggestion, you might consider whether EDGE’s metrics are suitable, though I noticed while testing EDGE that it still had issues with foot skating.