Hi,
I have noticed that there may be a bug in your modified evaluation code as follows.
https://github.com/GuyTevet/motion-diffusion-model/blob/af061ca7c7077fb144c0094a5a72932b967647b6/data_loaders/humanml/motion_loaders/comp_v6_model_dataset.py#L214
Because the tokens are all padded, if you use `len(tokens[bs_i])` to obtain the `cap_len`, then all sentence lengths will be `max_text_len=20` + 2. This will influence the language feature extraction for computing metrics.

The following code is the original code in HumanML3D, which uses the right token length:
https://github.com/GuyTevet/motion-diffusion-model/blob/af061ca7c7077fb144c0094a5a72932b967647b6/data_loaders/humanml/motion_loaders/comp_v6_model_dataset.py#L100
I think this bug may lead to a performance drop in MatchingScore, R-Precision, and so on.
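To illustrate the difference, here is a minimal sketch. The `'unk/OTHER'` pad token and the exact token format are assumptions based on HumanML3D's word/POS tokenization, not code taken from the repo:

```python
MAX_TEXT_LEN = 20  # value used in the evaluation code

def cap_len_padded(tokens):
    # Buggy variant: after padding, every sample has
    # max_text_len + 2 tokens (sos/eos included), so this
    # always returns the same constant.
    return len(tokens)

def cap_len_real(tokens, pad_token='unk/OTHER'):
    # Mirrors the original HumanML3D logic: count only the
    # real tokens, excluding the padding entries.
    return sum(1 for t in tokens if t != pad_token)

# Hypothetical example sentence in word/POS format.
sent = ['sos/OTHER', 'a/DET', 'person/NOUN', 'walks/VERB', 'eos/OTHER']
padded = sent + ['unk/OTHER'] * (MAX_TEXT_LEN + 2 - len(sent))

print(cap_len_padded(padded))  # 22, regardless of the sentence
print(cap_len_real(padded))    # 5, the true length
```

With the padded length, the text encoder is fed a constant sentence length for every caption, which distorts the extracted language features.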