EricGuo5513 / momask-codes

Official implementation of "MoMask: Generative Masked Modeling of 3D Human Motions (CVPR2024)"
https://ericguo5513.github.io/momask/
MIT License
690 stars 56 forks

Select latest.tar or net_best_fid.tar? #47

Open fanqinshan0114 opened 1 month ago

fanqinshan0114 commented 1 month ago

I want to know why the Masked Transformer checkpoint selected by default when evaluating text-to-motion generation is latest.tar rather than net_best_fid.tar, while RVQ and the Residual Transformer use net_best_fid.tar. Which one did you actually use in your evaluation?

Murrol commented 1 month ago

Hi,

Good question. Intuitively, we select the model with the best FID on the validation set to get the best motion quality. For the mask transformer, ideally we would want to weigh the metrics comprehensively, including R-Precision. Picking the latest checkpoint is simply the easy option, and empirically it gives good performance.

We evaluated the method using the checkpoints and scripts we shared. For details, the epoch id is recorded in the checkpoint (key: ‘ep’).
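For readers who want to check which epoch a given checkpoint comes from, here is a minimal sketch. It assumes the .tar files were written with `torch.save` as a plain dictionary; only the key `ep` is confirmed in this thread. The helper names and the example path are hypothetical and not part of the repository.

```python
from pathlib import Path


def checkpoint_epoch(state):
    """Return the epoch stored under the 'ep' key, or None if it is absent."""
    return state.get("ep")


def load_checkpoint(path):
    """Load a checkpoint onto the CPU (assumes it was written by torch.save)."""
    import torch  # imported lazily so checkpoint_epoch works without PyTorch

    return torch.load(Path(path), map_location="cpu")


def report_epochs(paths):
    """Map each checkpoint path to the epoch recorded inside it."""
    return {str(p): checkpoint_epoch(load_checkpoint(p)) for p in paths}
```

Calling `report_epochs` with the paths to both `latest.tar` and `net_best_fid.tar` of the same model (wherever they are stored locally) shows how far apart the two checkpoints are in training.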

fanqinshan0114 commented 1 month ago

If I want to train the MoMask base model, i.e. without the residual transformer, should I still choose latest.tar for the masked transformer when evaluating?