EricGuo5513 / momask-codes

Official implementation of "MoMask: Generative Masked Modeling of 3D Human Motions (CVPR2024)"
https://ericguo5513.github.io/momask/
MIT License

Select latest.tar or net_best_fid.tar? #47

fanqinshan0114 closed this issue 3 weeks ago

fanqinshan0114 commented 6 months ago

Why does the Masked Transformer default to latest.tar when evaluating text2motion generation, while the RVQ and the Residual Transformer default to net_best_fid.tar? Which checkpoint did you actually use in your evaluation?

Murrol commented 6 months ago

Hi,

Good question. Intuitively, we select the model with the best FID on the validation set to get the best motion quality. For the masked transformer, we would ideally weigh several metrics together, such as R-Precision. Picking the latest checkpoint is simply the easy option, and empirically it gives good performance.

We evaluated the method using the checkpoints and scripts we shared. For details, the epoch id is recorded in the checkpoint (key: ‘ep’).
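
To check which epoch a given checkpoint comes from, something like the following sketch may help. It assumes the checkpoint has already been loaded into a dictionary (for example with `torch.load(path, map_location="cpu")`, where `path` points at your own `latest.tar` or `net_best_fid.tar`). The helper name `checkpoint_epoch` and the epoch values are made up for illustration; only the `ep` key comes from the checkpoints themselves.

```python
from typing import Any, Mapping


def checkpoint_epoch(ckpt: Mapping[str, Any]) -> int:
    """Return the epoch id stored under the 'ep' key of a loaded checkpoint.

    `ckpt` is assumed to be the dictionary obtained by loading a .tar
    checkpoint; this helper is hypothetical and not part of the repository.
    """
    if "ep" not in ckpt:
        # List the available keys so a mismatched file is easy to diagnose.
        raise KeyError(f"no 'ep' key; available keys: {sorted(ckpt)}")
    return int(ckpt["ep"])


# Stand-in dictionaries with invented epoch values, used here in place of
# real checkpoints so the snippet runs without any model files.
latest = {"ep": 500}
best_fid = {"ep": 340}

print("latest.tar epoch:", checkpoint_epoch(latest))
print("net_best_fid.tar epoch:", checkpoint_epoch(best_fid))
```

Comparing the two epoch ids shows how far apart the latest and best-FID checkpoints are for a given training run.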

fanqinshan0114 commented 6 months ago

If I want to train the MoMask base model, i.e. without the residual transformer, should I still choose latest.tar for the masked transformer when evaluating?