Closed fanqinshan0114 closed 3 weeks ago
Hi,
Good question. Intuitively, we select the model with the best FID on the validation set to get the best motion quality. For the mask transformer, ideally, we want to consider the metrics comprehensively, including R-Precision. Picking the latest checkpoint is simple and easy, and empirically it gives good performance.
We evaluated the method using the checkpoints and scripts we shared. For details, the epoch id is recorded in the checkpoint (key: ‘ep’).
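For anyone who wants to check which epoch a given checkpoint comes from, here is a minimal sketch. It assumes the checkpoint is a regular PyTorch file that loads into a dict, as the `ep` key mentioned above suggests. The helper names `read_epoch` and `load_epoch` are made up for illustration and are not part of the repo.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any, Mapping


def read_epoch(checkpoint: Mapping[str, Any], key: str = "ep") -> int:
    """Return the epoch id stored in an already-loaded checkpoint dict."""
    if key not in checkpoint:
        # List the keys that are present to make a wrong key easy to spot.
        raise KeyError(f"no {key!r} entry; available keys: {sorted(checkpoint)}")
    return int(checkpoint[key])


def load_epoch(path: str | Path) -> int:
    """Load a checkpoint file with PyTorch and return its epoch id.

    Assumption: the file is loadable with torch.load and yields a dict.
    """
    import torch  # imported lazily so read_epoch works without PyTorch

    checkpoint = torch.load(Path(path), map_location="cpu")
    return read_epoch(checkpoint)
```

Calling `load_epoch("path/to/latest.tar")` and `load_epoch("path/to/net_best_fid.tar")` (placeholder paths) would then show how far apart the two checkpoints are in training.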
If I want to train the MoMask base model, i.e. without the residual transformer, should I still choose latest.tar for the masked transformer when evaluating?
Why does the Text2Motion generation evaluation default to latest.tar for the Masked Transformer instead of net_best_fid.tar, while the RVQ and Residual Transformer default to net_best_fid.tar? Which checkpoints did you actually use in your evaluation?