Is the text encoder in TMR fine-tuned using the BERT model, or is it retrained using a transformer?

IDEA-Research / HumanTOMATO

[ICML 2024] 🍅HumanTOMATO: Text-aligned Whole-body Motion Generation

https://lhchen.top/HumanTOMATO


Is the text encoder in TMR fine-tuned using the BERT model, or is it retrained using a transformer? #19

Closed by caochengchen 3 weeks ago

LinghaoChan commented 3 weeks ago

It is a pre-trained model + transformer layers architecture. Please refer to the technical details in the appendix.

caochengchen commented 3 weeks ago

> It is a pre-trained model + transformer layers architecture. Please refer to the technical details in the appendix.

I couldn't find which text encoder TMR uses in the appendix. I only saw that you mentioned sBERT in the paper. Is the text encoder in TMR fine-tuned from the BERT model, or is it a transformer trained from scratch?

LinghaoChan commented 3 weeks ago

@caochengchen I stated that it is extended from the text encoder of TEMOS. TEMOS uses DistilBERT + transformer layers as the text encoder.

caochengchen commented 3 weeks ago

> @caochengchen I stated that it is extended from the text encoder of TEMOS. TEMOS uses DistilBERT + transformer layers as the text encoder.

Thank you very much
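
For readers landing on this thread, the "pre-trained model + transformer layers" layout described above can be sketched roughly as follows. This is a minimal illustration in PyTorch, not the TMR or TEMOS code: the class name `TextEncoderSketch`, the single learnable summary token, the layer sizes, and the choice to freeze the pre-trained weights are all assumptions made for the example (the thread does not say whether the backbone is frozen). The backbone is passed in as a module so the sketch runs without downloading weights; in practice it could be a thin wrapper that returns DistilBERT's last hidden state (hidden size 768).

```python
import torch
from torch import nn


class TextEncoderSketch(nn.Module):
    """Pre-trained backbone followed by trainable transformer layers (illustrative only)."""

    def __init__(self, backbone: nn.Module, backbone_dim: int,
                 latent_dim: int = 256, num_layers: int = 2, num_heads: int = 4):
        super().__init__()
        self.backbone = backbone
        # Assumption: keep the pre-trained weights frozen; the thread does not specify this.
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Project backbone features to the width used by the extra transformer layers.
        self.proj = nn.Linear(backbone_dim, latent_dim)
        layer = nn.TransformerEncoderLayer(d_model=latent_dim, nhead=num_heads,
                                           batch_first=True)
        self.layers = nn.TransformerEncoder(layer, num_layers=num_layers)
        # Hypothetical learnable token whose output serves as the sentence embedding.
        self.summary_token = nn.Parameter(torch.randn(1, 1, latent_dim))

    def forward(self, token_ids: torch.Tensor, pad_mask: torch.Tensor) -> torch.Tensor:
        # pad_mask: bool tensor of shape (B, T), True where a position is padding.
        with torch.no_grad():
            feats = self.backbone(token_ids)           # (B, T, backbone_dim)
        x = self.proj(feats)                           # (B, T, latent_dim)
        tok = self.summary_token.expand(x.size(0), -1, -1)
        x = torch.cat([tok, x], dim=1)                 # prepend the summary token
        # The summary token is never masked out.
        tok_mask = pad_mask.new_zeros((pad_mask.size(0), 1))
        mask = torch.cat([tok_mask, pad_mask], dim=1)  # (B, T + 1)
        out = self.layers(x, src_key_padding_mask=mask)
        return out[:, 0]                               # (B, latent_dim)
```

To try it without any pre-trained weights, an `nn.Embedding(1000, 768)` can stand in for the backbone: it maps token ids to 768-dimensional features, which is all the sketch requires. Only `proj`, `layers`, and `summary_token` receive gradients in this setup.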