Attention matrix shape - Githubissues

ZcyMonkey / AttT2M

Code of ICCV 2023 paper: "AttT2M: Text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism"

Apache License 2.0

43 stars, 3 forks

Great work, and thanks for sharing. The paper says that the shape of the adjacency matrix is (n+1)*(n+1), which should be (22, 22): 21 for the joints and 1 for foot contact. However, in your implementation, in the models.encdec.Encoder_spationl class, the matrix's shape is (28, 28). Also, after that, some random vectors are added to the encoding (in the Abstract_Transformer class). Can you please clarify these issues, especially the first one? Thanks.

As mentioned in our paper, the human body is divided into 5 body parts, which is the "nparts" in your screenshot. The extra dimensions in the attention mask are there to extract the features of these body parts.
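To make the shape arithmetic concrete, here is a minimal sketch of how extra rows and columns for body-part entries enlarge a square attention mask. It is not the repository's code. The function name build_part_mask, the count of 23 node entries (inferred only from 28 - 5, not confirmed in this thread), and the grouping of nodes into parts are all assumptions made for illustration.

```python
def build_part_mask(n_nodes, parts, edges=()):
    """Return a square boolean mask of size n_nodes + len(parts).

    Hypothetical helper: each body part gets one extra row/column,
    and the part's row is allowed to attend to its member nodes.
    """
    size = n_nodes + len(parts)
    mask = [[False] * size for _ in range(size)]

    # Every entry may attend to itself.
    for i in range(size):
        mask[i][i] = True

    # Optional symmetric adjacency between node entries.
    for a, b in edges:
        mask[a][b] = mask[b][a] = True

    # Each part entry attends to the nodes assigned to it.
    for p, members in enumerate(parts):
        row = n_nodes + p
        for j in members:
            mask[row][j] = True

    return mask


# Assumption: 23 node entries, chosen only so that 23 + 5 = 28.
N_NODES = 23

# Placeholder grouping into 5 parts (contiguous index ranges),
# NOT the real skeleton partition used in the paper.
PARTS = [
    list(range(0, 5)),
    list(range(5, 10)),
    list(range(10, 14)),
    list(range(14, 18)),
    list(range(18, 23)),
]

mask = build_part_mask(N_NODES, PARTS)
print(len(mask), len(mask[0]))  # 28 28
```

Under these assumptions, the 5 extra rows act as aggregation slots, one per body part. Note that the 22 entries counted in the question plus 5 parts would give 27 rather than 28, so the exact node count should be checked against the code.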


Attention matrix shape #6