The correspondence between the codebook and the codebook_embedding

Hi,

Thank you for the publication and the released demo. For motion generation, the key ingredient seems to be the correspondence between the codebook and the codebook_embedding. However, when I checked your code, I found that the CLIP features of the poses decoded from the codebook do not match the corresponding entries of the codebook_embedding. From Fig. 8 of the paper, I understand that the CLIP feature of a pose is the sum of the CLIP features of several rendered views of that pose. Could you describe in more detail how the codebook and the codebook_embedding are computed? If you could release the code for extracting the codebook_embedding, I would be very grateful.
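To make the question concrete, here is a minimal sketch of how I read Fig. 8 and of the check I ran. It is only my guess, not your implementation: the function names `aggregate_view_features` and `cosine_to_embedding`, the number of views, the 512-dimensional feature size, and the normalization before and after the sum are all my assumptions, and random vectors stand in for the real per-view CLIP features.

```python
import numpy as np


def aggregate_view_features(view_feats):
    """Combine per-view features of one pose into a single pose feature.

    Assumption: each view feature is L2-normalized, the views are summed,
    and the sum is L2-normalized again. The paper may do this differently.
    view_feats has shape (num_views, feature_dim).
    """
    view_feats = np.asarray(view_feats, dtype=np.float64)
    unit = view_feats / np.linalg.norm(view_feats, axis=1, keepdims=True)
    summed = unit.sum(axis=0)
    return summed / np.linalg.norm(summed)


def cosine_to_embedding(pose_feat, embedding):
    """Cosine similarity between a unit-norm pose feature and a stored embedding."""
    embedding = np.asarray(embedding, dtype=np.float64)
    return float(pose_feat @ (embedding / np.linalg.norm(embedding)))


# Placeholder data: 5 hypothetical views with 512-dim features each.
rng = np.random.default_rng(0)
view_feats = rng.normal(size=(5, 512))
pose_feat = aggregate_view_features(view_feats)

# Stand-in for one row of codebook_embedding; here it is just a rescaled
# copy of pose_feat, so the similarity should be close to 1.
stored_embedding = 3.0 * pose_feat
similarity = cosine_to_embedding(pose_feat, stored_embedding)
print(similarity)
```

With the real data, I would expect the similarity between each aggregated pose feature and its codebook_embedding row to be close to 1 if this is the right procedure, which is not what I observe.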

Thank you in advance.

Best wishes, Jack

hongfz16 / AvatarCLIP

The correspondence between the codebook and the codebook_embedding #7