Gengzigang / PCT

This is an official implementation of our CVPR 2023 paper "Human Pose as Compositional Tokens" (https://arxiv.org/pdf/2303.11638.pdf)
MIT License

Tokens to be integers #38

Open Tommy-Hsu opened 5 months ago

Tommy-Hsu commented 5 months ago

Hello, I would like to ask about the meaning of the tokens being integers. I noticed that the final forward pass through the tokenizer uses the cls_logits_softmax tensor and directly multiplies it with the codebook, and these operations are all in floating point. So what does it mean for the tokens to be integers in the classifier stage?
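For context, here is a minimal sketch of the soft path I am describing. The shapes and variable names are illustrative only, not the repo's exact code:

```python
import torch

# Hypothetical shapes, for illustration only.
num_tokens, num_classes, dim = 34, 2048, 64

cls_logits = torch.randn(num_tokens, num_classes)  # class head output
codebook = torch.randn(num_classes, dim)           # learned codebook entries

# Soft selection: a probability-weighted mixture of codebook entries,
# so the result stays floating-point end to end -- no integer token
# index appears anywhere in this path.
cls_logits_softmax = cls_logits.softmax(dim=-1)    # (num_tokens, num_classes)
token_features = cls_logits_softmax @ codebook     # (num_tokens, dim), float
```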

ndsl555 commented 5 months ago

I'd like to know too

KunmingS commented 4 months ago

It seems the integer tokens only appear during stage I training. I think it is the variable encoding_indices in this line: https://github.com/Gengzigang/PCT/blob/main/models/pct_tokenizer.py#L142

Tommy-Hsu commented 4 months ago

> It seems the integer tokens only appear during stage I training. I think it is the variable encoding_indices in this line: https://github.com/Gengzigang/PCT/blob/main/models/pct_tokenizer.py#L142

That's true. In stage 1, the encoding_indices would be integers but not in stage 2.
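Here is a minimal sketch of the stage 1 hard-assignment path as I understand it, assuming nearest-neighbor lookup against the codebook (shapes and names are illustrative, not the repo's exact code):

```python
import torch

# Hypothetical shapes, for illustration only.
num_tokens, num_classes, dim = 34, 2048, 64

encode_feat = torch.randn(num_tokens, dim)  # encoder output per token
codebook = torch.randn(num_classes, dim)    # learned codebook entries

# Nearest-entry lookup: distance to every codebook entry, then argmin
# gives the integer token index (this is what encoding_indices holds).
distances = torch.cdist(encode_feat, codebook) ** 2  # (num_tokens, num_classes)
encoding_indices = distances.argmin(dim=-1)          # (num_tokens,), torch.long

# The quantized features are an exact row lookup, equivalent to
# one_hot(encoding_indices) @ codebook.
quantized = codebook[encoding_indices]               # (num_tokens, dim)
```

So the "tokens" are integers only here, where each token is a hard index into the codebook; in stage 2 the classifier replaces the argmin with a softmax-weighted mixture, which is why everything there is floating-point.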

Tommy-Hsu commented 4 months ago

(screenshot: Figure 1 from the paper)

Figure 1 is quite confusing to me. In the inference stage, the class head output should be logits, and the codebook entries are floating-point. However, the figure depicts both as integers, which is what I find puzzling.