lucidrains / titok-pytorch

Implementation of TiTok, proposed by ByteDance in "An Image is Worth 32 Tokens for Reconstruction and Generation"
MIT License

Quantized latents are not used at all? #1

Closed · inspirit closed this issue 2 months ago

inspirit commented 2 months ago

Hi Phil, are you sure we should be passing the original latents to the decoder instead of the quantized ones? https://github.com/lucidrains/titok-pytorch/blob/8e56054a9306edcab4643f5c6279ca4061057581/titok_pytorch/titok.py#L185

I looked through the paper, and they decode using the mask tokens + the quantized latents.
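
Roughly, here is what I'd expect the decode path to look like. Just a sketch with made-up names (`decode_from_latents`, `quantizer`, `decoder`, `mask_tokens`), not your actual titok.py code, and assuming a vector-quantize-pytorch style quantizer return:

```python
import torch

def decode_from_latents(latents, quantizer, decoder, mask_tokens):
    # quantize the 1D latent tokens first (the VQ bottleneck)
    # assumption: quantizer returns (quantized, indices, aux_loss) like vector-quantize-pytorch
    quantized, indices, aux_loss = quantizer(latents)

    # the decoder should only ever see the *quantized* latents,
    # concatenated with the learned mask tokens that stand in for the image patches
    batch = quantized.shape[0]
    masks = mask_tokens.expand(batch, -1, -1)  # assumption: mask_tokens is (1, num_patches, dim)
    decoder_input = torch.cat((masks, quantized), dim=1)

    return decoder(decoder_input)
```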

lucidrains commented 2 months ago

@inspirit hey! thanks as usual 😄