AILab-CVC / SEED

Official implementation of SEED-LLaMA (ICLR 2024).
https://ailab-cvc.github.io/seed

About add the quantized image tokens to pretrained language tokenizer. #21

Open Jiushanhuadao opened 5 months ago

Jiushanhuadao commented 5 months ago

I checked the prediction code and the paper. It seems you added the quantized image tokens to the pretrained language tokenizer. In other papers, some people keep separate tokenizers for language and images, and the image features are concatenated with the language embeddings through a linear layer. Have you tried this method?
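The alternative the question describes can be sketched as follows. This is a minimal, hypothetical illustration (random weights, made-up dimensions), not code from SEED or any cited paper: image features from a separate vision encoder are mapped into the LLM's hidden size by a linear layer, then concatenated with the text embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

d_img, d_model = 16, 32   # image feature dim, LLM hidden dim (illustrative)
n_img, n_text = 3, 5      # number of image features and text tokens

img_feats = rng.standard_normal((n_img, d_img))       # from a vision encoder
text_embeds = rng.standard_normal((n_text, d_model))  # from the LLM embedding table

# Linear projection into the LLM embedding space (learned in practice).
W = rng.standard_normal((d_img, d_model))
b = np.zeros(d_model)
img_embeds = img_feats @ W + b

# The projected image embeddings and text embeddings form one input sequence.
inputs = np.concatenate([img_embeds, text_embeds], axis=0)
print(inputs.shape)  # (8, 32)
```

Here the image side never enters the tokenizer's vocabulary; the LLM consumes continuous embeddings directly.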

geyuying commented 4 months ago

We added the quantized image tokens to the pretrained language tokenizer to unify the representation of image and text tokens, and the LLM is trained to optimize the visual embeddings. We did not try the latter method.