Official implementation of SEED-LLaMA (ICLR 2024).

More explanation about pilot experiments #11

Closed by zheedong 7 months ago

zheedong commented 7 months ago


I would like to know more details about 'pilot experiments' in SEED v1.

"We conduct two experiments to respectively align discrete representations of VQ-VAE and Beit v2 with OPT2.7B [19] model on CC3M [20] dataset."

Here, what do you mean by 'align'? Do you fine-tune OPT 2.7B on VQ-VAE tokens, or do you mean an adapter between VQ-VAE and OPT? It would be great if you could share more details.

Thank you.

geyuying commented 7 months ago

We freeze OPT 2.7B and train only a projection layer, which takes the discrete representations of VQ-VAE and BEiT v2 as inputs, using a captioning loss. Specifically, the discrete representations are fed into the projection layer, and its outputs serve as the inputs to OPT 2.7B.
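
For readers who want something concrete, here is a minimal PyTorch sketch of the setup described above. It is not the authors' code. `TinyFrozenLM` is a hypothetical stand-in for OPT 2.7B so that the example runs without downloading weights, and the codebook size, dimensions, and the embedding-plus-linear form of the projection layer are all assumptions made for illustration. The thread only states that the language model is frozen, that the projection layer is trained, and that the loss is a captioning loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# All sizes below are illustrative assumptions, not values from the paper.
CODEBOOK_SIZE = 8192  # number of discrete visual codes
CODE_DIM = 32         # width of the code embedding inside the projector
LM_DIM = 64           # stand-in for the language model's hidden size
VOCAB = 100           # stand-in for the text vocabulary size


class TinyFrozenLM(nn.Module):
    """Hypothetical stand-in for a frozen causal language model (not OPT)."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, LM_DIM)
        self.rnn = nn.GRU(LM_DIM, LM_DIM, batch_first=True)  # causal by construction
        self.head = nn.Linear(LM_DIM, VOCAB)

    def forward(self, inputs_embeds):
        hidden, _ = self.rnn(inputs_embeds)
        return self.head(hidden)  # (batch, seq, VOCAB)


def caption_loss(lm, projector, image_codes, caption_ids):
    """Next-token loss on caption tokens, conditioned on projected visual codes."""
    visual = projector(image_codes)            # (B, N, LM_DIM)
    text = lm.embed(caption_ids)               # (B, T, LM_DIM)
    logits = lm(torch.cat([visual, text], 1))  # (B, N + T, VOCAB)
    n = image_codes.size(1)
    # The output at position n-1+i predicts caption token i; image positions are not scored.
    pred = logits[:, n - 1 : -1, :]
    return F.cross_entropy(pred.reshape(-1, VOCAB), caption_ids.reshape(-1))


lm = TinyFrozenLM()
for p in lm.parameters():
    p.requires_grad_(False)  # freeze the language model

# Assumed projector: look up each discrete code, then map it to the LM input width.
projector = nn.Sequential(nn.Embedding(CODEBOOK_SIZE, CODE_DIM), nn.Linear(CODE_DIM, LM_DIM))
optimizer = torch.optim.AdamW(projector.parameters(), lr=1e-3)

# Random placeholder batch: 2 images of 32 codes each, 2 captions of 12 tokens each.
image_codes = torch.randint(0, CODEBOOK_SIZE, (2, 32))
caption_ids = torch.randint(0, VOCAB, (2, 12))

loss = caption_loss(lm, projector, image_codes, caption_ids)
loss.backward()   # gradients reach only the projector
optimizer.step()
```

With a real Hugging Face OPT checkpoint, the concatenated embeddings would be passed through the model's `inputs_embeds` argument instead of token ids. The rest of the loop (frozen LM, optimizer over the projector only, loss on caption positions) would stay the same under these assumptions.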