TencentARC / GVT

Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".
Apache License 2.0

More details about tuning the visual tokenizer? #8

Open YAOYI626 opened 1 year ago

YAOYI626 commented 1 year ago

Thanks authors for the insightful work!

I'd like to understand more about how the visual tokenizer was tuned. Would you mind explaining what kind of dataset was used to train your visual tokenizer?

It would be super helpful to us! Thanks in advance.

daoyuan98 commented 1 year ago

Hi, thank you for your interest in our work!

For visual tokenizer distillation, we followed the protocol in FD and performed feature distillation on the ImageNet-1K dataset.
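
For readers unfamiliar with feature distillation, here is a minimal sketch of the general idea. It is not the GVT training code. It assumes, based on our reading of FD, that the frozen teacher's per-token features are whitened (layer normalization without learnable parameters) and that the student is trained to match them under a smooth-L1 loss. The function names `whiten`, `smooth_l1`, and `distillation_loss` are hypothetical, and plain Python lists stand in for the feature tensors a real setup would compute from ImageNet-1K images.

```python
import math

def whiten(features, eps=1e-6):
    """Normalize one token's feature vector to zero mean and unit variance
    (layer normalization without a learnable scale or bias)."""
    mean = sum(features) / len(features)
    var = sum((x - mean) ** 2 for x in features) / len(features)
    return [(x - mean) / math.sqrt(var + eps) for x in features]

def smooth_l1(pred, target, beta=1.0):
    """Mean smooth-L1 distance between two equal-length vectors."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        # Quadratic near zero, linear for large errors.
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total / len(pred)

def distillation_loss(student_tokens, teacher_tokens):
    """Average per-token loss between student features and whitened teacher
    features. Assumption: both are lists of equal-length feature vectors."""
    losses = [smooth_l1(s, whiten(t)) for s, t in zip(student_tokens, teacher_tokens)]
    return sum(losses) / len(losses)

# Toy example: one image with two tokens, four feature channels each.
teacher = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.1, -0.3, 0.9]]
student = [[0.0, 0.0, 0.0, 0.0], [0.1, 0.2, 0.3, 0.4]]
loss = distillation_loss(student, teacher)
```

In an actual run, the student would be updated by gradient descent on this loss while the teacher stays frozen; the exact teacher, student architecture, and hyperparameters are specified in the paper and in FD, not here.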