Open lucasjinreal opened 3 weeks ago
Hi, it looks like MQT still trains with a maximum token count, say 256, and at inference any token count up to that maximum can be chosen. But how does it compare with a Resampler that is both trained and run at inference with only, say, 64 tokens?

Thank you for your interest in our work. In our paper's Figure 1, we report a baseline model both trained and run at inference with 64 tokens. Since the query transformer and the Resampler employ similar cross-attention modules, that number should serve as a rough reference.
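To make the question concrete, here is a minimal numpy sketch of the variable-token-count idea being discussed: a cross-attention resampler with a pool of learnable queries, where inference can slice off the first k queries Matryoshka-style to emit k visual tokens. This is a hypothetical single-head illustration (no learned projections, no multi-head attention), not the actual MQT or Resampler implementation; all names and dimensions below are assumptions for the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class MatryoshkaQueryPool:
    """Hypothetical sketch: cross-attention resampler with a sliceable query pool.

    Trained with `max_queries` learnable queries; at inference the first
    `num_tokens` queries can be used to emit fewer visual tokens.
    """
    def __init__(self, max_queries=256, dim=32, seed=0):
        rng = np.random.default_rng(seed)
        # Learnable queries (randomly initialized here; would be trained in practice).
        self.queries = rng.standard_normal((max_queries, dim)) / np.sqrt(dim)
        self.dim = dim

    def __call__(self, image_feats, num_tokens=None):
        # Slice the query pool to the requested token budget.
        q = self.queries if num_tokens is None else self.queries[:num_tokens]
        # Single-head scaled dot-product cross-attention over image features.
        attn = softmax(q @ image_feats.T / np.sqrt(self.dim))
        return attn @ image_feats  # shape: (num_tokens, dim)

mqt = MatryoshkaQueryPool(max_queries=256, dim=32)
feats = np.random.default_rng(1).standard_normal((196, 32))  # e.g. a 14x14 patch grid
out_64 = mqt(feats, num_tokens=64)   # Resampler-like 64-token budget
out_256 = mqt(feats)                 # full 256-token budget
print(out_64.shape, out_256.shape)   # (64, 32) (256, 32)
```

Note that because each query attends to the image features independently in this sketch, the first 64 outputs of the 256-token pass are identical to the 64-token pass; a fair comparison with a dedicated 64-token Resampler would therefore come down to how the queries were trained, which is what Figure 1's baseline isolates.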