Closed mehamednews closed 1 week ago
@mehamednews Good question! Yes, the token reduction can be applied to Qwen2-VL in a zero-shot manner. However, it won't perform optimally, because the Qwen2-VL weights were not trained with this approach in mind.
To address this, we plan to train the model to better adapt to this mode. We'll continue updating our repository and include this masking strategy in future iterations.
Thank you for your work (and for sharing it with us). I'm using Qwen2-VL for document question answering, and I'm wondering whether I can apply your token reduction. Does it require changing the weights, or can I just create a function that accepts a mask for the tokens to keep? I'm not very proficient in Python, so any help would be appreciated.
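For illustration only, the kind of "function that accepts a mask for the tokens to keep" described above might look roughly like the sketch below. It is not code from this repository and does not use the Qwen2-VL API: the name `keep_masked_tokens`, the placeholder token values, and the choice to return the original indices are all assumptions made for the example. In a real pipeline the tokens would be tensor rows rather than strings.

```python
from typing import List, Sequence, Tuple


def keep_masked_tokens(
    tokens: Sequence, keep_mask: Sequence[bool]
) -> Tuple[list, List[int]]:
    """Hypothetical helper: keep only the tokens whose mask entry is True.

    Returns the kept tokens together with their original indices, so the
    caller still knows where each kept token sat in the full sequence.
    """
    if len(tokens) != len(keep_mask):
        raise ValueError("tokens and keep_mask must have the same length")
    kept_indices = [i for i, keep in enumerate(keep_mask) if keep]
    kept_tokens = [tokens[i] for i in kept_indices]
    return kept_tokens, kept_indices


# Toy example: five placeholder visual tokens, three of which are kept.
visual_tokens = ["v0", "v1", "v2", "v3", "v4"]
keep_mask = [True, False, True, True, False]

kept_tokens, kept_indices = keep_masked_tokens(visual_tokens, keep_mask)
print(kept_tokens)   # ['v0', 'v2', 'v3']
print(kept_indices)  # [0, 2, 3]
```

Returning the original indices alongside the kept tokens is a design choice for this sketch: it leaves room to preserve each token's original position if the model needs it, though how that would plug into Qwen2-VL is not covered here.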