TempleX98 / MoVA

[NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context
Apache License 2.0
129 stars, 1 fork

evaluation too slow #5

Open dacian7 opened 3 months ago

dacian7 commented 3 months ago

When evaluating on VQA benchmarks, generating the answer to a single question is tens of times slower than LLaVA inference. I would expect the two to be comparable. Do you have any idea what the problem might be? Thanks.

TempleX98 commented 3 months ago

Do you pass the eos_token_id=tokenizer.eos_token_id argument to the generation function model.generate?
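To illustrate why this setting can affect speed, here is a toy greedy decoding loop. It is not MoVA's or the transformers library's code; toy_next_token, EOS_ID, and PROMPT_LEN are hypothetical stand-ins. The sketch assumes that, when no end-of-sequence id is given, decoding only stops at the max_new_tokens budget, so a short answer still costs the full number of steps.

```python
from typing import Callable, List, Optional

EOS_ID = 2       # hypothetical end-of-sequence token id
PROMPT_LEN = 2   # hypothetical prompt length used by the toy model


def toy_next_token(tokens: List[int]) -> int:
    """Stand-in for one model step: emit three answer tokens, then EOS forever."""
    return 7 if len(tokens) < PROMPT_LEN + 3 else EOS_ID


def greedy_decode(
    next_token: Callable[[List[int]], int],
    prompt: List[int],
    max_new_tokens: int,
    eos_token_id: Optional[int] = None,
) -> List[int]:
    """Generate tokens one at a time; stop early only if eos_token_id is set."""
    tokens = list(prompt)
    generated: List[int] = []
    for _ in range(max_new_tokens):
        tok = next_token(tokens)
        tokens.append(tok)
        generated.append(tok)
        if eos_token_id is not None and tok == eos_token_id:
            break  # the answer is complete, no need to spend more steps
    return generated


prompt = [1] * PROMPT_LEN
with_eos = greedy_decode(toy_next_token, prompt, 512, eos_token_id=EOS_ID)
without_eos = greedy_decode(toy_next_token, prompt, 512)
print(len(with_eos), len(without_eos))  # 4 512
```

In this toy setup the run without an EOS id takes 128 times as many decoding steps for the same answer, which is the kind of gap that would show up as per-question slowness.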

dacian7 commented 3 months ago

I just found that Image.open is extremely slow (30 s for one image), which is strange.
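One way to narrow this down is to time the raw file read separately from Pillow's header parsing and pixel decoding. Image.open is lazy and normally reads only the file header, so a 30 s call may point at slow storage (for example a network filesystem) rather than at decoding. The following is a minimal sketch under that assumption; profile_image_load and timed are hypothetical helper names, and the Pillow steps are skipped if Pillow is not installed.

```python
import io
import sys
import time
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def timed(label: str, results: Dict[str, float]) -> Iterator[None]:
    """Record the wall-clock seconds spent inside the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def profile_image_load(path: str) -> Dict[str, float]:
    """Time the raw read, then Pillow open and decode on the in-memory bytes."""
    results: Dict[str, float] = {}

    # Step 1: pure file I/O. A large value here implicates the storage layer.
    with timed("raw_read", results):
        with open(path, "rb") as f:
            data = f.read()

    try:
        from PIL import Image
    except ImportError:
        return results  # Pillow not available: report the I/O timing only

    # Step 2: header parsing only, on bytes already in memory.
    with timed("pil_open", results):
        img = Image.open(io.BytesIO(data))

    # Step 3: actual pixel decoding, which Image.open defers until load().
    with timed("pil_decode", results):
        img.load()

    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    for stage, seconds in profile_image_load(sys.argv[1]).items():
        print(f"{stage}: {seconds:.3f}s")
```

If raw_read dominates, the bottleneck is reading the file rather than Pillow, and copying the benchmark images to local disk would be a reasonable next thing to try.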
