Open icrto opened 7 months ago
You can install LLaVA from their original repo: https://github.com/haotian-liu/LLaVA Does that work for you?
Yes, that works for me. However, I have not been able to reproduce the results in the paper. Since you mentioned here that you were using another version of LLaVA, I thought that might be the cause, hence my request for the specific package versions you used.
As an example, on Fast Open-Ended MiniImageNet with LLaVA-Next-7B, 2 shots, and a detailed description, you report (in Table 47) an accuracy of 33.67 ± 2.25, while I obtain 14.0. On Operator Induction with:
(This is after I remove the truncation as mentioned in the link.)
Sorry for the late reply. I just re-ran LLaVA from their latest code and I can reproduce the reported accuracies with only marginal differences. I don't have a clear idea of why there is such a large gap in your results. I'll aim to refactor LLaVA-Next to the Hugging Face implementation soon for more stable reproduction.
Could you please provide the requirements.txt for llava?
Thanks!
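For reference, a minimal sketch of how the installed versions could be captured in a requirements-style list, so both environments can be compared line by line. It uses only the standard library `importlib.metadata`. The package names in the example list are assumptions about what is likely relevant here, not a confirmed dependency list from the paper's code.

```python
from importlib import metadata


def pinned_versions(packages):
    """Return requirements-style lines for the given distribution names.

    Installed packages become 'name==version'; missing ones are reported
    as a comment line so the gap is visible when comparing environments.
    """
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"# {name} is not installed")
    return lines


if __name__ == "__main__":
    # Hypothetical list of packages to compare; adjust to the actual environment.
    candidates = ["llava", "torch", "torchvision", "transformers", "tokenizers", "accelerate"]
    print("\n".join(pinned_versions(candidates)))
```

Running `pip freeze > requirements.txt` in the working environment would give the complete list; the sketch above only narrows the output to the packages most likely to affect the results.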