First of all: Thanks a lot for open-sourcing your work!
What this issue is about
While trying to reproduce some of the results from your paper, I noticed that the pinned version of the transformers package is quite outdated. Unfortunately, the code in its current state seems to be incompatible with up-to-date versions for two reasons: the model_type registered via the AutoModel and AutoConfig classes now conflicts with the llava type that ships with recent transformers releases, and the code relies on model-specific attention-masking utilities for BLOOM and OPT that newer releases no longer expose.
Suggested changes
rename the registered AutoModel model_type so it no longer collides with the built-in llava type (see the first sketch below)
use model-independent causal attention mask utility functions instead of the BLOOM/OPT-specific helpers (see the second sketch below)
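For the rename, a minimal sketch of what the registration could look like; the class names and the `my-llava` type string are hypothetical placeholders, the only requirement being that the `model_type` does not collide with one that transformers already registers:

```python
from torch import nn
from transformers import AutoConfig, AutoModel, PretrainedConfig, PreTrainedModel


class MyLlavaConfig(PretrainedConfig):
    # Hypothetical type string: anything works as long as it is not a
    # model_type already registered by transformers itself (e.g. "llava").
    model_type = "my-llava"

    def __init__(self, hidden_size=32, **kwargs):
        self.hidden_size = hidden_size
        super().__init__(**kwargs)


class MyLlavaModel(PreTrainedModel):
    config_class = MyLlavaConfig

    def __init__(self, config):
        super().__init__(config)
        self.proj = nn.Linear(config.hidden_size, config.hidden_size)

    def forward(self, inputs_embeds):
        return self.proj(inputs_embeds)


# Without the rename, recent transformers versions raise a ValueError here,
# because "llava" is already taken by the built-in LLaVA implementation.
AutoConfig.register("my-llava", MyLlavaConfig)
AutoModel.register(MyLlavaConfig, MyLlavaModel)
```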
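For the masking, recent transformers releases (roughly v4.35 and later) ship shared utilities in `transformers.modeling_attn_mask_utils` that could replace the BLOOM/OPT-specific helpers; a minimal sketch, assuming one of those releases:

```python
import torch
from transformers.modeling_attn_mask_utils import AttentionMaskConverter

# 2D padding mask as produced by a tokenizer: 1 = attend, 0 = padding.
attention_mask = torch.tensor([[1, 1, 1, 0]])
seq_len = attention_mask.shape[-1]

# Model-independent expansion of the 2D padding mask into the 4D additive
# causal mask that decoder attention layers expect, replacing the
# BLOOM/OPT-specific mask-building code paths.
converter = AttentionMaskConverter(is_causal=True)
causal_4d = converter.to_4d(
    attention_mask,
    query_length=seq_len,
    key_value_length=seq_len,  # past_key_values_length + query_length
    dtype=torch.float32,
)
print(causal_4d.shape)  # torch.Size([1, 1, 4, 4])
```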