Output of image_hidden_states not possible for LLaVA?

7AtAri commented 1 month ago

System Info

Ubuntu, containerized development

Who can help?

No response

Information

[ ] The official example scripts
[X] My own modified scripts

Tasks

[ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)

Reproduction

https://stackoverflow.com/questions/78631255/how-to-extract-image-hidden-states-in-llavas-transformers-huggingface-impleme

Apparently others have already run into this issue as well.

Expected behavior

The model should output image_hidden_states, but the field comes back as None.

zucchini-nlp commented 1 month ago

Hey @7AtAri! Yes, unfortunately VLMs currently don't return image_hidden_states; it has been on my TODO list. It might take me a while to make a PR for that, so if you want to give it a try, a PR is welcome :)

7AtAri commented 1 month ago

Thanks! I used a forward hook in the meantime, but having a method is definitely much more convenient...
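
For anyone landing here, the forward-hook workaround can be sketched roughly as below. This is a minimal illustration on a toy module, not the actual LLaVA code: `ToyVisionLanguageModel` and `save_image_hidden_states` are hypothetical names, and the assumption is that the real model exposes its image projector as a submodule (called `multi_modal_projector` here) that a hook can be attached to in the same way.

```python
import torch
from torch import nn


class ToyVisionLanguageModel(nn.Module):
    """Hypothetical stand-in for a VLM; not the real LLaVA architecture."""

    def __init__(self, vision_dim=8, text_dim=16):
        super().__init__()
        # Assumed attribute name for the image-to-text projector.
        self.multi_modal_projector = nn.Linear(vision_dim, text_dim)
        self.head = nn.Linear(text_dim, 4)

    def forward(self, image_features):
        projected = self.multi_modal_projector(image_features)
        # Only the final output is returned, so `projected` is not exposed.
        return self.head(projected.mean(dim=1))


captured = {}


def save_image_hidden_states(module, inputs, output):
    # Detach so the stored tensor does not keep the autograd graph alive.
    captured["image_hidden_states"] = output.detach()


model = ToyVisionLanguageModel()
handle = model.multi_modal_projector.register_forward_hook(save_image_hidden_states)

with torch.no_grad():
    # Dummy input of shape (batch, image tokens, vision_dim).
    logits = model(torch.randn(2, 5, 8))

handle.remove()  # detach the hook once the activations are captured
print(captured["image_hidden_states"].shape)
```

The hook fires on every forward call of the hooked submodule, so it is worth removing it (or clearing `captured`) between runs to avoid reading stale activations.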

zucchini-nlp commented 1 month ago

The model now returns hidden states for images after a forward pass (see linked PR 😉). However, this will not work with generate(), because generate() usually doesn't return extra model outputs that are too model-specific.

huggingface / transformers