Closed AstraliteHeart closed 1 year ago
cc @younesbelkada
Hey @AstraliteHeart π This issue seems to be a duplicate of https://github.com/huggingface/transformers/issues/21599, which is fixed.
Can I ask you to try to run your script using transformers
main
branch, i.e. after installing with pip install --upgrade git+https://github.com/huggingface/transformers.git
?
I don't think this is a duplicate, my env is past that fix (see p4 in the original repro steps), I've updated form main
to confirm as follows:
pip install --upgrade git+https://github.com/huggingface/transformers.git
Resolved https://github.com/huggingface/transformers.git to commit bb5a2f2fc30985841289207b9f1f7765d8abc4e0
python test_simple.py
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 25 but got size 5 for tensor number 1 in the list.
Thank you for confirming @AstraliteHeart π€ I will dig deeper and let you know what I find!
After some digging, we can see that the exception is raised as follows:
β /home/joao/hf/lib/python3.10/site-packages/lavis/models/blip2_models/modeling_opt.py:703 in β
β forward β
β β
β 700 β β β inputs_embeds = self.embed_tokens(input_ids) β
β 701 β β β
β 702 β β if query_embeds is not None: β
β β± 703 β β β inputs_embeds = torch.cat([query_embeds, inputs_embeds], dim=1) β
β 704 β β β input_shape = inputs_embeds.size()[:-1] β
β 705 β β β
β 706 β β # embed positions β
β°βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ―
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 25 but got size 5 for tensor number 1 in the list.
From the full stack trace, we can conclude that the error arises from an issue in lavis
, and not in transformers
:) Actually, the root cause for this issue is something that we have addressed on this PR -- lavis
has a different implementation, where they have a modified OPT model to handle the image embeddings, where we decided to update .generate()
to handle soft-prompting.
@AstraliteHeart This means you have two options:
transformers
, as opposed to lavis
. See here for examples.lavis
, so they can help you with this issue :) @gante thank you for debugging!
I can confirm that syncing before https://github.com/huggingface/transformers/pull/21405 (edc1e734bfc01109b8c66881d950ebbda032a6d2) works, I'll open an issue on SF side to warn them about the breakage, unfortunately this brings me to the original issue of trying to use convert_blip_2_original_to_pytorch.py
, perhaps you can help me figure out how the BLIP2 models were converted? (I understand, this is irrelevant to most users but only a few brave souls who are finetuning BLIP2 via LAVIS but want to then load it in HF.)
I've tried both pip install git+https://github.com/nielsrogge/LAVIS.git@fix_lavis
(mentioned in the script) and lavis
from HEAD, but I am getting this trace
$ python ./convert_blip_2_original_to_pytorch.py
Loading original model...
Position interpolate from 16x16 to 26x26
tokenizer facebook/opt-6.7b
Loading checkpoint shards: Done!
Traceback (most recent call last):
File "./convert_blip_2_original_to_pytorch.py", line 304, in <module>
convert_blip2_checkpoint(args.model_name, args.pytorch_dump_folder_path, args.push_to_hub)
File "/.../envs/lavis/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "./convert_blip_2_original_to_pytorch.py", line 216, in convert_blip2_checkpoint
original_logits = original_logits.logits
AttributeError: 'dict' object has no attribute 'logits' // indeed, this is a dictionary containing only 'loss'
what combination of versions of transformers
and lavis
was used during conversion?
Hi,
Thanks for converting BLIP2 to HF :) I actually forked the LAVIS repo and made some tweaks to facilitate conversion (I removed a bunch of unnecessary requirements etc). See here.
Hi Niels, thank you for checking this.
I did use your fork (or so I thought, sigh), but I redid everything from scratch while comparing traces with code and, well... turned out I moved my blip2 conversion script to LAVIS git root folder which kept including their model (as it's in the lavis
folder) even with your fixed one being installed (so I do apologies).
I can now confirm that with your fork I was able to convert my model with snapshot before https://github.com/huggingface/transformers/pull/21405 and load it it in 8 bits with latest bitsandbytes
keeping VRAM usage at 11.1GB (vs around 18.5GB without).
Do you have any guidance on matching outputs between lavis and hf models? I ran about 50 samples though lavis/hf16/hf8 and while hf16 and hf8 are mostly consistent (good), lavis output is better in all cases. (see anecdotal examples below)
Here is roughly how I load and run all models (https://gist.github.com/AstraliteHeart/4d7ebf834021b8e1c9bc439c1633002c) I tried to make sure all settings and rnd seeds are matching, but perhaps I am missing something?
https://derpicdn.net/img/view/2023/2/23/3051871.png
'caption_lavis': ['scootaloo, apple bloom, and applejack in a group hug scootaloo, apple bloom, and applejack are all smiling white background', 'scootaloo, applebloom, and applejack in a group hug scootaloo and applebloom are jumping applejack is smiling white background', 'scootaloo, apple bloom, and applejack in a group hug scootaloo, apple bloom, and applejack are jumping and smiling white background'],
'caption_hf_16': ['a series of images of sweetie belle, applejack, scootaloo, applebloom, rarity, pinkie pie, twilight sparkle, rarity, twilight sparkle, rarity, rarity, rarity, rarity, rarity, rarity', 'a series of images of sweetie belle, applejack, scootaloo, applebloom, rarity, pinkie pie, twilight sparkle, rarity, twilight sparkle, twilight sparkle, twilight sparkle, twilight sparkle', 'a series of images of sweetie belle, applejack, scootaloo, applebloom, rarity, pinkie pie, twilight sparkle, rarity, twilight sparkle, twilight sparkle, rarity, rarity, rarity, rarity'],
'caption_hf_8': ['a series of images of sweetie belle, applebloom, scootaloo, applejack, rarity, pinkie pie, twilight sparkle, fluttershy, rarity, pinkie pie, twilight sparkle, rarity, pink', 'a series of images of sweetie belle, applebloom, scootaloo, applejack, rarity, pinkie pie, twilight sparkle, fluttershy, rarity, pinkie pie, twilight sparkle, twilight sparkle', 'a series of images of sweetie belle, applebloom, scootaloo, applejack, rarity, pinkie pie, twilight sparkle, fluttershy, rarity, twilight sparkle, twilight sparkle, twilight sparkle']
https://derpicdn.net/img/2017/7/7/1480500/large.png
'caption_lavis': ['alicorn twilight sparkle is laying on her back with a book on her head and a book on her chest she is surrounded by books on the floor and on the walls she has a book on her head and a book on her chest she is', 'alicorn twilight sparkle is laying on her back with a book on her head and a book on her chest she is surrounded by books on the floor and on the walls she is also wearing a book on her head and a book on her chest', 'alicorn twilight sparkle is laying on her back with a book on her head and a book on her chest she is surrounded by books on the floor and on the walls she has a book on her head and a book on her chest she has'],
'caption_hf_16': ['posterior view of twilight sparkle lying on the floor surrounded by a pile of books, surrounded by a pile of books, surrounded by a pile of books, surrounded by a pile of books, surrounded by a pile of books, surrounded by', 'posterior view of twilight sparkle lying on the floor surrounded by a pile of books, surrounded by a pile of books, surrounded by a pile of books, surrounded by a pile of books, surrounded by a pile of books\n', 'posterior view of twilight sparkle lying on the floor surrounded by a pile of books, surrounded by a pile of books, surrounded by books, surrounded by books, surrounded by books, surrounded by books, surrounded by books, surrounded by books'],
'caption_hf_8': ['twilight sparkle is lying on the floor surrounded by a pile of books she is surrounded by a pile of books on the floor she is surrounded by a pile of books on the floor she is surrounded by a pile of books on the floor she','twilight sparkle is lying on the floor surrounded by a pile of books she is surrounded by a pile of books on top of her she is surrounded by a pile of books on top of her she is surrounded by a pile of books on top', 'twilight sparkle is lying on the floor surrounded by a pile of books she is surrounded by a pile of books on the floor she is surrounded by a pile of books on the floor she is surrounded by a pile of books on the floor her']
Thanks for reporting, that should not be the case! I extensively tested the greedy/beam search outputs on original vs my implementation to make sure everything works as expected.
But the generate method has had some updates now so there might be a small issue. However isn't it weird that the first token is already different? cc'ing @gante here
Also I'm not sure you can run both LAVIS and Transformers main branch in the same environment to compare, cause LAVIS relies on an older version of Transformers
Results on top are from transformers
https://gist.github.com/AstraliteHeart/4d7ebf834021b8e1c9bc439c1633002c + your fork of lavis
.
Some more tests (tldr, latest transformers still do not produce the same output)
Official lavis
repo:
['scootaloo, apple bloom, and applejack in a group hug scootaloo, apple bloom, and applejack are all smiling white background', 'scootaloo, applebloom, and applejack in a group hug scootaloo and applebloom are jumping applejack is smiling white background', 'scootaloo, apple bloom, and applejack in a group hug scootaloo, apple bloom, and applejack are jumping and smiling white background']
['alicorn twilight sparkle is laying on her back with a book on her head and a book on her chest she is surrounded by books on the floor and on the walls she has a book on her head and a book on her chest she is', 'alicorn twilight sparkle is laying on her back with a book on her head and a book on her chest she is surrounded by books on the floor and on the walls she is also wearing a book on her head and a book on her chest', 'alicorn twilight sparkle is laying on her back with a book on her head and a book on her chest she is surrounded by books on the floor and on the walls she has a book on her head and a book on her chest she has']
Latest transformers:
'caption_hf_16': [
'a series of images of sweetie belle, applejack, scootaloo, applebloom, rarity, pinkie pie, twilight sparkle, rarity, twilight sparkle, rarity, rarity, rarity, rarity, rarity, rarity',
'a series of images of sweetie belle, applejack, scootaloo, applebloom, rarity, pinkie pie, twilight sparkle, rarity, twilight sparkle, twilight sparkle, twilight sparkle, twilight sparkle',
'a series of images of sweetie belle, applejack, scootaloo, applebloom, rarity, pinkie pie, twilight sparkle, rarity, twilight sparkle, twilight sparkle, rarity, rarity, rarity, rarity'
],
'caption_hf_8': [
'a series of images of sweetie belle, applebloom, scootaloo, applejack, rarity, pinkie pie, twilight sparkle, fluttershy, rarity, pinkie pie, twilight sparkle, rarity, pink',
'a series of images of sweetie belle, applebloom, scootaloo, applejack, rarity, pinkie pie, twilight sparkle, fluttershy, rarity, pinkie pie, twilight sparkle, twilight sparkle',
'a series of images of sweetie belle, applebloom, scootaloo, applejack, rarity, pinkie pie, twilight sparkle, fluttershy, rarity, twilight sparkle, twilight sparkle, twilight sparkle'
]
caption_hf_16': [
'posterior view of twilight sparkle lying on the floor surrounded by a pile of books, surrounded by a pile of books, surrounded by a pile of books, surrounded by a pile of books, surrounded by a pile of books, surrounded by',
'posterior view of twilight sparkle lying on the floor surrounded by a pile of books, surrounded by a pile of books, surrounded by a pile of books, surrounded by a pile of books, surrounded by a pile of books\n',
'posterior view of twilight sparkle lying on the floor surrounded by a pile of books, surrounded by a pile of books, surrounded by books, surrounded by books, surrounded by books, surrounded by books, surrounded by books, surrounded by books'
],
'caption_hf_8': [
'twilight sparkle is lying on the floor surrounded by a pile of books she is surrounded by a pile of books on the floor she is surrounded by a pile of books on the floor she is surrounded by a pile of books on the floor she',
'twilight sparkle is lying on the floor surrounded by a pile of books she is surrounded by a pile of books on top of her she is surrounded by a pile of books on top of her she is surrounded by a pile of books on top',
'twilight sparkle is lying on the floor surrounded by a pile of books she is surrounded by a pile of books on the floor she is surrounded by a pile of books on the floor she is surrounded by a pile of books on the floor her'
]
Hey @AstraliteHeart π Differences in generation can be explained by many parts of the stack, from ninja numerical bugs to intentional implementation quirks. Debugging the exact cause takes time, so I want to ask for your help :D
lavis
and transformers
are recent versions? (latest release or newer)This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
u have any guidance on matching outputs between lav
Can you please help how you managed to convert this? I am also stuck is there any specific transformers version?
I have a PR here which aims to further verify equivalence: https://github.com/huggingface/transformers/pull/24854.
The conversion script can be found here and can be run as follows:
pip install -U git+https://github.com/nielsrogge/LAVIS.git@blip2_float32
git clone -b improve_blip2 git+https://github.com/nielsrogge/transformers.git
cd transformers
python src/transformers/models/blip_2/convert_blip_2_original_to_pytorch.py --model_name "blip2-flan-t5-xl"
The reason I forked LAVIS is to make sure I can compare both implementations using float32.
System Info
working:
transformers
version: 4.26.1broken:
transformers
version: 4.27.0.dev0Who can help?
@gante @NielsRogge
Information
Tasks
examples
folder (such as GLUE/SQuAD, ...)Reproduction
python test_simple.py
, model is correctly loaded and prints a captionpip install --upgrade git+https://github.com/huggingface/transformers
(I wanted the new shiny blip2 conversion script so I can conver my finetuned model into HF format)Resolved https://github.com/huggingface/transformers to commit 8b3db33a763ccef828fca89bac7e6cbff314f131
python test_simple.py
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 25 but got size 5 for tensor number 1 in the list.
Expected behavior
Can use BLIP2 with latest HF