Open BiEchi opened 2 months ago
Feature request
https://github.com/haotian-liu/LLaVA pads images to square by default when preprocessing them. The current Transformers implementation does not support this.
Motivation
Requested by @NielsRogge at https://huggingface.co/llava-hf/llava-1.5-7b-hf/discussions/26#66cf46a5a523b74b5f90fa72.
Your contribution
I successfully reproduced the original logits after conversion once padding was added in the Transformers library.
Thanks for the request @BiEchi! cc @zucchini-nlp