huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Include other tokenizers/image processors in Llava #29887

Open muhark opened 7 months ago

muhark commented 7 months ago

Feature request

Generalize the functionality in processing_llava.py to include other tokenizers and image processors.

Motivation

The current implementation of the LLaVA processor only accepts a Llama tokenizer. Given how extensible the framework is, this restriction should not be necessary.

Your contribution

I am happy to write the code and make the PR. We have done this for our own internal research code and there is an example file in our llava-gemma-2b model (link).
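For context, the restriction lives in the class-level attributes of `LlavaProcessor` in processing_llava.py. The sketch below is one assumption of what such a PR could look like (relaxing the pinned classes to the Auto classes); it is not the actual patch:

```python
from transformers.processing_utils import ProcessorMixin

# Sketch of the proposed generalization (not the actual patch).
# processing_llava.py currently pins the accepted classes roughly like this:
#     image_processor_class = "CLIPImageProcessor"
#     tokenizer_class = ("LlamaTokenizer", "LlamaTokenizerFast")
# Relaxing them to the Auto classes would let any tokenizer/image processor pair in.
class LlavaProcessor(ProcessorMixin):
    attributes = ["image_processor", "tokenizer"]
    image_processor_class = "AutoImageProcessor"
    tokenizer_class = "AutoTokenizer"

    def __init__(self, image_processor=None, tokenizer=None):
        super().__init__(image_processor, tokenizer)
```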

ArthurZucker commented 7 months ago

I am not sure there have been other LLaVA models that don't use the Llama tokenizer, no?

muhark commented 7 months ago

@ArthurZucker thank you for the fast response!

There are a few examples I know of:

Is there a strong reason for restricting the LLaVA processor to Llama/CLIP?

ArthurZucker commented 7 months ago

There is none! It's just that we base it on usage! Could you link to the Gemma-based model? Feel free to open a PR.

FYI @NielsRogge

muhark commented 7 months ago

Link to the Gemma-based model: Intel/llava-gemma-2b

I'm happy to open the PR :)
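For illustration, once the class restriction is relaxed, pairing a non-Llama tokenizer with a CLIP image processor could look like the sketch below (the model IDs are examples, not a prescribed setup):

```python
from transformers import AutoTokenizer, CLIPImageProcessor, LlavaProcessor

# Sketch: build a LLaVA-style processor around a Gemma tokenizer.
# Any tokenizer/image processor pair would do once LlavaProcessor
# no longer hard-codes the Llama/CLIP classes.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
image_processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-large-patch14-336")

processor = LlavaProcessor(image_processor=image_processor, tokenizer=tokenizer)
```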

ArthurZucker commented 7 months ago

Feel free to do so and link it to this issue!

lucasjinreal commented 4 months ago

Hello, I am just wondering how to append the -200 image token to any tokenizer and reuse LlavaProcessor?

ArthurZucker commented 4 months ago

Hey, you can simply use tokenizer.add_tokens(["tok1", "tok2", ...])
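A short sketch of that suggestion (the token string and model ID are examples; in the transformers LLaVA port the image placeholder is a real vocabulary token rather than the -200 sentinel used in the original LLaVA codebase):

```python
from transformers import AutoTokenizer

# Add an image placeholder token to an arbitrary tokenizer.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")  # example model
num_added = tokenizer.add_tokens(["<image>"], special_tokens=True)
print(num_added, tokenizer.convert_tokens_to_ids("<image>"))

# If the tokenizer is attached to a model, resize its embeddings afterwards:
#     model.resize_token_embeddings(len(tokenizer))
```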