Open thedarkzeno opened 1 year ago
Hi, have you started working on the issue? Do you plan to integrate it yourself?
I'd like to work on this issue. Is there any documentation on adding new models that I should follow?
I would like to work on this one.
@NielsRogge @alaradirik If no one else is currently working on adding this model, I would like to work on it.
Hi @kumar-devesh, I'm working on it (I've made some progress toward getting a working version of the Discrete VAE in Torch), but @osanseviero told me it would be better to first check whether there's interest from the development team. If they're OK with it, we could work on it together.
cc @sgugger @amyeroberts
Hi @ChanBong @kumar-devesh @alceballosa, Unified-IO would be a great addition to the library.
If you are not familiar with contributing to transformers, you can refer to the guidelines to get started. I'd recommend checking if you can run the original repo without any issues and get the expected results first.
Here are some summarised points that might help with model addition:
A Processor class that encapsulates the Tokenizer and ImageProcessor classes, which preprocess the text and image inputs.
Tests go under tests/models/<MODEL_NAME>/; you can refer to other test files to see what tests to add.
Once you are done, you would need to run the following commands to check that the PR passes all CI tests:
make style
make quality
make repo-consistency
RUN_SLOW=TRUE pytest tests/models/unifiedio/test_modeling_unifiedio.py
RUN_SLOW=TRUE pytest tests/models/unifiedio/test_image_processor_unifiedio.py
RUN_SLOW=TRUE pytest tests/models/unifiedio/test_tokenizer_unifiedio.py
RUN_SLOW=TRUE pytest tests/models/unifiedio/test_processor_unifiedio.py
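For anyone unfamiliar with the Processor idea mentioned above, here is a rough, library-free sketch of the pattern: one object that wraps a tokenizer and an image processor and merges their outputs. Everything in it (ToyProcessor, toy_tokenizer, toy_image_processor, and the placeholder "token ids") is hypothetical and only for illustration; it is not the transformers API and not how Unified-IO preprocesses its inputs.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


def toy_tokenizer(text: str) -> Dict[str, Any]:
    # Hypothetical tokenizer: split on whitespace and use each token's
    # length as a placeholder id (a real tokenizer uses a vocabulary).
    return {"input_ids": [len(token) for token in text.split()]}


def toy_image_processor(image: List[List[int]]) -> Dict[str, Any]:
    # Hypothetical image processor: rescale 0..255 pixel values to [0, 1].
    return {"pixel_values": [[pixel / 255 for pixel in row] for row in image]}


@dataclass
class ToyProcessor:
    """Illustrative wrapper that bundles a tokenizer and an image processor."""

    tokenizer: Callable[[str], Dict[str, Any]]
    image_processor: Callable[[List[List[int]]], Dict[str, Any]]

    def __call__(
        self,
        text: Optional[str] = None,
        images: Optional[List[List[int]]] = None,
    ) -> Dict[str, Any]:
        if text is None and images is None:
            raise ValueError("Provide at least one of `text` or `images`.")
        outputs: Dict[str, Any] = {}
        if text is not None:
            outputs.update(self.tokenizer(text))
        if images is not None:
            outputs.update(self.image_processor(images))
        return outputs


processor = ToyProcessor(toy_tokenizer, toy_image_processor)
batch = processor(text="what is this", images=[[0, 255], [51, 102]])
print(sorted(batch.keys()))
```

The point of the wrapper is that callers pass raw text and images to a single object and get back one dictionary of model inputs, so a processor test mostly checks that both sets of keys are present and that each sub-processor is called only when its input is given.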
We can do an in-depth review or create a Slack channel to address questions and issues once there is a draft PR.
Hope this helps!
Model description
I'd like to request the addition of the Unified-IO model. It is a multimodal model capable of visual question answering, image generation, and more. The repo is https://github.com/allenai/unified-io-inference and the paper is Unified-IO: Sequential Modeling for Generally Applicable Vision Models.
Open source status
Provide useful links for the implementation
https://github.com/allenai/unified-io-inference