MIX-Shannon is a project that explores new multimodal tasks, including Further Referring Expression Comprehension (FREC), Agent Interaction Visual Question Answering (AI-VQA), Cooking-step-images Retrieval (CSI-Recipe), and Visual Question Answering with Image (VQAI).
This section shows how to set up the environment for the project.
To get started, follow these steps:
Clone the project repository:
git clone https://github.com/IEIT-AGI/MIX-Shannon.git
cd MIX-Shannon
(Optional) Create a conda environment and activate it:
conda create -n MIX-Shannon python=3.8
conda activate MIX-Shannon
Install the required packages:
pip install -r requirements.txt
The requirements.txt file in the repository lists the required packages, so this one command installs them all at once.
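After installation, it can be useful to confirm that every package named in requirements.txt is actually present in the active environment. The following is a minimal sketch of such a check, not part of the MIX-Shannon codebase: the function names `parse_requirement_names`, `find_missing`, and `check_requirements` are hypothetical, and the parser assumes simple `name` or `name==version` lines (it skips pip options such as `-r` and does not handle URL or VCS requirements).

```python
import re
from importlib import metadata  # standard library since Python 3.8


def parse_requirement_names(text):
    """Return the package names found in requirements.txt-style text.

    Assumption: each requirement is a plain name, optionally followed by
    extras, a version specifier, or an environment marker.
    """
    names = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # Keep only the leading distribution name.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def find_missing(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def check_requirements(path="requirements.txt"):
    """Read a requirements file and return the packages that are missing."""
    with open(path, encoding="utf-8") as fh:
        return find_missing(parse_requirement_names(fh.read()))
```

Calling `check_requirements()` from the repository root with the conda environment activated returns a list of package names that still need to be installed; an empty list means every listed package was found. The check only looks at names, so it does not verify that installed versions satisfy the pinned constraints.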
| Task | Dataset | Data and code |
|---|---|---|
| Further Referring Expression Comprehension | RefCOCOs, CopsRef, Talk2Car | FREC |
| Agent Interaction Visual Question Answering | AI-VQA | AI-VQA |
| Cooking-step-images Retrieval | CSIR | CSI-Recipe |
| Visual Question Answering with Image | VQAI | VQAI |
Credits and sources are provided throughout this repo.
Special thanks to:
If you have any questions, comments, or suggestions, please contact us at ieitagi001@gmail.com.
This project is released under the Apache 2.0 license.