Our paper "A Neural Divide-and-Conquer Reasoning Framework for Image Retrieval from Linguistically Complex Text" have been accepted to ACL 2023 Main Conference. This project aims to introduce the divide-and-conquer and neural-symbolic reasoning appraoches to handle the complex text-image reasoning problem.
Basic Setting
Python==3.7.0 (>=)
torch==1.10.1+cu111,
torchaudio==0.10.1+cu111,
torchvision==0.11.2+cu111,
transformers==4.18.0
unzip src_transformers.zip, volta_src.zip, CLIP.zip to the current home path. In addition, you may need download the image source from the ImageCode to the /data/game/. We release the pre-training checkpoint about the phase 1: proposition generator and the pretraining OFA checkpoint in the Huggingface repository: https://huggingface.co/YunxinLi/pretrain_BART_generator_coldstart_OFA
Prepare the OFA-version Transformer
git clone --single-branch --branch feature/add_transformers https://github.com/OFA-Sys/OFA.git
pip install OFA/transformers/
git clone https://huggingface.co/OFA-Sys/OFA-large
python OFA_encoder_Divide_and_Conquer.py --lr 3e-5 --lr_head 4e-5 -b 32 -m ViT-B/16 -a gelu --logit_scale 1000 --add_input True --positional --frozen_clip
Thanks everyone for your contributions. If you like our work and use it in your projects, please cite our work:
@article{li2023neural,
title={A Neural Divide-and-Conquer Reasoning Framework for Image Retrieval from Linguistically Complex Text},
author={Li, Yunxin and Hu, Baotian and Ding, Yunxin and Ma, Lin and Zhang, Min},
journal={arXiv preprint arXiv:2305.02265},
year={2023}
}