airsplay / lxmert

PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
MIT License

How to extract the CrossModality Output #27

Closed 256785 closed 4 years ago

256785 commented 4 years ago

I want to use the cross-modality output; how do I extract it?

airsplay commented 4 years ago

For images in MS COCO & Visual Genome, load the precomputed features and run the functions in src::lxrt::entry to get this output.
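To make the idea concrete, here is a toy PyTorch sketch of what "the cross-modality output" refers to: language tokens cross-attend over visual features, and the representation at the [CLS] position is pooled as a single joint vector. This is an illustrative stand-in, not the repo's actual code; in the repository itself you would instantiate the encoder from src::lxrt::entry (the class there exposes a mode that returns the pooled cross-modality vector) and feed it sentences plus (features, boxes). All names and dimensions below are hypothetical.

```python
import torch
import torch.nn as nn

class ToyCrossModalityEncoder(nn.Module):
    """Toy stand-in for LXMERT's cross-modality encoder: language tokens
    attend over visual features, then the first ([CLS]) position is
    pooled as the cross-modality output. Dimensions are illustrative."""

    def __init__(self, hidden=64, heads=4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        # pooler mirrors the BERT-style tanh pooler over the [CLS] token
        self.pooler = nn.Sequential(nn.Linear(hidden, hidden), nn.Tanh())

    def forward(self, lang_feats, vis_feats):
        # language queries attend to visual keys/values
        x, _ = self.cross_attn(lang_feats, vis_feats, vis_feats)
        # pooled cross-modality output = transformed position 0 ([CLS])
        return self.pooler(x[:, 0])

batch, seq_len, n_boxes, hidden = 2, 20, 36, 64
lang = torch.randn(batch, seq_len, hidden)  # token embeddings (hypothetical)
vis = torch.randn(batch, n_boxes, hidden)   # projected RoI features (hypothetical)
pooled = ToyCrossModalityEncoder()(lang, vis)
print(pooled.shape)  # one joint vector per example: (batch, hidden)
```

In the real model this pooled vector is what downstream heads (e.g. VQA classifiers) consume, which is why extracting it amounts to running the entry functions and taking the pooled output.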

For raw images, the features need to be extracted first here.