Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention

This PyTorch package implements the KEAR model, which surpasses human performance on the CommonsenseQA benchmark, as described in:

Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng and Xuedong Huang
Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
The 31st International Joint Conference on Artificial Intelligence (IJCAI), 2022.

The package also includes code for our earlier DEKCOR model, as described in:

Yichong Xu∗, Chenguang Zhu∗, Ruochen Xu, Yang Liu, Michael Zeng and Xuedong Huang
Fusing Context Into Knowledge Graph for Commonsense Question Answering
Findings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021.

Please cite the above papers if you use this code.

Results

This package achieves state-of-the-art performance on the CommonsenseQA leaderboard: 86.1% (single model) and 89.4% (ensemble), with the ensemble surpassing the human performance of 88.9%.

Quickstart

Pull the Docker image:
> docker pull yichongx/csqa:human_parity
Run the container:
> nvidia-docker run -it --mount src='/',target=/workspace/,type=bind yichongx/csqa:human_parity /bin/bash
> cd /workspace/path/to/repo
If you are new to Docker, see https://docs.docker.com/

Features

Our code supports flexible training of various models on multiple-choice QA:

Distributed training with PyTorch native DDP or DeepSpeed: see bash/task_train.sh
Pause and resume training at any step with the --continue_train option
Use any transformer encoder, including ELECTRA, DeBERTa, and ALBERT
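
For readers new to the multiple-choice setup, the sketch below illustrates the general scoring pattern: an encoder produces one scalar score per (question, choice) pair, a softmax normalizes the scores across the choices, and the highest-probability choice is the prediction. The function names `softmax` and `predict_choice` and the example scores are illustrative assumptions, not code taken from this repository.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_choice(choice_scores: List[float]) -> Tuple[int, List[float]]:
    """Return (index of the best choice, probabilities over all choices)."""
    probs = softmax(choice_scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


# Hypothetical encoder scores for the five answer choices of one question.
scores = [0.3, 2.1, -0.5, 1.0, 0.0]
best, probs = predict_choice(scores)
print(best)  # index of the highest-scoring choice
```

During training, the same per-choice probabilities would typically feed a cross-entropy loss against the index of the correct answer.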

Preprocessing data

Pre-processed data is located at data/.

We release the code for knowledge graph and dictionary external attention in preprocess/.

Download the data:
> cd preprocess
> bash download_data.sh
Add ConceptNet triples and Wiktionary definitions to the data:
> python add_knowledge.py
We also add the most frequent relations in each question as side information:
> python add_freq_rel.py
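
To give a sense of what these preprocessing steps produce, here is a minimal sketch of attaching knowledge to a question-choice pair as plain text and of picking the most frequent relation for a question. It is not the code in add_knowledge.py or add_freq_rel.py; the field layout, the `[SEP]` separator, the example data, and the helper names `build_input` and `most_frequent_relation` are assumptions made for illustration.

```python
from collections import Counter
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail), e.g. a ConceptNet edge


def build_input(question: str, choice: str,
                triples: List[Triple], definition: Optional[str],
                sep: str = " [SEP] ") -> str:
    """Concatenate question, choice, and external knowledge into one string."""
    parts = [question, choice]
    if triples:
        parts.append("; ".join(f"{h} {r} {t}" for h, r, t in triples))
    if definition:
        parts.append(f"{choice}: {definition}")
    return sep.join(parts)


def most_frequent_relation(triples_per_choice: List[List[Triple]]) -> Optional[str]:
    """Most common relation across all choices of one question, if any."""
    counts = Counter(r for triples in triples_per_choice for _, r, _ in triples)
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical example data (assumed format, not the repository's).
question = "Where would you store a spare blanket?"
triples_per_choice = [
    [("blanket", "AtLocation", "closet")],
    [("blanket", "AtLocation", "bed"), ("blanket", "UsedFor", "warmth")],
]
text = build_input(question, "closet", triples_per_choice[0],
                   "a small room or cabinet used for storage")
print(text)
print(most_frequent_relation(triples_per_choice))
```

Keeping the knowledge as text appended to the input means the encoder can attend to it with its ordinary self-attention, without any architectural change.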

Training and Prediction

Train a model:
> bash bash/task_train.sh
Make predictions:
> bash bash/task_predict.sh
See task.py for available options.

Running the code for DEKCOR

The current code is mostly compatible with DEKCOR. To run the original DEKCOR code, check out the DEKCOR tag to use the previous version.

by Yichong Xu
yicxu@microsoft.com

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
