nusnlp / esr

GNU General Public License v3.0

Improved Word Sense Disambiguation with Enhanced Sense Representations


This repository contains code and scripts for building enhanced sense representations for word sense disambiguation.

If you use this code for your work, please cite this paper:

@inproceedings{song-etal-2021-improved-word,
    title = "Improved Word Sense Disambiguation with Enhanced Sense Representations",
    author = "Song, Yang  and
      Ong, Xin Cai  and
      Ng, Hwee Tou  and
      Lin, Qian",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    year = "2021",
    url = "https://aclanthology.org/2021.findings-emnlp.365",
    pages = "4311--4320"
}

Requirements

Downloading Datasets

You need to download the following datasets:

Setting up variables

You need to modify script/config.sh to match your environment. Set the data variable to the top-level directory where all the datasets are stored.
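As a rough illustration of the setting described above, the snippet below shows what the relevant line might look like, together with a small sanity check. Only the variable name data comes from this README. The path is a placeholder, and check_data_dir is a hypothetical helper that is not part of the repository. The real script/config.sh may define other variables as well.

```shell
# Hypothetical excerpt of script/config.sh.
# Only the variable name `data` is taken from the README; the path is a placeholder.
data="${data:-$HOME/wsd_data}"

# Hypothetical helper (not part of the repository):
# report whether the given directory exists, so a wrong path is caught early.
check_data_dir() {
    if [ -d "$1" ]; then
        echo "ok: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}
```

Calling check_data_dir "$data" before starting an experiment prints an ok line if the directory exists and returns a non-zero status otherwise.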

Processing FEWS

bash experiment/fews/run.sh

Using trained models

You can train the models from scratch. Alternatively, you can use our trained models.

Running Experiments

For ESR on SemCor with roberta-base:

bash experiment/esr/roberta-base/dataset_semcor/sd_42/run.sh

For ESR on SemCor with roberta-large:

bash experiment/esr/roberta-large/dataset_semcor/sd_42/run.sh

For ESR on SemCor and WNGC with roberta-base:

bash experiment/esr/roberta-base/dataset_semcor_wngc/sd_42/run.sh

For ESR on SemCor and WNGC with roberta-large:

bash experiment/esr/roberta-large/dataset_semcor_wngc/sd_42/run.sh

For ESR on FEWS with roberta-base:

bash experiment/esr/roberta-base/dataset_fews/sd_42/run.sh

For ESR on FEWS with roberta-large:

bash experiment/esr/roberta-large/dataset_fews/sd_42/run.sh
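The six commands above differ only in the model and dataset parts of the path, so they can be generated in a loop. The following is a minimal sketch, not a script shipped with the repository. The function names and the DRY_RUN switch are assumptions introduced for illustration. By default it only prints the commands; it is meant to be run from the repository root.

```shell
# Sketch: enumerate the six ESR experiment scripts listed in this README.
# DRY_RUN is a hypothetical switch: 1 prints the commands, anything else runs them.
DRY_RUN="${DRY_RUN:-1}"

# Print one script path per line for every model/dataset combination.
list_esr_scripts() {
    for model in roberta-base roberta-large; do
        for dataset in semcor semcor_wngc fews; do
            echo "experiment/esr/${model}/dataset_${dataset}/sd_42/run.sh"
        done
    done
}

# Print each command (dry run) or execute it, stopping at the first failure.
run_esr_experiments() {
    for script in $(list_esr_scripts); do
        if [ "$DRY_RUN" = "1" ]; then
            echo "bash ${script}"
        else
            bash "${script}" || return 1
        fi
    done
}

run_esr_experiments
```

Setting DRY_RUN=0 before sourcing or running the snippet executes the scripts one after another instead of printing them.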