In this paper, we introduce document-level discourse probing to evaluate the ability of pretrained LMs to capture document-level relations. We experiment with 7 pretrained LMs, 4 languages, and 7 discourse probing tasks, and find BART to be overall the best model at capturing discourse — but only in its encoder.
Fajri Koto, Jey Han Lau, and Timothy Baldwin. Discourse Probing of Pretrained Language Models. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021), Mexico City, Mexico (virtual).
## Dependencies

Install the required packages listed in `requirements.txt`:

```
pip install -r requirements.txt
```
## nsp_choice

To run the code, see the examples in `run.sh`. The data is provided in `nsp_choice/data/`.
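Conceptually, each probing task trains a lightweight classifier on top of representations from a frozen pretrained LM. The sketch below only illustrates that idea for next-sentence choice; it is not the pipeline in `run.sh`, and the model name, mean pooling, and toy data are all our own assumptions.

```python
# Minimal sketch of a next-sentence probe over a frozen LM (illustrative only;
# the model name, pooling strategy, and toy data are assumptions, not the repo's setup).
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()  # the LM stays frozen; only the probe below is trained

def embed(text):
    """Mean-pool the frozen LM's last hidden states for one sentence."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0).numpy()

# Toy (context, candidate next sentence, label) triples; 1 = true continuation.
pairs = [
    ("The match went to extra time.", "A penalty shootout decided it.", 1),
    ("The match went to extra time.", "Photosynthesis requires sunlight.", 0),
    ("She opened the fridge.", "It was empty again.", 1),
    ("She opened the fridge.", "The treaty was signed in 1648.", 0),
]
X = np.stack([np.concatenate([embed(c), embed(s)]) for c, s, _ in pairs])
y = [label for _, _, label in pairs]

probe = LogisticRegression(max_iter=1000).fit(X, y)  # lightweight probe
print(probe.predict(X))
```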
## ordering

To run the code, see the examples in `run.sh`. The data is provided in `ordering/data/`.
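To make the task concrete: ordering examples can be built by shuffling a document's sentences and asking the probe to recover the original positions. This is a hypothetical sketch of that construction, not the code behind `ordering/data/`.

```python
# Hypothetical construction of a sentence-ordering example (not the repo's code).
import random

def make_ordering_example(sentences, seed=0):
    """Shuffle the sentences; the gold label is each one's original position."""
    order = list(range(len(sentences)))
    random.Random(seed).shuffle(order)
    shuffled = [sentences[i] for i in order]
    return shuffled, order  # order[i] = original index of shuffled[i]

doc = ["First sentence.", "Second sentence.", "Third sentence.", "Fourth sentence."]
shuffled, gold = make_ordering_example(doc)
print(shuffled)
print(gold)
```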
## dissent

To run the code, see the examples in `run.sh`. The data is provided in `dissent/data/`. There is no data for ES.
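DisSent-style data pairs two clauses and asks the probe to predict the explicit discourse connective that joined them. The snippet below sketches that extraction with a toy connective inventory; it is an illustration, not the script that produced `dissent/data/`.

```python
# Illustrative DisSent-style extraction: remove an explicit connective and use
# it as the label for the clause pair (toy inventory; not the repo's code).
import re

CONNECTIVES = ["because", "but", "although", "so"]

def extract_dissent_example(sentence):
    for marker in CONNECTIVES:
        m = re.search(rf"\b{marker}\b", sentence, flags=re.IGNORECASE)
        if m:
            left = sentence[:m.start()].strip(" ,")
            right = sentence[m.end():].strip()
            return (left, right), marker  # input clause pair, gold connective
    return None

print(extract_dissent_example("He stayed home because it was raining."))
# (('He stayed home', 'it was raining.'), 'because')
```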
## rst

To run the code, see the examples in `run_nuc.sh` and `run_rel.sh`. Use `rst/prepare_data/extract_chinese_dtb.ipynb`, `rst/prepare_data/extract_german_dtb.ipynb`, and `rst/prepare_data/extract_spanish_dtb.ipynb` to extract the data (after downloading the related Discourse Treebanks). There is no code provided for extracting the EN data.
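The two scripts correspond to two probing tasks over adjacent discourse units: nuclearity prediction (`run_nuc.sh`) and relation prediction (`run_rel.sh`). The toy instances below only illustrate that framing; the labels and input format are our assumptions, not the extracted treebank format.

```python
# Toy illustration of the two RST probing tasks (labels and format are
# assumptions for illustration; see the extraction notebooks for real data).
examples = [
    # (left unit, right unit, nuclearity, relation)
    ("He was exhausted,", "so he went straight to bed.",
     "nucleus-satellite", "cause"),
    ("Although prices rose,", "demand stayed strong.",
     "satellite-nucleus", "concession"),
]
for left, right, nuc, rel in examples:
    print(f"[{left}] [{right}] -> nuclearity={nuc}, relation={rel}")
```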
## segment

To run the code, see the examples in `run.sh`. Use `extract_chinese_dtb.ipynb`, `extract_german_dtb.ipynb`, and `extract_spanish_dtb.ipynb` to extract the data (after downloading the related Discourse Treebanks). There is no code provided for extracting the EN data.
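Segmentation splits text into elementary discourse units (EDUs). One common way to cast this for probing is per-token boundary tagging; the label scheme below is our assumption for illustration, not necessarily the repo's format.

```python
# Toy EDU-segmentation instance, cast as per-token boundary tagging
# (the label scheme is an assumption for illustration).
tokens = ["He", "left", "early", ",", "because", "he", "was", "tired", "."]
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0]  # 1 = first token of a new EDU
for tok, lab in zip(tokens, labels):
    print(f"{tok}\t{lab}")
```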
## cloze

To run the code, see the examples in `run.sh`. Use `cloze/prepare_data.ipynb` to prepare the data. Some samples are provided in `cloze/data`.
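A cloze instance gives the model a story context and asks it to pick the coherent ending. The toy instance below is only meant to show the shape of the task; see `cloze/data` for actual samples.

```python
# Toy cloze instance (format is illustrative; see cloze/data for real samples).
story = [
    "Anna planted tomatoes in spring.",
    "She watered them every day.",
    "By July the vines were heavy with fruit.",
]
endings = [
    "She harvested a big basket of tomatoes.",  # coherent ending
    "She bought a new winter coat.",
]
label = 0  # index of the coherent ending
print(" ".join(story), "->", endings[label])
```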
## Post-processing

After running all the experiments, we provide post-processing code:

- `post_process.ipynb`: extracts the mean and standard deviation of all experiments over 3 different runs (a minimal sketch of this aggregation follows below).
- `plot_across_model.ipynb`: creates Figure 2 in the paper.
- `plot_across_langauges.ipynb`: creates Figure 3 in the paper.
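For reference, the aggregation amounts to reporting the mean and standard deviation over the 3 runs of each experiment; a minimal standalone sketch (with made-up scores):

```python
# Minimal sketch of mean/std aggregation over 3 runs (scores are made up).
import statistics

scores = [0.71, 0.69, 0.73]  # one accuracy per run
print(f"{statistics.mean(scores):.3f} +/- {statistics.stdev(scores):.3f}")
```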