This repo contains the code for the following paper:
Jingfeng Yang, Federico Fancellu, Bonnie Webber, Diyi Yang: Frustratingly Simple but Surprisingly Strong: Using Language-Independent Features for Zero-shot Cross-lingual Semantic Parsing. (EMNLP 2021)
If you use this work, please cite the paper above.
The following instructions will get the code running.
Before running experiments, you should download the data from the Parallel Meaning Bank (PMB). The paper reports experiments on PMB 2.1 and PMB 3.0. On PMB 2.1, to compare six semantic parsers fairly (including the coarse2fine parsers), the outputs are tree-structured sequences, or the corresponding DRS linearizations, which lose some information. On PMB 3.0, the outputs are DRS linearizations with no loss of information.
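For reference, the PMB releases can be fetched directly; this is a minimal sketch, and the release URL and archive name are assumptions, so check https://pmb.let.rug.nl for the current links:

```python
import urllib.request

# Hypothetical release URL; verify the actual link on the PMB download page.
url = "https://pmb.let.rug.nl/releases/pmb-3.0.0.zip"
urllib.request.urlretrieve(url, "pmb-3.0.0.zip")
```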
You also need to run UDPipe to obtain the UPOS and Universal Dependencies (UD) features.
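As a minimal sketch of this step, the UDPipe Python bindings (the ufal.udpipe pip package) can tokenize, tag, and parse raw text into CoNLL-U, from which the UPOS and dependency columns can be read off; the model file name below is an assumption (download one from the UDPipe site):

```python
from ufal.udpipe import Model, Pipeline, ProcessingError

# The model file name is an assumption; use whichever UDPipe model fits your language.
model = Model.load("english-ud-2.0-170801.udpipe")
if model is None:
    raise RuntimeError("Cannot load UDPipe model")

# Tokenize, tag, and parse raw text; emit CoNLL-U.
pipeline = Pipeline(model, "tokenize", Pipeline.DEFAULT, Pipeline.DEFAULT, "conllu")
error = ProcessingError()
conllu = pipeline.process("The cat sat on the mat.", error)

# CoNLL-U fields (0-indexed after splitting on tabs): 1=FORM, 3=UPOS, 7=DEPREL.
for line in conllu.splitlines():
    if line and not line.startswith("#"):
        cols = line.split("\t")
        print(cols[1], cols[3], cols[7])  # word, UPOS tag, dependency relation
```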
The coarse2fine semantic parsing code is in the coarse2fine-models folder; detailed running instructions are on the coarse2fine parsing page.
The remaining code is in the N-models folder. The bash scripts in N-models/src/opennmt_scripts will help you run all the experiments.
For example, to run LSTM semantic parsing with sequential decoding, using Universal Dependencies as features, run:

```bash
bash preprocess-dep.sh &&
bash train.sh &&
bash parse-dep.sh
```
LSTM and Transformer semantic parsing with sequential decoding rely on OpenNMT; the DRS parsers are adapted from Neural DRS. Since the original OpenNMT already supports extra word-level features, the original OpenNMT repo can be reused.
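Concretely, OpenNMT attaches word-level features to each token with the special vertical bar separator ￨ (U+FFE8), one ￨ per feature stream. A small sketch of building such input, with illustrative UPOS and dependency-relation values:

```python
SEP = "\uffe8"  # '￨', OpenNMT's word-feature separator (not the ASCII pipe '|')

def attach_features(words, upos, deprels):
    """Join each word with its features, e.g. 'cat￨NOUN￨nsubj'."""
    return " ".join(SEP.join(triple) for triple in zip(words, upos, deprels))

print(attach_features(["The", "cat", "sat"],
                      ["DET", "NOUN", "VERB"],
                      ["det", "nsubj", "root"]))
# -> The￨DET￨det cat￨NOUN￨nsubj sat￨VERB￨root
```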
XLM-R encoder semantic parsing with sequential decoding relies on fairseq. Because the original fairseq does not support extra word-level features, we adapted the fairseq code; the adapted version is in N-models/fairseq. We also adapted it to use different learning rates on the encoder and decoder sides, because the encoder is initialized with a pretrained multilingual model while the decoder is randomly initialized.
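The two-learning-rate idea itself is easy to picture with plain PyTorch parameter groups; this is a sketch of the technique under a toy model, not the actual fairseq patch, and the module names are assumptions:

```python
import torch
import torch.nn as nn

# Toy stand-in for an encoder-decoder model: in the real setup the encoder
# is initialized from pretrained XLM-R and the decoder from scratch.
class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(16, 16)
        self.decoder = nn.Linear(16, 16)

model = Seq2Seq()
optimizer = torch.optim.Adam([
    {"params": model.encoder.parameters(), "lr": 1e-5},  # smaller LR for pretrained weights
    {"params": model.decoder.parameters(), "lr": 1e-4},  # larger LR for random init
])
```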
Code is adapted from OpenNMT, fairseq, Neural DRS and EncDecDRSparsing.