yzhangcs / crfpar

[ACL'20, IJCAI'20] Code for "Efficient Second-Order TreeCRF for Neural Dependency Parsing" and "Fast and Accurate Neural CRF Constituency Parsing".
https://www.aclweb.org/anthology/2020.acl-main.302
MIT License

Reproducing results from the paper (Universal Dependencies) #2

Closed. andreasgrv closed this issue 4 years ago

andreasgrv commented 4 years ago

Hi, interesting paper, and thank you for making this well-structured code available.

I'm having trouble figuring out what to run to reproduce your results. More specifically, I've managed to load the Universal Dependencies dataset (v2.2) after making some changes to the corpus loading code, but I still have the following question:

Thanks in advance!

yzhangcs commented 4 years ago

Hi, thanks for your interest. I'm currently working on releasing the code as a Python package, and the implementation of the second-order model (i.e., CRF2O) has not yet been imported into this repository. If you would like to try it now, please refer to my other repo, yzhangcs/parser, and switch to the release branch. I have published links to some pretrained models; you can download them to run predictions and evaluations:

>>> from supar import Parser
>>> parser = Parser.load('crf-dep-en', verbose=False)
>>> parser.evaluate('parser/data/ptb/test.conllx')
(0.1680757158568927, UCM: 61.75% LCM: 50.83% UAS: 96.11% LAS: 94.50%)
>>> parser = Parser.load('crf2o-dep-en', verbose=False)
>>> parser.evaluate('parser/data/ptb/test.conllx')
(0.1394959150680474, UCM: 62.58% LCM: 51.24% UAS: 96.12% LAS: 94.53%)
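
For readers unfamiliar with the four scores printed above, here is a minimal sketch of how they are conventionally defined: UAS and LAS are token-level accuracies (head only, and head plus label), while UCM and LCM are the corresponding sentence-level complete-match rates. The function name `attachment_scores` and the toy data are hypothetical, not part of the parser's API, and the sketch ignores details such as punctuation filtering that a real evaluation may apply.

```python
from typing import Dict, List, Tuple

# One (head index, relation label) pair per token; hypothetical representation.
Sentence = List[Tuple[int, str]]


def attachment_scores(gold: List[Sentence], pred: List[Sentence]) -> Dict[str, float]:
    """Return UCM, LCM, UAS and LAS as percentages over a corpus."""
    n_sents = n_tokens = 0
    ucm = lcm = uas = las = 0
    for g_sent, p_sent in zip(gold, pred):
        assert len(g_sent) == len(p_sent), "gold and predicted sentences must align"
        # Per-token hits: head only (unlabeled) and head plus label (labeled).
        head_hits = [g[0] == p[0] for g, p in zip(g_sent, p_sent)]
        full_hits = [g == p for g, p in zip(g_sent, p_sent)]
        n_sents += 1
        n_tokens += len(g_sent)
        uas += sum(head_hits)
        las += sum(full_hits)
        # A sentence counts as a complete match only if every token is correct.
        ucm += all(head_hits)
        lcm += all(full_hits)
    return {
        "UCM": 100 * ucm / n_sents,
        "LCM": 100 * lcm / n_sents,
        "UAS": 100 * uas / n_tokens,
        "LAS": 100 * las / n_tokens,
    }


# Toy example: two sentences, one wrong label in the second.
gold = [[(2, "nsubj"), (0, "root")], [(0, "root"), (1, "obj")]]
pred = [[(2, "nsubj"), (0, "root")], [(0, "root"), (1, "nmod")]]
scores = attachment_scores(gold, pred)
print(scores)
```

In the toy example every head is correct, so only the labeled scores (LAS and LCM) drop.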

To train the parsers from scratch, the commands are:

python -m supar.cmds.crf_dependency   train -b -d 0 -p <path>/model -f char --train <train-file> --dev <dev-file> --test <test-file> --embed <embedding-file> --unk <unk-in-embeddings> --mbr --proj
python -m supar.cmds.crf2o_dependency train -b -d 0 -p <path>/model -f char --train <train-file> --dev <dev-file> --test <test-file> --embed <embedding-file> --unk <unk-in-embeddings> --mbr --proj 

Feel free to ask if you run into any issues.