stanfordnlp / stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
https://stanfordnlp.github.io/stanza/

Con oracle #1391

Closed AngledLuffa closed 1 month ago

AngledLuffa commented 1 month ago

Add an oracle for the in-order-compound transition scheme, along with extensive upgrades to the in-order oracle (although the accuracy gains are, of course, not earth-shattering). Also add some instrumentation, such as the ability to add dummy oracle transitions which log the misses in the in-order system.
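For readers unfamiliar with the in-order scheme, here is a minimal sketch of the gold transition sequence an in-order oracle is measured against. It is not the oracle code from this PR. The tree encoding, the `SHIFT` / `OPEN(label)` / `CLOSE` names, and the `first_miss` helper are all hypothetical and chosen only for illustration. The compound variant is not shown.

```python
from typing import List, Optional, Tuple, Union

# Hypothetical tree encoding: a leaf is a word (str),
# an internal node is a tuple (label, child, child, ...).
Tree = Union[str, Tuple]


def in_order_transitions(tree: Tree) -> List[str]:
    """Gold in-order sequence: first child, then open the parent,
    then the remaining children, then close the parent."""
    if isinstance(tree, str):
        return ["SHIFT"]
    label, *children = tree
    seq = in_order_transitions(children[0])
    seq.append(f"OPEN({label})")
    for child in children[1:]:
        seq.extend(in_order_transitions(child))
    seq.append("CLOSE")
    return seq


def first_miss(gold: List[str], predicted: List[str]) -> Optional[int]:
    """Index of the first position where the prediction leaves the gold
    sequence, or None if they agree. A dynamic oracle would take over
    from this point; this helper only locates the miss."""
    for i, (g, p) in enumerate(zip(gold, predicted)):
        if g != p:
            return i
    if len(gold) != len(predicted):
        return min(len(gold), len(predicted))
    return None


tree = ("S", ("NP", "She"), ("VP", "runs"))
gold = in_order_transitions(tree)
```

For the toy tree above, `gold` starts with `SHIFT`, `OPEN(NP)`, `CLOSE`, `OPEN(S)`: the first child is built before its parent is opened, which is what distinguishes in-order from top-down schemes.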

As incidental changes, include a script to go through the current formatting of conparse results (not expected to change) and some scripts to experiment with different data divisions for building the ensemble. Spoiler: none of the ensemble mechanisms actually improved results over just averaging 5 models together.
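As a rough illustration of the "just average the models" baseline, here is a sketch that averages per-transition scores across models and picks the best one. This assumes each model exposes a score per candidate transition. The dict-based interface and function names are hypothetical and are not the ensemble code in the repo.

```python
from typing import Dict, List


def average_scores(model_scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each transition's score over all models.
    Assumes every model scores the same set of transitions."""
    n = len(model_scores)
    return {
        name: sum(scores[name] for scores in model_scores) / n
        for name in model_scores[0]
    }


def pick_transition(model_scores: List[Dict[str, float]]) -> str:
    """Choose the transition with the highest averaged score."""
    averaged = average_scores(model_scores)
    return max(averaged, key=averaged.get)


# Two toy "models" that disagree on the best transition.
scores = [
    {"SHIFT": 1.0, "CLOSE": 3.0},
    {"SHIFT": 4.0, "CLOSE": 1.0},
]
choice = pick_transition(scores)
```

Here the first model prefers `CLOSE` and the second prefers `SHIFT`. The averaged scores favor `SHIFT`.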