stanfordnlp / stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
https://stanfordnlp.github.io/stanza/
Other
7.14k stars 880 forks

Con silver #1348

Closed: AngledLuffa closed this 4 months ago

AngledLuffa commented 4 months ago

Add some mechanisms for building and manipulating a silver dataset for the constituency parser. Filtering the trees by the number of parsers that produce a matching tree seems to give a better silver dataset, whereas filtering by variance does not. Will continue experimenting.
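
To illustrate the agreement filter described above, here is a minimal sketch. It is not the code from this PR: the function name `filter_by_agreement`, the `min_matches` parameter, and the representation of trees as bracketed strings are all assumptions made for the example.

```python
from collections import Counter


def filter_by_agreement(parses_per_sentence, min_matches=2):
    """Keep one tree per sentence when enough parsers agree on it.

    parses_per_sentence: list of lists; each inner list holds the trees
    (here, bracketed strings) that different parsers produced for one sentence.
    min_matches: hypothetical threshold on how many parsers must produce
    the same tree for it to enter the silver dataset.
    """
    silver = []
    for candidate_trees in parses_per_sentence:
        if not candidate_trees:
            # No parser produced output for this sentence
            continue
        # Most frequent tree and the number of parsers that produced it
        tree, votes = Counter(candidate_trees).most_common(1)[0]
        if votes >= min_matches:
            silver.append(tree)
    return silver


# Toy input: three parsers agree on the first sentence and disagree on the second
parses = [
    ["(S (NP I) (VP run))", "(S (NP I) (VP run))", "(S (NP I) (VP run))"],
    ["(S (NP a) (VP b))", "(S (X a) (VP b))", "(S (NP a) (Y b))"],
]
silver = filter_by_agreement(parses, min_matches=2)
```

Raising `min_matches` trades dataset size for agreement: fewer trees survive, but each one was produced by more parsers.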