stanfordnlp / stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
https://stanfordnlp.github.io/stanza/

Initial attempt to chop up long inputs to a transformer into pieces t… #1350

Closed AngledLuffa closed 7 months ago

AngledLuffa commented 7 months ago

Initial attempt to chop up long inputs to a transformer into pieces that the transformer can digest, even if it isn't necessarily going to give great results for the later tokens in the sentence. Addresses https://github.com/stanfordnlp/stanza/issues/1294
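
To illustrate the general idea, here is a minimal sketch of splitting an over-long token sequence into pieces no longer than the model's limit, encoding each piece separately, and concatenating the results. This is not the code from this PR: `chunk_spans`, `encode_long`, the `encode` callable, and `max_len` are all hypothetical names chosen for the example, and the sketch assumes the encoder returns one output per input token.

```python
from typing import Callable, List, Sequence, Tuple


def chunk_spans(n_tokens: int, max_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) spans covering n_tokens, each at most max_len long."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [(start, min(start + max_len, n_tokens))
            for start in range(0, n_tokens, max_len)]


def encode_long(token_ids: Sequence[int],
                encode: Callable[[Sequence[int]], list],
                max_len: int) -> list:
    """Encode a sequence of any length by encoding each piece independently.

    `encode` is a hypothetical stand-in for a transformer call; it is
    assumed to return exactly one output per input token.
    """
    outputs: list = []
    for start, end in chunk_spans(len(token_ids), max_len):
        # Each piece is encoded without seeing the other pieces.
        outputs.extend(encode(token_ids[start:end]))
    return outputs


if __name__ == "__main__":
    # Toy encoder that doubles each id, standing in for a real model.
    print(chunk_spans(10, 4))
    print(encode_long(list(range(10)), lambda ids: [i * 2 for i in ids], 4))
```

In this sketch every piece after the first is encoded without any context from the earlier pieces, which is one plausible reason results could be weaker for later tokens. Overlapping the pieces would be a possible refinement, though nothing here indicates whether this PR does that.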