Switch to Universal Dependencies (UD)

ahalterman commented 8 years ago

Petrarch2 currently codes only English-language articles parsed in the Stanford dependencies format. Other languages, including the Spanish and Arabic that we need to code, are parsed with different tags. We should consider switching Petrarch2's internals to Universal Dependencies and converting the input to Universal Dependencies as well. This would avoid having to maintain a separate Petrarch2 for each language.

CoreNLP itself has also switched to UD in the most recent versions: http://nlp.stanford.edu/software/stanford-dependencies.shtml

Information on Universal Dependencies is here: http://universaldependencies.org/

ahalterman commented 8 years ago

Based on various email conversations, it seems like UD is the way to go. This will consist of two tasks:

- [ ] Write a program for each language to convert language-specific tags to UD tags
- [ ] Re-do Petrarch2 internals to use UD
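
As a rough illustration of the first task, a per-language converter could start as a lookup table from the source tagset to UD part-of-speech (UPOS) tags. The sketch below is an assumption, not existing Petrarch2 code: the names `PTB_TO_UPOS` and `to_upos` are hypothetical, and the table covers only a few Penn Treebank tags for English. Spanish and Arabic parsers would need their own tables, and tags whose UD category depends on context (for example, auxiliaries, which UD tags as `AUX`) would need more than a flat lookup.

```python
# Hypothetical, incomplete mapping from Penn Treebank POS tags to UD UPOS tags.
# Context-dependent cases (e.g. auxiliary verbs, which UD tags as AUX) are
# deliberately left out; a real converter would need rules for them.
PTB_TO_UPOS = {
    "NN": "NOUN", "NNS": "NOUN",
    "NNP": "PROPN", "NNPS": "PROPN",
    "VB": "VERB", "VBD": "VERB", "VBZ": "VERB",
    "JJ": "ADJ", "RB": "ADV", "DT": "DET",
    "PRP": "PRON", "CD": "NUM", "CC": "CCONJ", "UH": "INTJ",
}


def to_upos(tagged_tokens, mapping=PTB_TO_UPOS, unknown="X"):
    """Convert (word, source_tag) pairs to (word, UPOS) pairs.

    Tags missing from the mapping fall back to the UD tag "X" (other),
    so gaps in the table stay visible instead of raising an error.
    """
    return [(word, mapping.get(tag, unknown)) for word, tag in tagged_tokens]


sentence = [("Rebels", "NNS"), ("attacked", "VBD"), ("Aleppo", "NNP")]
print(to_upos(sentence))
```

Keeping the table as plain data means each language only needs its own dictionary, while the conversion function stays shared.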

johnb30 commented 7 years ago

Closing because this is a new, separate project: https://github.com/openeventdata/UniversalPetrarch
