Open GeorgeS2019 opened 8 months ago
The data used to train the Stanza tagger was ud-treebanks-v2.13/UD_German-GSD/de_gsd-ud-train.conllu, where keine is treated as DET.
The CoreNLP tagger has not been retrained since UD 2.4, where the standard was to treat keine as PRON.
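To check how a given treebank release tags a word form, the UPOS column of the training file can be tallied directly. The following is a minimal sketch, not part of either toolkit: the class name ConlluTagCount and the method countUpos are hypothetical, and it assumes only the standard CoNLL-U layout (tab-separated columns, FORM in column 2, UPOS in column 4, comment lines starting with #).

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: tallies the UPOS tags given to one word form in CoNLL-U data.
public class ConlluTagCount {

    public static Map<String, Integer> countUpos(List<String> lines, String form) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // Skip sentence-level comments and the blank lines between sentences.
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] cols = line.split("\t");
            if (cols.length < 4) {
                continue;
            }
            // Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
            if (cols[0].contains("-") || cols[0].contains(".")) {
                continue;
            }
            // Column 2 is FORM, column 4 is UPOS; match the form case-insensitively.
            if (cols[1].equalsIgnoreCase(form)) {
                counts.merge(cols[3], 1, Integer::sum);
            }
        }
        return counts;
    }

    // Usage: java ConlluTagCount <path to .conllu file> <word form>
    public static void main(String[] args) throws Exception {
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(countUpos(lines, args[1]));
    }
}
```

Running it once on the de_gsd-ud-train.conllu file from each UD release, with keine as the word form, would show how the tag distribution differs between the two versions.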
Retraining taggers with updated data is less of a hassle than the general feature adds you've been requesting, so we'll put updated data for some of those models on the list.
@AngledLuffa
I have tried to contact @manning through LinkedIn regarding CoreNLP 4.5.6, with a specific interest in the 4.5.6 German model.
@AngledLuffa
I also have an issue with the dependency parsing results. Hopefully, this will go away once the German POS assignment is correct.
@AngledLuffa I am comparing the CoreNLP German output, obtained through code, with that of Stanza. I understand that the online CoreNLP demo is no longer running. It will take a few extra steps to compare CoreNLP 4.5.6 with the latest Stanza.
I mean, you'd probably have better luck just running these things locally and looking at the results, but thank you for informing us of the demo program's demise. I have kicked it.
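Once both tools have been run locally, the per-token tags can be compared mechanically. This is only a sketch under stated assumptions: TagDiff and its diff method are hypothetical names, the tags are assumed to have already been exported from each tool as plain lists, and both tools are assumed to produce the same tokenization for the sentence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the tokens on which two taggers disagree.
public class TagDiff {

    public static List<String> diff(List<String> tokens, List<String> tagsA, List<String> tagsB) {
        // The comparison is positional, so all three lists must be aligned.
        if (tokens.size() != tagsA.size() || tokens.size() != tagsB.size()) {
            throw new IllegalArgumentException("token and tag lists must have the same length");
        }
        List<String> disagreements = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (!tagsA.get(i).equals(tagsB.get(i))) {
                // One tab-separated row per disagreement: token, tag from A, tag from B.
                disagreements.add(tokens.get(i) + "\t" + tagsA.get(i) + "\t" + tagsB.get(i));
            }
        }
        return disagreements;
    }
}
```

If the two tools split a sentence into different tokens, the lists would need to be aligned first; that step is not covered here.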
@AngledLuffa
Does the German parser in CoreNLP support XPOS? I can only find UPOS.
props.setProperty("annotators", "tokenize, ssplit, mwt, pos, lemma, ner, depparse");
Correct, UPOS only
Stanza tags keine as DET. CoreNLP 4.5.6 (with the corresponding 4.5.6 German model) tags keine as PRON.