stanfordnlp / stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
https://stanfordnlp.github.io/stanza/

[QUESTION] Language-specific CoreNLPClient with custom properties #931

Open pleyad opened 2 years ago

pleyad commented 2 years ago

I am trying to create a CoreNLPClient instance based on the German models, but with the property tokenize.whitespace = true. It should perform POS tagging on German text.

The documentation (https://stanfordnlp.github.io/stanza/client_properties.html#corenlp-server-start-options-pipeline) says the constructor accepts either a string naming the language or a dictionary of properties, but not both at the same time.

Am I missing something, or is there a way to specify both the language models and custom properties?

AngledLuffa commented 2 years ago

It certainly appears there is no way to mix the two kinds of properties using just the Python interface. What you can do is look in the models jar file for the StanfordCoreNLP-german.properties file, extract it using the jar command, then make your own copy with the desired changes. Does that work for you?
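For anyone who would rather script this than run the jar command by hand, here is a minimal sketch of the same idea in Python. It relies on a jar being an ordinary zip archive. The helper name `write_custom_properties` is made up for illustration, and the sketch assumes that the properties file sits at the top level of the German models jar and is UTF-8 (or plain ASCII) text; check both against your own download.

```python
import zipfile
from pathlib import Path


def write_custom_properties(models_jar, member, overrides, out_path):
    """Copy a .properties file out of a jar and append overrides to it.

    Hypothetical helper, not part of stanza or CoreNLP.
    """
    # A jar is a zip archive, so the standard zipfile module can read it.
    with zipfile.ZipFile(models_jar) as jar:
        # Assumption: the file is UTF-8 (or plain ASCII) text.
        base = jar.read(member).decode("utf-8")

    lines = base.splitlines()
    lines.append("")
    lines.append("# local overrides")
    # Overrides go last: when Java loads a properties file, a later
    # definition of a key replaces an earlier one.
    for key, value in overrides.items():
        lines.append(f"{key} = {value}")

    out = Path(out_path)
    out.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out
```

Calling it with the path to your German models jar, the member name StanfordCoreNLP-german.properties, and {"tokenize.whitespace": "true"} should give you a file with the German defaults plus the whitespace tokenizer setting. The linked documentation page also lists a properties file path as an accepted value for the client's properties argument, so passing the new file's path there is the obvious next thing to try; I have not verified it against a running server.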

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 2 years ago

This issue has been automatically closed due to inactivity.