stanfordnlp / CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
http://stanfordnlp.github.io/CoreNLP/
GNU General Public License v3.0

Annotate function is very slow when given a misspelled property #1109

Closed eliasyin closed 3 years ago

eliasyin commented 3 years ago

In my case, if I pass a misspelled property name, the annotate function becomes very slow. I think detection of invalid property names should be added.

import re
from stanfordcorenlp import StanfordCoreNLP
nlps = StanfordCoreNLP(
    r'./stanford-corenlp-full-2018-02-27')
text = 'golf and rugby union sevens get the nod for rio olympics in 2016 : • tiger woods says he hopes to compete in bra . .'
text = re.sub(r'\. ', ' .', text).strip()
text = re.sub(r' {2,}', ' ', text)
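nlp_properties = {
    # Misspelled key that triggers the slowdown; the correct key is 'annotators'.
    'annotatiors': "depparse",
    'tokenize.whitespace': True,
    'ssplit.isOneSentence': False,
    # Also misspelled in this report; the CoreNLP property is 'outputFormat'.
    'outputFromat': 'json'
}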
nlps.annotate(text.strip(), nlp_properties)

AngledLuffa commented 3 years ago

It's non-trivial to add error checking, since user-defined annotators could always need otherwise unknown properties.
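
Given that constraint, one option is a client-side check that only warns about keys that look like typos of well-known property names and lets everything else through. The following is a minimal sketch, not part of CoreNLP or the stanfordcorenlp wrapper: the helper `find_suspect_properties`, the `KNOWN_PROPERTIES` list, and the 0.8 similarity cutoff are all hypothetical choices made for illustration.

```python
import difflib

# Assumed, non-exhaustive list of property names; extend it for your own pipeline.
KNOWN_PROPERTIES = [
    "annotators",
    "outputFormat",
    "tokenize.whitespace",
    "ssplit.isOneSentence",
]


def find_suspect_properties(properties, known=KNOWN_PROPERTIES, cutoff=0.8):
    """Map each unrecognized key to the known name it most resembles.

    Keys with no close match are left out, because they may be
    legitimate properties of a custom annotator.
    """
    suspects = {}
    for key in properties:
        if key in known:
            continue
        matches = difflib.get_close_matches(key, known, n=1, cutoff=cutoff)
        if matches:
            suspects[key] = matches[0]
    return suspects


# The property dict from the report, with both misspelled keys.
nlp_properties = {
    "annotatiors": "depparse",
    "tokenize.whitespace": True,
    "ssplit.isOneSentence": False,
    "outputFromat": "json",
}

# Warn instead of raising, so unknown-but-valid keys never block a run.
for wrong, suggestion in find_suspect_properties(nlp_properties).items():
    print(f"Warning: unknown property {wrong!r}; did you mean {suggestion!r}?")
```

Running the check before calling `annotate` would flag both misspelled keys from the report, while a key that resembles nothing in the list passes silently.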

On Thu, Nov 12, 2020 at 12:12 AM eliasyin notifications@github.com wrote:

In my case, if I pass a misspelled property name, the annotate function becomes very slow. I think detection of invalid property names should be added.


eliasyin commented 3 years ago

fine