Open bermeitinger-b opened 7 years ago
Are you talking about this: https://stanfordnlp.github.io/CoreNLP/corenlp-server.html ?
Yes. PyCobalt currently uses CoreNLP for POS tagging and NER, and running it is clunky: starting the CoreNLP server is cumbersome even with Docker. With spaCy, everything runs directly in Python. This benchmark shows that spaCy is faster, although its NER is slightly worse. Without CoreNLP, PyCobalt could be published as a "simple" Python module.
Great move, I think spaCy will be much better than CoreNLP. I am eagerly waiting for this update. Please let me know if you need any help.
I'm sorry for raising expectations about the implementation and the timeline. This issue was meant as a reminder for myself, for when I have time in the future. It won't be resolved this month or next. We would be happy to accept a pull request, though.
Starting the CoreNLP server is not nice for anyone: it is big, relatively slow, and a bit clunky to use. The other options are spaCy and nltk.
First experiments show that nltk's Named Entity Recognition is not very accurate and its sentence splitter is worse than CoreNLP's. The next choice is spaCy, which shows nice results in simple experiments. Before we implement, we have to check the following: