louismullie / stanford-core-nlp

Ruby bindings to the Stanford Core NLP tools (English, French, German).

Write instructions on how to use the gem with the latest version of the Stanford CoreNLP. #25

Closed mgurley closed 10 years ago

mgurley commented 10 years ago

Version 3.3.1 of Stanford CoreNLP has an option, 'ssplit.newlineIsSentenceBreak', that can be set on the ssplit annotator (WordToSentenceAnnotator). See http://nlp.stanford.edu/software/corenlp.shtml. This option is not available in the version of Stanford CoreNLP in the packages linked from the README. Using it requires two things: being able to set custom options on an annotator (see #20), and being able to use the latest version of Stanford CoreNLP.
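To make the option above concrete, here is a minimal sketch of how it looks as a CoreNLP property. It only renders a Hash in Java `.properties` syntax, the form CoreNLP itself accepts through its `-props` command-line flag; it does not show how the gem passes options through. The helper name `render_corenlp_properties` is hypothetical, and `'always'` is assumed here to be the value wanted (it is one of the values listed in the CoreNLP documentation for this option).

```ruby
# Hypothetical helper (not part of the gem): renders a Hash of CoreNLP
# properties as text in Java .properties syntax, one "key = value" per line.
def render_corenlp_properties(props)
  props.map { |key, value| "#{key} = #{value}" }.join("\n") + "\n"
end

# Assumed settings for illustration: tokenize, then split sentences,
# treating every newline as a sentence break.
example_props = {
  'annotators' => 'tokenize, ssplit',
  'ssplit.newlineIsSentenceBreak' => 'always'
}

puts render_corenlp_properties(example_props)
```

The rendered text could be saved to a file and handed to a CoreNLP release that supports the option.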

louismullie commented 10 years ago

Is this included in your PR?

mgurley commented 10 years ago

Yes, I added it to the bottom of the README file beneath the Testing section. Let me know if anything should be changed.

Also, I have written a rake task for my own project that downloads the latest version of the Stanford CoreNLP library, unzips it, and downloads your bridge.jar file. The rake task has a hard-coded destination directory, but that could be made configurable via an argument. It also does not download any non-English POS taggers. Let me know if you have any interest in adding something similar to this project, so that you would not need to host the packages yourself. Here is a link to the rake task: https://github.com/NUBIC/abstractor/blob/master/lib/tasks/abstractor_tasks.rake
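As a rough illustration of the "configurable via an argument" idea, here is a minimal sketch and not the linked rake task itself. The helper names `corenlp_install_plan` and `download_entry` are hypothetical, and the URLs are deliberately left as parameters to be supplied by the caller rather than guessed. Unzipping is left out because it would need an extra dependency.

```ruby
require 'uri'
require 'open-uri'
require 'fileutils'

# Hypothetical helper: given a destination directory and the two download
# URLs (supplied by the caller), work out where each file would be saved.
def corenlp_install_plan(dest_dir:, corenlp_zip_url:, bridge_jar_url:)
  zip_name = File.basename(URI.parse(corenlp_zip_url).path)
  jar_name = File.basename(URI.parse(bridge_jar_url).path)
  {
    dest_dir: dest_dir,
    zip: { url: corenlp_zip_url, path: File.join(dest_dir, zip_name) },
    bridge_jar: { url: bridge_jar_url, path: File.join(dest_dir, jar_name) }
  }
end

# Hypothetical helper: download one planned entry to its path.
# Defined for completeness; nothing in this sketch calls it.
def download_entry(entry)
  FileUtils.mkdir_p(File.dirname(entry[:path]))
  URI.open(entry[:url]) do |remote|
    File.open(entry[:path], 'wb') { |local| IO.copy_stream(remote, local) }
  end
end
```

Inside a rake task, the destination could come from a task argument (for example `task :setup, [:dest_dir]`) and be passed straight to `corenlp_install_plan`, which keeps the path logic testable without touching the network.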

louismullie commented 10 years ago

That rake task would be great to have.