christianscheible / qsample

A natural language processing tool for automatically detecting quotations in text.
http://www.ims.uni-stuttgart.de/data/qsample

Compatibility Issue #4

Open iknoorjobs opened 4 years ago

iknoorjobs commented 4 years ago

Is this compatible with UNIX only? I am not able to run it on Windows; it shows path syntax problems. How can I solve this?

Thanks

iknoorjobs commented 4 years ago

Hello!

I was able to build it successfully on Ubuntu, but now it shows this error:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at edu.stanford.nlp.ling.tokensregex.parser.TokenSequenceParser.<init>(TokenSequenceParser.java:3446)
    at edu.stanford.nlp.ling.tokensregex.TokenSequencePattern.getNewEnv(TokenSequencePattern.java:158)
    at edu.stanford.nlp.pipeline.TokensRegexNERAnnotator.createPatternMatcher(TokensRegexNERAnnotator.java:365)
    at edu.stanford.nlp.pipeline.TokensRegexNERAnnotator.<init>(TokensRegexNERAnnotator.java:317)
    at edu.stanford.nlp.pipeline.NERCombinerAnnotator.setUpFineGrainedNER(NERCombinerAnnotator.java:220)
    at edu.stanford.nlp.pipeline.NERCombinerAnnotator.<init>(NERCombinerAnnotator.java:145)
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.ner(AnnotatorImplementations.java:68)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$5(StanfordCoreNLP.java:523)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$$Lambda$35/0x0000000840085040.apply(Unknown Source)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$null$30(StanfordCoreNLP.java:602)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$$Lambda$57/0x000000084009a040.get(Unknown Source)
    at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:126)
    at edu.stanford.nlp.util.Lazy.get(Lazy.java:31)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:149)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:251)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:192)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:188)
    at ims.cs.corenlp.PARCCoreNlpPipeline.setUpPipeline(PARCCoreNlpPipeline.java:215)
    at ims.cs.corenlp.PARCCoreNlpPipeline.parseDocumentFromRaw(PARCCoreNlpPipeline.java:228)
    at ims.cs.corenlp.PARCCoreNlpPipeline.access$000(PARCCoreNlpPipeline.java:52)
    at ims.cs.corenlp.PARCCoreNlpPipeline$PARCCoreNlpDocumentIterator.next(PARCCoreNlpPipeline.java:101)
    at ims.cs.corenlp.PARCCoreNlpPipeline$PARCCoreNlpDocumentIterator.next(PARCCoreNlpPipeline.java:63)
    at ims.cs.parc.ProcessedCorpus.transformDocumentList(ProcessedCorpus.java:67)
    at ims.cs.parc.ProcessedCorpus.getTest(ProcessedCorpus.java:87)
    at ims.cs.qsample.run.QSample.main(QSample.java:175)

Since it's throwing a Java OutOfMemoryError, I even tried reducing the size of the documents in the example/documents directory, but it still wouldn't run.

Please help. Thanks

IonutCiuta commented 4 years ago

Try to change your JVM memory configuration when running it. You can read more here: https://alvinalexander.com/blog/post/java/java-xmx-xms-memory-heap-size-control

I'd start with the heap size.
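
For example, something like the following should work; the heap values here are only illustrative, so adjust them to your machine:

java -Xms1g -Xmx4g -jar target/qsample-0.1-jar-with-dependencies.jar --sample example/documents/ output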

iknoorjobs commented 4 years ago

Fixed it. Thank you so much.

iknoorjobs commented 4 years ago

Hi,

After running the command:

java -jar target/qsample-0.1-jar-with-dependencies.jar --sample example/documents/ output

It runs very slowly before producing the output. Since it uses a pre-trained model in the backend, is there any way to speed up the processing and get the results more quickly?

P.S. I guess CoreNLP is taking most of the time to annotate the text?

Thanks

igormis commented 3 years ago

Regarding this, is it possible to run it as a service with the model already loaded, and only infer the quotations for input text (not a file)?