Hi there
This is likely because the n-gram extractor produces a much larger number of
candidate terms, so 1 GB of memory may not be enough. Can you try one thing:
In AlgorithmTester, lines 70-76 are:
------------------
//Three CandidateTermExtractor are implemented:
//1. An OpenNLP noun phrase extractor that extracts noun phrases as candidate terms
//CandidateTermExtractor npextractor = new NounPhraseExtractorOpenNLP(stop, lemmatizer);
//2. A generic N-gram extractor that extracts n(default is 5, see the property file) grams
CandidateTermExtractor npextractor = new NGramExtractor(stop, lemmatizer);
//3. A word extractor that extracts single words as candidate terms.
//CandidateTermExtractor wordextractor = new WordExtractor(stop, lemmatizer);
------------------
Comment out the NGramExtractor line (option 2) and uncomment the noun phrase extractor line (option 1) instead.
If that fixes the problem, the cause is most likely insufficient allocated memory.
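To confirm how much heap the JVM is actually allowed to use, a small standalone check like the one below may help. It is only a sketch: the class name HeapCheck and its methods are hypothetical and not part of the toolkit, and it relies solely on the standard java.lang.Runtime API.

```java
// Hypothetical helper, not part of the toolkit: reports the JVM heap limit.
class HeapCheck {
    private static final long BYTES_PER_MB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / BYTES_PER_MB;
    }

    /** Heap currently in use (allocated minus free), in megabytes. */
    static long usedHeapMegabytes() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / BYTES_PER_MB;
    }

    public static void main(String[] args) {
        // Run with the same JVM options as the term extraction job
        // so the reported limit matches what that job actually gets.
        System.out.println("Max heap (MB):  " + maxHeapMegabytes());
        System.out.println("Used heap (MB): " + usedHeapMegabytes());
    }
}
```

If the reported maximum is around 1 GB, it may be worth rerunning the n-gram extractor with a larger limit through the standard -Xmx JVM option, for example java -Xmx2g (the 2g value is only an illustration, not a tested recommendation).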
Original comment by ziqizhan...@googlemail.com
on 12 Jun 2012 at 5:44
Issue closed
Original comment by ziqizhan...@googlemail.com
on 25 Jul 2013 at 10:05
Original issue reported on code.google.com by
ss401...@gmail.com
on 12 Jun 2012 at 4:46