What steps will reproduce the problem?
(using ver. 1.2.0)
1. Parse the HTML at "http://worldwidescience.org/topicpages/s.html".
ArticleExtractor is sufficient for demonstration purposes.
Even with 8 GB of JVM memory, this results in an OutOfMemoryError.
Attached is a patch that allows limiting the number of TextBlocks
created/appended by boilerpipe. Once that limit is reached, boilerpipe
ignores all further content from the parsed input.
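
The idea behind the limit can be sketched roughly as follows. This is not the attached patch and it does not use boilerpipe's classes: `BoundedBlockCollector`, `offer` and `maxBlocks` are hypothetical names chosen for illustration, and plain strings stand in for TextBlocks. It only shows the behaviour described above, where blocks are kept up to a fixed count and everything after that is dropped.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical stand-in for the component that appends text blocks.
 * It keeps at most maxBlocks entries and silently drops the rest.
 */
class BoundedBlockCollector {
    private final int maxBlocks;
    private final List<String> blocks = new ArrayList<>();
    private boolean limitReached = false;

    BoundedBlockCollector(int maxBlocks) {
        if (maxBlocks < 0) {
            throw new IllegalArgumentException("maxBlocks must be >= 0");
        }
        this.maxBlocks = maxBlocks;
    }

    /** Stores the block and returns true, or drops it and returns false once the limit is hit. */
    boolean offer(String blockText) {
        if (blocks.size() >= maxBlocks) {
            limitReached = true; // remember that input was truncated
            return false;
        }
        blocks.add(blockText);
        return true;
    }

    /** True if at least one block was dropped because of the limit. */
    boolean isLimitReached() {
        return limitReached;
    }

    /** Read-only view of the blocks that were kept. */
    List<String> getBlocks() {
        return Collections.unmodifiableList(blocks);
    }
}
```

With a limit of 2, for example, the first two calls to `offer` store their blocks and every later call is dropped, so memory use stays bounded by the limit rather than by the size of the input. A caller can check `isLimitReached()` to find out whether the result was truncated.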
Original issue reported on code.google.com by mstr...@gmail.com on 25 Nov 2013 at 4:29