mohankreddy / crawler4j

Automatically exported from code.google.com/p/crawler4j

EnvironmentFailureException #65

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I am crawling all the links on a particular page. My application worked fine
at first and crawled all the links, but after some time I started getting this
error continuously. Any suggestions as to what these errors mean and why they
are happening?

com.sleepycat.je.EnvironmentFailureException: (JE 4.0.71) JAVA_ERROR: Java Error occurred, recovery may not be possible.
        at com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1387)
        at com.sleepycat.je.Database.checkEnv(Database.java:1766)
        at com.sleepycat.je.Database.get(Database.java:873)
        at edu.uci.ics.crawler4j.frontier.DocIDServer.getDocID(DocIDServer.java:71)
        at edu.uci.ics.crawler4j.crawler.WebCrawler.processPage(WebCrawler.java:179)
        at edu.uci.ics.crawler4j.crawler.WebCrawler.run(WebCrawler.java:108)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
        at com.sleepycat.je.tree.Key.makeKey(Key.java:109)
        at com.sleepycat.je.dbi.CursorImpl.searchAndPosition(CursorImpl.java:2045)
        at com.sleepycat.je.Cursor.searchInternal(Cursor.java:2088)
        at com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:2058)
        at com.sleepycat.je.Cursor.search(Cursor.java:1926)
        at com.sleepycat.je.Database.get(Database.java:897)
        ... 4 more

Original issue reported on code.google.com by jamalrai...@gmail.com on 1 Aug 2011 at 6:08

GoogleCodeExporter commented 9 years ago
As the stack trace says, this is an out-of-memory problem. You can try
increasing the heap space with the -Xmx option.

Original comment by ganjisaffar@gmail.com on 1 Aug 2011 at 6:22
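To confirm that a larger -Xmx value actually reached the JVM running the crawler, one option is to print the maximum heap at startup. The following is only a minimal sketch: the class name `HeapCheck`, the helper `maxHeapMiB`, and the `1g` value in the comment are illustrative assumptions, not part of crawler4j or of this thread.

```java
// Hypothetical helper, not part of crawler4j: reports the heap ceiling
// that the running JVM was started with.
class HeapCheck {

    /** Returns the maximum amount of memory the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Launch with, for example:  java -Xmx1g HeapCheck
        // (1g is an illustrative value; pick one that fits the machine.)
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

If the printed value does not change after editing the launch command, the option is probably being passed to a different JVM (for example, an IDE run configuration rather than the command line).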

GoogleCodeExporter commented 9 years ago
Thanks for the reply. I have increased the heap space. Would increasing
setPolitenessDelay from 200 to 1000 also have any effect?
/*
 * Be polite:
 * Make sure that we don't send more than 5 requests per
 * second (200 milliseconds between requests).
 */
controller.setPolitenessDelay(200);

Original comment by jamalrai...@gmail.com on 1 Aug 2011 at 6:30

GoogleCodeExporter commented 9 years ago
No, that is just the delay between sending new requests.

Original comment by ganjisaffar@gmail.com on 1 Aug 2011 at 6:53
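The arithmetic behind the code comment quoted above (200 milliseconds between requests means at most 5 requests per second) can be sketched as follows. The class `PolitenessRate` and its method are hypothetical names used only for illustration; they are not crawler4j APIs.

```java
// Hypothetical illustration, not part of crawler4j: converts a delay
// between requests into the resulting upper bound on request rate.
class PolitenessRate {

    /** Upper bound on requests per second for a given delay between requests. */
    static double maxRequestsPerSecond(int delayMillis) {
        if (delayMillis <= 0) {
            throw new IllegalArgumentException("delayMillis must be positive");
        }
        return 1000.0 / delayMillis;
    }

    public static void main(String[] args) {
        System.out.println(maxRequestsPerSecond(200));   // prints 5.0
        System.out.println(maxRequestsPerSecond(1000));  // prints 1.0
    }
}
```

Raising the delay from 200 to 1000 therefore lowers the cap from five requests per second to one.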

GoogleCodeExporter commented 9 years ago
Although this is not a Berkeley DB Java Edition problem per se (it looks more
like a GC or JVM configuration problem), you may be able to get help by posting
to the Berkeley DB Java Edition OTN forum here: http://bit.ly/e1AYFi

Original comment by dsegl...@gmail.com on 7 Aug 2011 at 11:48