Crawling fails with "This store is closed [1.4.196/4]" error message

SaschaHeyer commented 5 years ago

Hi Pascal,

we have a long crawling process which takes approx. 4 days to complete a single crawl run. Today we got the following error message and the crawler stopped crawling.

Any idea what might cause this error? On what occasions does the store usually get closed?

Error message:

java.lang.IllegalStateException: This store is closed [1.4.196/4] 
at org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:765) 
at org.h2.mvstore.MVStore.checkOpen(MVStore.java:2409) 
at org.h2.mvstore.MVStore.getChunkIfFound(MVStore.java:949) 
at org.h2.mvstore.MVStore.getChunk(MVStore.java:935) 
at org.h2.mvstore.MVStore.readPage(MVStore.java:1943) 
at org.h2.mvstore.MVMap.readPage(MVMap.java:741) 
at org.h2.mvstore.Page.getChildPage(Page.java:217) 
at org.h2.mvstore.MVMap.binarySearch(MVMap.java:473) 
at org.h2.mvstore.MVMap.get(MVMap.java:455) 
at org.h2.mvstore.MVMap.containsKey(MVMap.java:484) 
at com.norconex.collector.core.data.store.impl.mvstore.MVStoreCrawlDataStore.isActive(MVStoreCrawlDataStore.java:135) 
at com.norconex.collector.core.pipeline.queue.QueueReferenceStage.execute(QueueReferenceStage.java:50) 
at com.norconex.collector.core.pipeline.queue.QueueReferenceStage.execute(QueueReferenceStage.java:29) 
at com.norconex.commons.lang.pipeline.Pipeline.execute(Pipeline.java:91) 
at com.norconex.collector.http.pipeline.importer.LinkExtractorStage.queueURL(LinkExtractorStage.java:159) 
at com.norconex.collector.http.pipeline.importer.LinkExtractorStage.executeStage(LinkExtractorStage.java:90) 
at com.norconex.collector.http.pipeline.importer.AbstractImporterStage.execute(AbstractImporterStage.java:31)

Best regards Sascha

essiembre commented 5 years ago

Have you tried that crawl of yours with the fix detailed in https://github.com/Norconex/collector-http/issues/620#issuecomment-516183396? It looks related.

You should try with the autoCompactFillRate flag, but you should also check that you have the latest H2 jar (and that the older jar is no longer in your classpath).

Your error suggests you are using H2/MVStore 1.4.196 while the latest snapshot upgrades the H2 dependency and uses 1.4.199.
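One way to check for a leftover H2 jar is to list the jar files in the collector's library folder and flag every file that looks like an H2 or MVStore jar. The following is only a minimal sketch: the class name `H2JarCheck`, the method `findH2Jars`, and the file-name pattern are hypothetical, and the folder to scan is passed as an argument because the exact install layout is an assumption.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Norconex or H2.
class H2JarCheck {
    // Assumed naming convention: h2-<version>.jar or h2-mvstore-<version>.jar.
    private static final Pattern H2_JAR =
            Pattern.compile("h2(-mvstore)?-\\d[\\w.\\-]*\\.jar");

    // Returns the entries whose file name looks like an H2/MVStore jar.
    static List<String> findH2Jars(List<String> paths) {
        List<String> found = new ArrayList<>();
        for (String path : paths) {
            String name = path.replace('\\', '/');
            name = name.substring(name.lastIndexOf('/') + 1);
            if (H2_JAR.matcher(name).matches()) {
                found.add(path);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // args[0] is the folder holding the collector's jars (an assumption).
        String[] names = new File(args[0]).list();
        if (names == null) {
            System.err.println("Not a readable directory: " + args[0]);
            return;
        }
        List<String> jars = findH2Jars(Arrays.asList(names));
        jars.forEach(System.out::println);
        if (jars.size() > 1) {
            System.out.println("More than one H2 jar found; remove the older one.");
        }
    }
}
```

If more than one entry is printed (for example both a 1.4.196 and a 1.4.199 jar), the older one may still be the one loaded at runtime.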

SaschaHeyer commented 5 years ago

Hi Pascal,

yes, we already applied the autoCompactFillRate flag and are using the latest snapshot.

Since then, the error message also shows the 1.4.199 version:

java.lang.IllegalStateException: This store is closed [1.4.199/4]

I assume this issue might be related to google-cloudsearch/norconex-committer-plugin#15.

Best regards Sascha

essiembre commented 5 years ago

Hello @SaschaHeyer, I had a look at the thread at https://github.com/google-cloudsearch/norconex-committer-plugin/issues/15 and I saw you provided easy steps to reproduce. I will give it a try and let you know. Since you reproduced the behavior with different Committers, it definitely looks like a bug to me.

essiembre commented 5 years ago

I am having trouble reproducing.

Can you tell if the lock is from Java, or maybe an OS file lock (which is not rare on Windows)? To find out, is it possible for you to set a breakpoint where it fails, in combination with this "file leak detector tool": https://file-leak-detector.kohsuke.org/ (or a similar tool)?

srinicodebytes commented 5 years ago

Hi Pascal,

As mentioned in the linked issue google-cloudsearch/norconex-committer-plugin#15, the "The file is locked:" error happens when another Norconex instance is started while the committer is still processing the files.

The trick is to let the committer process for a long enough time (e.g. 10 sec or more), so that another Norconex instance can be started. This can be done in either of two ways:

1. Have a large number of files for processing so that the committer takes 10-15 sec on the test environment.
2. Introduce a delay in the committer for this testing. For example, introduce a delay of 10 sec right above https://github.com/Norconex/committer-sql/blob/master/norconex-committer-sql/src/main/java/com/norconex/committer/sql/SQLCommitter.java#L465.

Hope this helps in reproducing the issue in SQL Committer.

Best Regards, Srinivas.

essiembre commented 5 years ago

Update: I had a hard time reproducing, but I was finally able to with any Committer, by starting two instances of the same HTTP Collector at (roughly) the same time. I am investigating further.

essiembre commented 5 years ago

Try the latest HTTP Collector snapshot. The fix is in the norconex-jef dependency. I introduced better detection of a running instance, which prevents the type of overlap exception encountered. Please confirm.
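For readers wondering what "detection of a running instance" can look like in general, here is a minimal sketch based on an exclusive file lock from `java.nio`. It is not the norconex-jef implementation, which is not shown in this thread; the class name `SingleInstanceGuard` and the lock file location are hypothetical.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical guard, not the actual norconex-jef code.
class SingleInstanceGuard implements AutoCloseable {
    private final FileChannel channel;
    private final FileLock lock;

    private SingleInstanceGuard(FileChannel channel, FileLock lock) {
        this.channel = channel;
        this.lock = lock;
    }

    // Returns a guard if the lock was obtained, or null if the lock
    // file is already held (by another process or by this JVM).
    static SingleInstanceGuard tryAcquire(Path lockFile) throws IOException {
        FileChannel ch = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock l;
        try {
            l = ch.tryLock(); // null when another process holds the lock
        } catch (OverlappingFileLockException e) {
            l = null; // this JVM already holds a lock on the file
        }
        if (l == null) {
            ch.close();
            return null;
        }
        return new SingleInstanceGuard(ch, l);
    }

    @Override
    public void close() throws IOException {
        lock.release();
        channel.close();
    }
}
```

A launcher could call `tryAcquire` on a lock file next to the crawl store before opening it, and refuse to start when `null` comes back. File locks are held on behalf of the whole JVM, so a guard like this is meant to detect a second process rather than a second thread.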

SaschaHeyer commented 4 years ago

Hi Pascal,

we plan to test the new snapshot next week, I will let you know. Thanks a lot.

Norconex / crawlers

Crawling fails with "This store is closed [1.4.196/4]" error message #634