Closed SaschaHeyer closed 4 years ago
Have you tried that crawl of yours with the fix detailed in https://github.com/Norconex/collector-http/issues/620#issuecomment-516183396? It looks related.
You should try with the autoCompactFillRate flag, but you should also check that you have the latest H2 jar (and that you no longer have the older jar in your classpath). Your error suggests you are using H2/MVStore 1.4.196, while the latest snapshot upgrades the H2 dependency and uses 1.4.199.
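One way to check for a leftover H2 jar is to scan the JVM classpath for file names that look like H2 jars. The following is only a minimal sketch: the class name H2JarCheck and the assumed file naming pattern (h2-<version>.jar) are illustrative, and it assumes the java.class.path property lists individual jar files.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Norconex or H2.
class H2JarCheck {
    // Assumed naming convention: h2-1.4.196.jar, h2-1.4.199.jar, etc.
    private static final Pattern H2_JAR = Pattern.compile("h2-\\d[\\w.-]*\\.jar");

    // Returns every classpath entry whose file name looks like an H2 jar.
    static List<String> findH2Jars(String classpath) {
        List<String> hits = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            String name = new File(entry).getName();
            if (H2_JAR.matcher(name).matches()) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> jars = findH2Jars(System.getProperty("java.class.path"));
        jars.forEach(System.out::println);
        if (jars.size() > 1) {
            System.out.println("More than one H2 jar found; remove the older one.");
        }
    }
}
```

Running it with the same classpath as the collector should print a single H2 jar if the older one has been removed.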
Hi Pascal,
yes, we already applied the autoCompactFillRate flag and are using the latest snapshot.
Since then, the error message also shows version 1.4.199:
java.lang.IllegalStateException: This store is closed [1.4.199/4]
I assume this issue might be related to google-cloudsearch/norconex-committer-plugin#15.
Best regards Sascha
Hello @SaschaHeyer, I had a look at the thread at https://github.com/google-cloudsearch/norconex-committer-plugin/issues/15 and I saw you provided easy steps to reproduce. I will give it a try and let you know. Since you reproduced the behavior with different Committers, it definitely looks like a bug to me.
I am having trouble reproducing.
Can you tell if the lock is from Java, or maybe an OS file lock (e.g. a not-so-rare Windows issue)? To find out, would it be possible for you to set a breakpoint where it fails, in combination with this "file leak detector" tool: https://file-leak-detector.kohsuke.org/ (or a similar one)?
Hi Pascal,
As mentioned in the linked issue google-cloudsearch/norconex-committer-plugin#15, the "The file is locked:" error happens when another Norconex instance is started while the committer is still processing the files.
The trick is to let the committer process for a long enough time (e.g. 10 seconds or more), so that another Norconex instance can be started. This can be done in either of two ways:
Hope this helps in reproducing the issue in SQL Committer.
Best Regards, Srinivas.
Update: I had a hard time reproducing, but I was finally able to do so with any Committer by starting two instances of the same HTTP Collector at (roughly) the same time. I am investigating further.
Try the latest HTTP Collector snapshot. The fix is in the norconex-jef dependency. I introduced better detection of a running instance, which prevents the type of overlap exception encountered. Please confirm.
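For readers wondering what running-instance detection can look like in general, here is a minimal sketch based on an exclusive file lock. It is not the norconex-jef implementation; the class name InstanceGuard and the idea of a dedicated lock file are assumptions made for illustration only.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical guard: holds an exclusive lock on a lock file while an instance runs.
class InstanceGuard implements AutoCloseable {
    private final FileChannel channel;
    private final FileLock lock;

    private InstanceGuard(FileChannel channel, FileLock lock) {
        this.channel = channel;
        this.lock = lock;
    }

    // Returns a guard if the lock was obtained, or null if another instance holds it.
    static InstanceGuard tryAcquire(Path lockFile) throws IOException {
        FileChannel ch = FileChannel.open(
                lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock lock = null;
        try {
            // Returns null when another process already holds the lock.
            lock = ch.tryLock();
        } catch (OverlappingFileLockException e) {
            // The lock is already held within this same JVM.
        } finally {
            if (lock == null) {
                ch.close();
            }
        }
        return lock == null ? null : new InstanceGuard(ch, lock);
    }

    @Override
    public void close() throws IOException {
        lock.release();
        channel.close();
    }
}
```

A second instance started against the same lock file would get null from tryAcquire and could exit early instead of touching files the first instance is still processing. File lock semantics are platform dependent, so this should be read as an illustration rather than a robust solution.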
Hi Pascal,
we plan to test the new snapshot next week. I will let you know. Thanks a lot.
Hi Pascal,
we have a long crawling process which takes approx. 4 days to complete a single crawl run. Today we got the following error message and the crawler stopped crawling:
Crawler stopped.
Any idea what might cause this error? On what occasions does the store usually get closed?
Error message:
Best regards Sascha