Closed Nycander closed 10 years ago
As a workaround, I'm using the Derby link database implementation. It seems to be working fine.
Thanks for reporting this. I am glad the Derby implementation works fine for you. The downside is performance on large sites: Derby does not scale as well as MapDB. But if your site is small enough, it may not make a big difference.
It turns out the exception is caused by a bug in the MapDB library used by the HTTP Collector (described here: https://github.com/jankotek/MapDB/issues/274).
My recommendation is to replace the existing lib/mapdb-0.9.8.jar with the latest version here: http://search.maven.org/remotecontent?filepath=org/mapdb/mapdb/0.9.9/mapdb-0.9.9.jar
Let me know if the latest version of that library fixes the problem.
Another suggestion is to try increasing your delay. From that same MapDB ticket above:
windows might file locks a few miliseconds after file was closed,
so we need an loop which would retry to open file for lets say 500ms.
Since file locking has been worked on in MapDB 0.9.9, I hope it will be sufficient for you to simply upgrade that lib.
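For illustration only, the retry loop described in the quoted ticket text could look roughly like the sketch below. This is not MapDB's actual code: the class `RetryOpen`, the method `openWithRetry`, and the 25 ms pause between attempts are hypothetical names and values chosen for the example. Only the 500 ms budget comes from the quote.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class RetryOpen {

    /**
     * Hypothetical helper (not part of MapDB): keeps trying to open the file
     * until it succeeds or the time budget runs out, then rethrows the last error.
     */
    static RandomAccessFile openWithRetry(Path file, long timeoutMillis, long pauseMillis)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            try {
                // Fails with FileNotFoundException (an IOException) if the OS
                // still holds a lock on the file.
                return new RandomAccessFile(file.toFile(), "rw");
            } catch (IOException e) {
                if (System.nanoTime() >= deadline) {
                    throw e; // budget exhausted: give up with the last error
                }
                Thread.sleep(pauseMillis); // assumed pause before the next attempt
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("retry-open", ".db");
        // 500 ms budget as suggested in the ticket; 25 ms pause is an assumption.
        try (RandomAccessFile raf = openWithRetry(tmp, 500, 25)) {
            raf.writeInt(42);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

A bounded loop like this tolerates a lock that lingers for a few milliseconds after a close, while still failing with the original error if the file stays locked.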
Just to let you know, MapDB 0.9.10 was released. Some users reported unreleased file locks on Windows. That should be solved now.
Awesome! Thanks for the update. Our next release will include 0.9.10. Nycander, can you give it a try and report if it solves your issue?
Sorry for the delay, but I had to prioritise other things in the project I'm working on.
But now I've had some time to test this out. I've dropped in MapDB 0.9.10 and it seems to be working :)
My hope is that MapDB will have better disk performance than Derby.
Great, thanks for the feedback. Derby's speed does not compare with MapDB's: the more documents you attempt to crawl, the more of a difference you should see. I'll close this ticket when the next release of Norconex HTTP Collector is out (should be this week).
I only crawl about 1000 documents, but the hardware is a really bad SAN.
Initial findings seem to indicate that disk performance is much better using MapDB :+1:
Hi,
there will be a new MapDB release, 0.9.11, which changes file handling a lot.
Thanks for the info @jankotek
Norconex HTTP Collector 1.3 is now out with MapDB 0.9.10.
@Nycander I am closing this ticket, but if this issue arises again please open a new one and we shall release a patched version that includes MapDB 0.9.11 (or newer).
I'm running on
This exception (see below) occurs quite frequently in the logs. It seems to happen fairly randomly and I'm not sure how to reproduce it.
Here's my main configuration: