Norconex / committer-sql

Implementation of Norconex Committer for SQL (JDBC) databases.
https://opensource.norconex.com/committers/sql/
Apache License 2.0

SQL commits jumbled when running two crawlers at the same time #9

Closed: hardreddata closed this issue 4 years ago

hardreddata commented 4 years ago

Hi,

I am using the filesystem crawler with SQL committer. It works great!

Today I tried running two crawlers at the same time.

Consider crawl-one.variables, which has an associated crawl-one.xml:

path = \\storage\path_one 
workdir = ./crawl-one

And similarly for crawl-two:

path = \\storage\path_two
workdir = ./crawl-two

Each crawler is configured to commit to its own table, both in the same database.

Oddly, the database table for crawl-one contains results from crawl-two, and vice versa.

The 32_Crawler.log within each working directory is pure (not mixed with entries from the other crawl).

I tried giving the fscollector id and crawler id unique values in each configuration file, but the problem persists.

I am running the 2.9.1 snapshot of the Filesystem Collector.

Thinking aloud, is the committer-queue folder perhaps common to both of my processes? I will try running from two completely separate 2.9.1 snapshot folders. Regardless, I thought this was worth mentioning.

https://github.com/Norconex/committer-core/issues/9 looks related.

Any advice is invited.

essiembre commented 4 years ago

This issue happens when two committers point to the same directory for queueing documents: each committer then picks up and commits whatever is in that shared queue, including documents queued by the other crawler. You can avoid this by adding a distinct <queueDir>(optional path where to queue files)</queueDir> to each of your committers.
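
As a rough sketch of that advice, the two committer sections might look like the following. The table names and the `committer-queue` subfolder are made up for illustration, and `${workdir}` assumes the `workdir` variable from each crawler's `.variables` file is available for substitution in the XML; connection settings are omitted and would stay as they already are in your configuration.

```xml
<!-- In crawl-one.xml (table name and queue path are hypothetical) -->
<committer class="com.norconex.committer.sql.SQLCommitter">
  <!-- existing driver/connection settings go here, unchanged -->
  <tableName>crawl_one</tableName>
  <!-- Queue directory used only by this crawler -->
  <queueDir>${workdir}/committer-queue</queueDir>
</committer>

<!-- In crawl-two.xml (table name and queue path are hypothetical) -->
<committer class="com.norconex.committer.sql.SQLCommitter">
  <!-- existing driver/connection settings go here, unchanged -->
  <tableName>crawl_two</tableName>
  <!-- With workdir = ./crawl-two this resolves to a different folder -->
  <queueDir>${workdir}/committer-queue</queueDir>
</committer>
```

Because each `workdir` is already distinct (./crawl-one and ./crawl-two), deriving the queue path from it should keep the two queues apart without hard-coding separate paths.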

hardreddata commented 4 years ago

That did it. Thanks.