liar666 closed 8 years ago
The Filesystem Collector is a different product from the HTTP Collector. Even though they are built on a modular framework and share many of the same libraries, they are meant to be run separately from one another and will continue to be shipped entirely separately for that reason.
About committers, we recently added install scripts so they can more easily be copied into the product of your choice. We hope this will make life easier. We may one day consider making committers auto-downloadable, but that's not currently on the radar.
Keep in mind that if you keep using recent versions of both products, the importer and committer modules should be the same (or nearly so) and thus should support the same configs. So nothing prevents you from sharing importer/committer config fragments between the two, even if they are separate installs. I encourage you to share config snippets so you do not repeat your common configs, if that is a concern. I would not encourage sharing the libs, though, so that the two products can continue to evolve differently as needed (even if, yes, most of the dependencies are the same).
I'm not sure it's worth opening a ticket for this, but I have a remark on the way the various components of the Norconex "solution" are distributed: I recently had to use collector-filesystem rather than my usual collector-http. I found it quite strange that I had to download a new .zip file (full of libs I already have on my filesystem), unzip it in a new dir, reconfigure my crawling environment in this new dir (adding my own classes, libs & configuration files in the corresponding subdirs, etc.), and use a new ./collector-fs.sh launch script, just because I switched from an HTTP source to a file source for my data.
I know I probably could have simply extracted the corresponding norconex-collector-filesystem-xxx.jar, put it in my old 'norconex-collector-http-xxx' dir, and things would probably have worked just as well. But I opted for what seemed the simplest way at the time: basing my work on the way the software is distributed.
Why isn't the Norconex software bundled as a whole, with the jars to use selected according to which components are used in the XML configuration file (this might be related to https://github.com/Norconex/importer/issues/27)? Also, optional elements (e.g. committers) could then be distributed as "plugins" that would be simply/automatically downloaded and added to the ./libs dir on demand.