Hi,
I am using the Norconex Filesystem Collector to crawl files from a shared path. I am trying to commit the processed items to both Elasticsearch and a file committer. Nothing gets committed to Elasticsearch/Solr, but the documents are saved to the file system.
Please find the config file below. Please help me resolve the issue.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xml>
<!--
Copyright 2010-2017 Norconex Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<fscollector id="Text Files">
## Either uncomment or set the following variables or create yourself a
## sample-config.variables (or properties) with the same variables set.
#set($path = "valid path")
#set($workdir = "E:\filesystem\norconex-collector-filesystem-2.8.0\norconex-collector-filesystem-2.8.0\examples")
#set($tagger = "com.norconex.importer.handler.tagger.impl")
#set($transformer = "com.norconex.importer.handler.transformer.impl")
<logsDir>${workdir}/logs</logsDir>
<progressDir>${workdir}/progress</progressDir>
<crawlers>
<crawler id="Sample Crawler">
<workDir>${workdir}</workDir>
<startPaths>
<path>${path}</path>
</startPaths>
<numThreads>2</numThreads>
<keepDownloads>false</keepDownloads>
<importer>
<postParseHandlers>
<tagger class="${tagger}.ReplaceTagger">
<replace fromField="samplefield" regex="true">
<fromValue>ping</fromValue><toValue>pong</toValue>
</replace>
<replace fromField="Subject" regex="true">
<fromValue>Sample to crawl</fromValue><toValue>Sample crawled</toValue>
</replace>
</tagger>
</postParseHandlers>
</importer>
<committer class="com.norconex.committer.elasticsearch.ElasticsearchCommitter">
<nodes>http://localhost:9200</nodes>
<indexName>filetest</indexName>
<typeName>filetest1</typeName>
</committer>
<committer class="com.norconex.committer.core.impl.JSONFileCommitter">
<directory>${workdir}/jsoncrawledFiles</directory>
<pretty>true</pretty>
<!-- <docsPerFile>(max number of docs per JSON file)</docsPerFile> -->
<!-- <compress>[false|true]</compress> -->
<splitAddDelete>true</splitAddDelete>
<fileNamePrefix>test</fileNamePrefix>
<fileNameSuffix>json</fileNameSuffix>
</committer>
<committer class="com.norconex.committer.core.impl.FileSystemCommitter">
<directory>${workdir}/crawledFiles</directory>
</committer>
</crawler>
</crawlers>
</fscollector>
You cannot define multiple committers side by side like you are doing: the crawler uses only one, and the others are simply ignored. Either keep just one, or, if you need several, wrap them all in a MultiCommitter.
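As a rough sketch of the MultiCommitter approach, the committers from the config above could be nested inside a single outer `<committer>` element. This assumes the `com.norconex.committer.core.impl.MultiCommitter` class from Committer Core 2.x, which ships with collector 2.8.0; the inner settings are copied unchanged from the question and have not been tested against a running Elasticsearch instance.

```xml
<!-- Single top-level committer that forwards every document to each nested committer. -->
<committer class="com.norconex.committer.core.impl.MultiCommitter">
  <!-- Settings below are taken as-is from the original config. -->
  <committer class="com.norconex.committer.elasticsearch.ElasticsearchCommitter">
    <nodes>http://localhost:9200</nodes>
    <indexName>filetest</indexName>
    <typeName>filetest1</typeName>
  </committer>
  <committer class="com.norconex.committer.core.impl.JSONFileCommitter">
    <directory>${workdir}/jsoncrawledFiles</directory>
    <pretty>true</pretty>
    <splitAddDelete>true</splitAddDelete>
    <fileNamePrefix>test</fileNamePrefix>
    <fileNameSuffix>json</fileNameSuffix>
  </committer>
  <committer class="com.norconex.committer.core.impl.FileSystemCommitter">
    <directory>${workdir}/crawledFiles</directory>
  </committer>
</committer>
```

This block would replace the three separate `<committer>` elements inside `<crawler>`, so the crawler still sees exactly one committer.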