Closed AntonioAmore closed 10 years ago
Have you configured beforehand which metadata field holds your content? If you haven't, the document content stream is taken as the source content field. In any case, the mapping is done for you and you have to rely on target fields, not source fields. The source fields are deleted after the mapping is performed (unless you set the flag to preserve the source).
I recommend you read the javadoc for the AbstractMappedCommitter for more details: http://www.norconex.com/product/committer/apidocs/com/norconex/committer/AbstractMappedCommitter.html
Ignore the references to IDOL. Those have to be corrected.
I haven't configured any field mappings and am keeping the defaults. I did read the doc before asking the question. Unfortunately, I still can't figure out how to use it.
I tried
String content = metadata.getString(this.getContentTargetField());
to get the page's content, but received the same error.
I just want to get the crawled page content into a variable, whatever the configuration/mapping is (or in most cases). The idea was to write the committer as generically as possible and contribute it to the community. It seems the task is too difficult for me right now. Could you help me with this line of code?
If you have not defined a target field for your content, the content won't be mapped to a metadata field, which explains why you do not get any content back with your line of code.
In that case you can obtain the content this way:
IAddOperation operation = // your operation
InputStream is = operation.getContentStream();
// read the input stream
The reason you do not see code like this in existing committer implementations, such as Solr, is that they provide default target fields, so a target field is always specified and the content is always mapped to a field automatically. In your case, you can also enforce a default, or check in your code whether the target field has been specified to decide whether to read the content from the stream or from the metadata field.
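The second option (checking whether a target field is set, then falling back to the stream) could look roughly like the sketch below. It is self-contained so it can run on its own: `AddOperation`, `operation(...)` and `resolveContent(...)` are hypothetical stand-ins, not the actual Norconex types, and the metadata is modeled as a plain `Map<String, String>` rather than the library's own properties class. It also assumes the content is UTF-8 text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ContentFallbackSketch {

    /** Hypothetical stand-in for the add operation (not the real Norconex type). */
    public interface AddOperation {
        InputStream getContentStream();
        Map<String, String> getMetadata();
    }

    /** Builds an in-memory operation, for demonstration only. */
    public static AddOperation operation(
            final String body, final Map<String, String> metadata) {
        return new AddOperation() {
            @Override
            public InputStream getContentStream() {
                return new ByteArrayInputStream(
                        body.getBytes(StandardCharsets.UTF_8));
            }
            @Override
            public Map<String, String> getMetadata() {
                return metadata;
            }
        };
    }

    /**
     * Returns the mapped field value when a target field is configured and
     * present in the metadata; otherwise reads the raw content stream.
     */
    public static String resolveContent(
            AddOperation op, String contentTargetField) {
        if (contentTargetField != null && !contentTargetField.isEmpty()) {
            String mapped = op.getMetadata().get(contentTargetField);
            if (mapped != null) {
                return mapped;
            }
        }
        // No usable target field: fall back to the content stream.
        try (InputStream is = op.getContentStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = is.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            // Assumption: the content is UTF-8 encoded text.
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("content", "text mapped to a field");

        // Target field configured: the value comes from the metadata.
        System.out.println(resolveContent(
                operation("raw stream text", metadata), "content"));
        // No target field: the value comes from the stream.
        System.out.println(resolveContent(
                operation("raw stream text", metadata), null));
    }
}
```

In a real committer the same check would sit inside the batch loop, with the stand-in interface replaced by the library's own operation type and metadata accessors.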
Setting default field names, as done in SolrCommitter, works for me. Thanks a lot!
Glad to know!
In the committer's commitBatch() method, I try to get the page's content in order to write it to a database.
I got a NullPointerException at the last line of the listing. Have I misunderstood how the method is meant to be used?