Closed kalhomoud closed 11 years ago
This works as expected. For the crawler to know about a document's metadata, it has to parse the document first. If you simply change "preParseHandlers" to "postParseHandlers", it will work. The reason you have "some" metadata available in pre-parse handlers is that whatever the crawler could find in the HTTP headers or while extracting URLs is added as extra metadata. To make sure that extra metadata is not mixed up with the actual document metadata once the document is parsed, those fields are prefixed with "collector.http.".
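As a rough illustration of that change, the fragment below moves a RenameTagger from the pre-parse to the post-parse stage. It is only a sketch, not the reporter's actual configuration: the fully qualified class name is an assumption that depends on the Importer version in use, and the target field name "title" is hypothetical.

```xml
<!-- Sketch only. Assumptions: the class package varies by Importer version,
     and the toField value "title" is a made-up example. -->
<importer>
  <!-- Handlers listed here run after the document has been parsed,
       so fields extracted from the content (e.g. dc:title) exist. -->
  <postParseHandlers>
    <tagger class="com.norconex.importer.handler.tagger.impl.RenameTagger">
      <rename fromField="dc:title" toField="title" overwrite="true" />
    </tagger>
  </postParseHandlers>
</importer>
```

In a pre-parse handler, only the "collector.http."-prefixed fields would be present, which matches the behaviour described in the question.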
Hello,
For some reason, RenameTagger only works when the field name starts with collector.http.*, such as collector.http.MIMETYPE. It didn't work for me when the field name was "support_url" or "dc:title".
Here is how I have it set up: . . . .
. . . .
Please let me know if you need my config to reproduce the issue.
Thanks, Khalid