Norconex Importer is a Java library and command-line application meant to "parse" and "extract" content out of a file as plain text, whatever its format (HTML, PDF, Word, etc.). In addition, it allows you to perform any manipulation on the extracted text before using it in your own service or application.
Hello Pascal,
I'd like to use several methods (e.g. csv and regex) in the KeepOnlyTagger, but it seems only one fieldMatcher is allowed.
Error:
1 XML configuration errors detected:
[XML] StartCommand: cvc-complex-type.2.4.a: Invalid content was found starting with element 'fieldMatcher'. One of '{restrictTo}' is expected.
How can I do that with 3.x? Thanks!
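For illustration, a configuration along these lines would presumably trigger the error quoted in this post. This is only a sketch: the field names and the regular expression are hypothetical placeholders, not taken from the actual configuration, and the handler is assumed to be declared in the usual 3.x way.

```xml
<!-- Hypothetical sketch: field names and expressions are made up for illustration. -->
<handler class="com.norconex.importer.handler.tagger.impl.KeepOnlyTagger">
  <!-- First matcher: keep a fixed list of fields. -->
  <fieldMatcher method="csv">title, description</fieldMatcher>
  <!-- Second matcher: keep fields matching a pattern.
       Schema validation appears to reject this second element. -->
  <fieldMatcher method="regex">meta\..*</fieldMatcher>
</handler>
```

With only one of the two fieldMatcher elements present, the configuration is expected to validate; the question is how to express both in 3.x.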