Closed danizen closed 7 years ago
No, the ReduceConsecutivesTransformer will keep one instance of what you specify.
If you want more control, use the ReplaceTransformer. For instance, this should do what you want (put everything on one line):
<transformer class="com.norconex.importer.handler.transformer.impl.ReplaceTransformer">
  <replace>
    <fromValue>\s+</fromValue>
    <toValue xml:space="preserve"> </toValue>
  </replace>
</transformer>
Tags containing only white space are stripped by default. To preserve them, you need to add xml:space="preserve", as in the toValue tag above.
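To illustrate what that replacement does to the content, here is a minimal sketch in plain Java. It uses the standard String.replaceAll with the same \s+ pattern and a single-space replacement. It is not the ReplaceTransformer implementation, and the class and method names are made up for the example.

```java
public class WhitespaceCollapse {

    // Replace every run of whitespace (spaces, tabs, line breaks)
    // with a single space, which puts the whole text on one line.
    // Hypothetical helper, shown only to illustrate the regex.
    static String collapse(String content) {
        return content.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String content = "first line\n\n  second\tline\r\nthird line";
        // prints "first line second line third line"
        System.out.println(collapse(content));
    }
}
```

Note that leading or trailing whitespace also becomes a single space rather than being removed, so trim the result if that matters.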
OK. Am I right in understanding that transformers are run on the content and taggers are run on the metadata, or have I misunderstood?
That is exactly the idea, yes. Taggers can also read the content, but only transformers can modify it. Technically, transformers can be implemented to deal with both if you need to, but the ones available focus on content only.
Can I get this to apply to the content, and smush it all into a single line?
Thanks