Norconex / crawlers

Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem to various data repositories such as search engines.
https://opensource.norconex.com/crawlers
Apache License 2.0

Append or suffix string to value of metadata #537

Closed kristiWabion closed 5 years ago

kristiWabion commented 5 years ago

Hi, I want to append a suffix to the value of one metadata field. I tried using com.norconex.importer.handler.transformer.impl.ReplaceTransformer in the preParseHandlers.

For example, title="test" should be transformed to "test1234".

<transformer class="com.norconex.importer.handler.transformer.impl.ReplaceTransformer" caseSensitive="false">
    <restrictTo caseSensitive="false" field="title">
    </restrictTo>
    <replace>
        <fromValue>$</fromValue>
        <toValue>1234</toValue>
    </replace>
</transformer>

But it didn't work. Can this be accomplished? Thank you.

kristiWabion commented 5 years ago

Hello, I figured out how to solve this case.

<tagger class="com.norconex.importer.handler.tagger.impl.ReplaceTagger">
    <replace fromField="Server" regex="true">
        <fromValue>(.+)</fromValue>
        <toValue>$0?bhjhjb=</toValue>
    </replace>
</tagger>
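
The replacement above can be tried out in plain Java. This is only a sketch, and it assumes that ReplaceTagger passes fromValue and toValue to Java's java.util.regex engine when regex="true", which this thread does not confirm. The class and method names (SuffixDemo, naive, appendSuffix) are made up for illustration and are not part of Norconex. It shows that "$0" followed by a non-digit suffix behaves as expected, while a suffix that starts with digits (such as the "1234" from the original question) can be misread as part of the group number.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: plain java.util.regex, no Norconex classes involved.
final class SuffixDemo {

    // Same shape as the config above: regex "(.+)" and a replacement that
    // starts with "$0" (the whole match) followed directly by the suffix.
    static String naive(String value, String suffix) {
        return value.replaceAll("(.+)", "$0" + suffix);
    }

    // Variant using a named group (hypothetical name "v"). The reference
    // "${v}" ends at the closing brace, so digits in the suffix cannot be
    // absorbed into a group number. quoteReplacement keeps '$' and '\'
    // in the suffix literal.
    static String appendSuffix(String value, String suffix) {
        return Pattern.compile("(?<v>.+)").matcher(value)
                .replaceAll("${v}" + Matcher.quoteReplacement(suffix));
    }

    public static void main(String[] args) {
        System.out.println(naive("nginx", "?bhjhjb="));   // nginx?bhjhjb=
        System.out.println(naive("test", "1234"));        // test234 ("$01" is read as group 1)
        System.out.println(appendSuffix("test", "1234")); // test1234
    }
}
```

Under the same assumption, the title example from the question would use fromField="title" with a fromValue of (?<v>.+) and a toValue of ${v}1234, with the "<" escaped as "&lt;" inside the XML. This has not been tested against Norconex itself. An empty value is left unchanged in both variants, because ".+" needs at least one character to match.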