mitchelljj closed 9 years ago
There are probably a few ways you could do this, but I suggest you look at the ReplaceTagger part of the Importer module. You could use it in a way similar to this:
<tagger class="com.norconex.importer.handler.tagger.impl.ReplaceTagger">
  <!-- Write "true" to the new field when the source value matches the audience. -->
  <replace fromField="YourFieldHavingTheValue" toField="Institutions_of_Higher_Education"
      regex="true">
    <fromValue>.*DC-ED\.audience.*Institutions of Higher Education.*</fromValue>
    <toValue>true</toValue>
  </replace>
  <!-- Blank out any value that is not exactly "true". -->
  <replace fromField="Institutions_of_Higher_Education" regex="true">
    <fromValue>^(?!true$).*$</fromValue>
    <toValue></toValue>
  </replace>
</tagger>
If your DC-ED.audience field is not a metadata field already extracted but is part of the body, you can do something similar with TextPatternTagger.
Thanks for the information! If I leave out the second replace section within ReplaceTagger, which I believe replaces any value other than "true" with no value, what will the "Institutions_of_Higher_Education" field contain? Or will this field not display at all within those records?
I have not tested your particular case, but I believe it will copy the content over as is if it could not perform a replace. So you are correct, the second one is to make sure it is blank when not "true". In fact, it may be best to store "false" instead of blank in your case.
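To make the two-step idea concrete, here is a minimal Java sketch of the behaviour described above. It does not call any Norconex code: `replaceIfMatches` and `flag` are hypothetical helpers that only assume a value is left unchanged when the pattern does not match, which is the untested belief stated above. The negative-lookahead pattern `^(?!true$).*$` is one way to express "anything other than exactly true", and the sketch stores "false" rather than a blank.

```java
import java.util.regex.Pattern;

class ReplaceSketch {

    // Hypothetical helper (not a Norconex API): returns the replacement when
    // the regex matches the whole value, otherwise returns the value as is.
    static String replaceIfMatches(String value, String regex, String replacement) {
        return Pattern.matches(regex, value) ? replacement : value;
    }

    // Step 1: turn a matching audience value into "true".
    // Step 2: turn anything that is not exactly "true" into "false".
    static String flag(String value) {
        String first = replaceIfMatches(
                value, ".*Institutions of Higher Education.*", "true");
        return replaceIfMatches(first, "^(?!true$).*$", "false");
    }

    public static void main(String[] args) {
        System.out.println(flag("Institutions of Higher Education")); // true
        System.out.println(flag("Administrators; Counselors"));       // false
    }
}
```

Without the second step, the non-matching value "Administrators; Counselors" would pass through unchanged in this sketch, which is why the second replace is there.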
I have a field called "DC-ED.audience" that contains multiple strings separated by commas (see the example below): "DC-ED.audience":["Institutions of Higher Education", "Administrators; Counselors"]
If I create new fields within Solr before doing the initial crawl, such as an "Institutions of Higher Education" field, then during the crawl I would like to key on the "DC-ED.audience" field and, when a string like "Institutions of Higher Education" is found, update the new "Institutions of Higher Education" field with the value "TRUE".
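The desired outcome can be sketched in plain Java, independent of the crawler. This is only an illustration of the logic being asked for: the class `AudienceFlag` and the method `flagFor` are hypothetical names, and returning "FALSE" when nothing matches is an assumption, since the request only specifies the "TRUE" case.

```java
import java.util.Arrays;
import java.util.List;

class AudienceFlag {

    // Hypothetical helper: returns "TRUE" when any value of the multi-valued
    // field equals the target audience (ignoring case and surrounding
    // whitespace), otherwise "FALSE" (an assumed default).
    static String flagFor(List<String> audienceValues, String target) {
        for (String value : audienceValues) {
            if (value.trim().equalsIgnoreCase(target)) {
                return "TRUE";
            }
        }
        return "FALSE";
    }

    public static void main(String[] args) {
        // Values taken from the "DC-ED.audience" example above.
        List<String> audience = Arrays.asList(
                "Institutions of Higher Education", "Administrators; Counselors");
        System.out.println(flagFor(audience, "Institutions of Higher Education")); // TRUE
        System.out.println(flagFor(audience, "Parents"));                          // FALSE
    }
}
```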