Open dmcassel opened 6 years ago
Possible approach: keep track of queries (configured in add and expand) that contribute to specific properties. Run just those in combination against the set of matched documents and adjust the score accordingly. For instance, given:
<options xmlns="http://marklogic.com/smart-mastering/matcher">
  <property-defs>
    <property namespace="" localname="PersonGivenName" name="first-name"/>
    <property namespace="" localname="PersonSurName" name="last-name"/>
    <property namespace="" localname="AddressPrivateMailboxText" name="addr1"/>
  </property-defs>
  <algorithms>
    <algorithm name="std-reduce" function="standard-reduction"/>
    <algorithm name="dbl-metaphone" function="double-metaphone"/>
    <algorithm name="thesaurus" function="thesaurus"/>
  </algorithms>
  <scoring>
    <add property-name="last-name" weight="8"/>
    <add property-name="first-name" weight="6"/>
    <add property-name="addr1" weight="5"/>
    <expand property-name="first-name" algorithm-ref="thesaurus" weight="6">
      <thesaurus>/mdm/config/thesauri/first-name-synonyms.xml</thesaurus>
      <distance-threshold>50</distance-threshold>
    </expand>
    <expand property-name="last-name" algorithm-ref="dbl-metaphone" weight="8">
      <dictionary>name-dictionary.xml</dictionary>
      <!-- defaults to 100 distance -->
    </expand>
    <reduce algorithm-ref="std-reduce" weight="4">
      <all-match>
        <property>last-name</property>
        <property>addr1</property>
      </all-match>
    </reduce>
  </scoring>
  <thresholds>
    <threshold above="50" label="Likely Match" action="notify"/>
    <threshold above="68" label="Definitive Match" action="merge"/>
  </thresholds>
  <tuning>
    <max-scan>200</max-scan>
  </tuning>
</options>
After running matching, we'll have a list of documents with their current match scores. Run another search:
cts:and-query((
  (: any query related to the last-name property :)
  (: any query related to the addr1 property :)
  cts:document-query( (: sequence of URIs of the matched documents :) )
))
For anything that matches, reduce the score by the reduce weight.
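To make the reduce step concrete, here is a minimal sketch of how it might look in XQuery. It is not the project's implementation: local:apply-reduction, the sample URIs, the sample scores, and the "Smith" / "PO Box 12" values are all hypothetical. The two element-value queries stand in for whatever queries were tracked for the last-name and addr1 properties, and the call to cts:uris assumes the URI lexicon is enabled on the database.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: subtract $reduce-weight from the score of every
   matched document that also satisfies all of the tracked property queries. :)
declare function local:apply-reduction(
  $scores as map:map,               (: URI -> current match score :)
  $property-queries as cts:query*,  (: queries tracked for the reduce properties :)
  $reduce-weight as xs:double
) as map:map
{
  (: Restrict the search to the already-matched documents. :)
  let $reduced-uris :=
    cts:uris((), (), cts:and-query((
      $property-queries,
      cts:document-query(map:keys($scores))
    )))
  let $adjusted := map:map()
  let $_ :=
    for $uri in map:keys($scores)
    let $score := map:get($scores, $uri)
    return
      map:put($adjusted, $uri,
        if ($uri = $reduced-uris)
        then $score - $reduce-weight
        else $score)
  return $adjusted
};

(: Assumed inputs, for illustration only. :)
let $scores := map:new((
  map:entry("/person/1.xml", 72),
  map:entry("/person/2.xml", 55)
))
let $queries := (
  cts:element-value-query(xs:QName("PersonSurName"), "Smith"),
  cts:element-value-query(xs:QName("AddressPrivateMailboxText"), "PO Box 12")
)
return local:apply-reduction($scores, $queries, 4)
```

The sketch returns a new map rather than mutating the input, so the original scores remain available for comparison. In the real matcher the tracked queries would presumably also include any expand queries generated for those properties.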
➤ Kasey Alderete commented:
Include an example.
An alternative: rather than getting the matches and then reducing their scores, just run the query with a negative weight, e.g.
The standard-reduction.xqy module has two functions: standard-reduction and standard-reduction-query. It doesn't look like standard-reduction-query is actually used anywhere, although there are orphaned references to it in several of our example match options. The standard-reduction-query likely can't work, since it relies on combined queries, but a cts:and-query doesn't take a weight.
Actions:
standard-reduction-query
standard-reduction requires matches