As a blocking step, we retrieve the n best records from Elasticsearch (a list of MatchingDocument objects; n is currently 5 and is defined in the configuration file).
As a pairwise matching step, we score and rank these n records according to a pairwise distance between the expected and candidate fields.
The best-ranked candidate is returned if it passes the post validation.
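The pairwise step above can be sketched roughly as follows. This is only an illustration in Python, not the actual implementation: the field names, the use of `difflib.SequenceMatcher` as the distance, the averaging of field scores, and the `post_validate` callback are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def field_similarity(expected: str, candidate: str) -> float:
    """Similarity in [0, 1] between two field values, ignoring case."""
    return SequenceMatcher(None, expected.lower(), candidate.lower()).ratio()


def pairwise_score(expected: dict, candidate: dict) -> float:
    """Average similarity over the non-empty fields of the query (assumed scoring)."""
    fields = [f for f in expected if expected[f]]
    if not fields:
        return 0.0
    total = sum(
        field_similarity(expected[f], candidate.get(f) or "") for f in fields
    )
    return total / len(fields)


def rank_candidates(expected: dict, candidates: list) -> list:
    """Return (score, candidate) pairs sorted from best to worst."""
    scored = [(pairwise_score(expected, c), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def best_match(expected: dict, candidates: list, post_validate):
    """Return the top-ranked candidate if the (hypothetical) post validation accepts it."""
    ranked = rank_candidates(expected, candidates)
    if not ranked:
        return None
    _, top = ranked[0]
    return top if post_validate(expected, top) else None
```

Here `candidates` stands in for the n records returned by the blocking step, represented as plain dictionaries rather than MatchingDocument objects.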
refactor/simplify search cases
make all components (lookup, indexer and pubmed-glutton) use a single YAML config file
update pubmed-glutton (ES7, gradle 7)
We probably want to remove the post validation step and its parameters, and manage validation via the pairwise ranking alone. If the matching score from the pairwise ranking is below a given threshold, the candidate is not valid anyway and we don't return it. Post validation and pairwise ranking are redundant, which creates confusion (post validation is a poor man's pairwise ranking!).
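A minimal sketch of what threshold-only validation could look like, assuming the pairwise step yields (score, candidate) pairs sorted best first. The name `select_match` and the value 0.7 are hypothetical; the real threshold would need tuning and would presumably live in the configuration file.

```python
from typing import Optional

# Hypothetical default; not a value taken from the project.
MIN_MATCHING_SCORE = 0.7


def select_match(ranked: list, threshold: float = MIN_MATCHING_SCORE) -> Optional[dict]:
    """Return the best candidate only if its pairwise score reaches the threshold.

    `ranked` is a list of (score, candidate) pairs sorted from best to worst.
    No separate post validation is applied: the score is the only gate.
    """
    if not ranked:
        return None
    score, candidate = ranked[0]
    return candidate if score >= threshold else None
```

With this shape, the post validation parameters disappear and a single threshold decides whether a result is returned.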
This is a follow-up of #61