kermitt2 / biblio-glutton

A high performance bibliographic information service: https://biblio-glutton.readthedocs.io

Add proper blocking/pairwise matching double step #62

Closed kermitt2 closed 2 years ago

kermitt2 commented 2 years ago

This is a follow-up to #61.

We probably want to remove the post-validation step and its parameters, and manage validation solely via the pairwise ranking. If the matching score from the pairwise ranking is below a given threshold, the candidate is not valid anyway and we don't return it. Post-validation and pairwise ranking are redundant, which creates confusion (post-validation is the poor man's pairwise ranking!).
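
To make the proposal concrete, here is a rough sketch of the two-step flow with validation folded into the pairwise score. It is not the biblio-glutton implementation: the function names, the token-overlap blocking, the Jaccard score, and the `MIN_MATCH_SCORE` value are all assumptions chosen for illustration.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical threshold; the issue does not specify a value.
MIN_MATCH_SCORE = 0.7


def tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens; a stand-in for real normalization."""
    return set(text.lower().split())


def block(query: str, records: list[str]) -> list[str]:
    """Blocking step: cheaply keep records sharing at least one token with the query."""
    q = tokens(query)
    return [r for r in records if q & tokens(r)]


def pairwise_score(query: str, candidate: str) -> float:
    """Pairwise step: Jaccard similarity as a placeholder for the real matching score."""
    q, c = tokens(query), tokens(candidate)
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)


def match(query: str, records: list[str],
          threshold: float = MIN_MATCH_SCORE) -> Optional[str]:
    """Return the best-ranked candidate, or None if its score is below the threshold.

    There is no separate post-validation: the threshold on the pairwise
    score is the only validity check.
    """
    candidates = block(query, records)
    if not candidates:
        return None
    best = max(candidates, key=lambda r: pairwise_score(query, r))
    return best if pairwise_score(query, best) >= threshold else None
```

With this shape, a query that only weakly overlaps the blocked candidates returns nothing, without any extra validation parameters to tune.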

kermitt2 commented 2 years ago

follow up in #66