reconciliation-api / specs

Specifications of the reconciliation API
https://reconciliation-api.github.io/specs/draft/

full-text matching #57

Open VladimirAlexiev opened 3 years ago

VladimirAlexiev commented 3 years ago

I think that in many cases it would be useful to match rows by their textual content, using that text as general context for the candidate entities in the KB.

NLP does this all the time, e.g. reading "bank" as a financial institution in an article about finance vs. "bank" as a river feature in an article about nature or geography. Approaches include TF-IDF, word embeddings, etc.
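To make the "bank" example concrete, here is a minimal sketch of TF-IDF-based disambiguation using only the Python standard library. The candidate IDs, their descriptions, and the helper names (`tokenize`, `tfidf_vectors`, `rank_candidates`) are all made up for illustration; a real service would presumably use its own index and tokenizer.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; keeps purely alphabetic tokens only.
    return [t for t in text.lower().split() if t.isalpha()]


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF so terms present in every document keep a non-zero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_candidates(row_text, candidates):
    # candidates: hypothetical mapping of entity ID -> textual description.
    ids = list(candidates)
    vecs = tfidf_vectors([row_text] + [candidates[i] for i in ids])
    scored = [(i, cosine(vecs[0], v)) for i, v in zip(ids, vecs[1:])]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical KB entries for the two senses of "bank".
candidates = {
    "bank_finance": "bank financial institution that accepts deposits and makes loans",
    "bank_river": "bank land alongside a river or lake",
}
row_text = "the bank raised interest rates on loans and deposits"
ranking = rank_candidates(row_text, candidates)
```

With this toy data the financial sense ranks first for the finance-flavoured row, because it shares more weighted terms with the row text than the river sense does.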

Use cases:

Implementation:

wetneb commented 3 years ago

On the surface, this seems like something that can already be done with the current API, no? If not, what would we need to change in the specs?
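For illustration, one way this might already be expressed is to send the row's text as a property value alongside the query string. The sketch below only builds the query batch; the property ID `full_text`, the row layout, and the helper `build_queries` are assumptions, and a service would have to choose to interpret that property as free-text context.

```python
import json


def build_queries(rows, text_pid="full_text", limit=5):
    # text_pid is a hypothetical property ID, not something defined by the specs.
    queries = {}
    for i, row in enumerate(rows):
        queries[f"q{i}"] = {
            "query": row["name"],
            "limit": limit,
            # The row's surrounding text travels as an ordinary property value.
            "properties": [{"pid": text_pid, "v": row["text"]}],
        }
    return queries


rows = [{"name": "bank", "text": "The bank raised interest rates on loans."}]
# The batch is serialized to JSON and sent as the `queries` form parameter.
payload = {"queries": json.dumps(build_queries(rows))}
```

If that is sufficient, the open question is less about the query format and more about whether services need a way to advertise that they accept such a property.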