I've started looking into this. The most promising option seems to be Flair, which supports named entity recognition in five languages (EN, DE, FR, ES, NL). We could then pass its output to Wikidata to find the best match for each entity. The current blocker is that Flair is incompatible with Torch 1.8, which our pipeline requires; Flair needs Torch 1.7 (see https://github.com/flairNLP/flair/issues/2137).
We have a first version in 665e997. It takes a string and a language (currently supported: EN, DE, FR, NL) and returns the URLs of the identified entities.
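For anyone following along, here is a rough sketch of what the Wikidata lookup step could look like. It is not the code in 665e997. It assumes the entity mentions have already been extracted by the NER step, and it uses Wikidata's public `wbsearchentities` API through the standard library only. The function names, the User-Agent string, and taking the top search hit as the "best match" are all assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
# Languages listed as supported in the first version (assumed lowercase codes).
SUPPORTED_LANGUAGES = {"en", "de", "fr", "nl"}


def build_search_url(mention, language):
    """Build a wbsearchentities request URL for one entity mention."""
    lang = language.lower()
    if lang not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    params = {
        "action": "wbsearchentities",
        "search": mention,
        "language": lang,
        "format": "json",
        "limit": 1,  # assumption: the top hit is treated as the best match
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)


def best_match_url(response):
    """Return the Wikidata page URL of the top search hit, or None."""
    hits = response.get("search", [])
    if not hits:
        return None
    return "https://www.wikidata.org/wiki/" + hits[0]["id"]


def link_mentions(mentions, language):
    """Look up each mention on Wikidata (requires network access)."""
    urls = []
    for mention in mentions:
        request = urllib.request.Request(
            build_search_url(mention, language),
            # Placeholder User-Agent; replace with something identifying us.
            headers={"User-Agent": "entity-linking-sketch/0.1"},
        )
        with urllib.request.urlopen(request) as resp:
            url = best_match_url(json.load(resp))
        if url is not None:
            urls.append(url)
    return urls
```

Calling `link_mentions(["Berlin", "Angela Merkel"], "de")` would return one URL per mention that Wikidata can resolve. Taking only the first hit is the simplest possible ranking; disambiguating with the surrounding context would be a natural next step.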