ArchivesPortalEuropeFoundation / Topic-Detection

Using machine learning approaches for automatic topic detection in a multilingual environment

[Check if this works now!] Tokenize text before searching for entities (issue with Liege: Auflistung von Liegenschaften bzw. anderen Gebäuden) #33

Open fedenanni opened 3 years ago

kerstarno commented 2 years ago

For context:

I searched for "Lüttich" (German) as an entity and asked for 50 results. The place entity is recognised correctly, i.e. "Lüttich" is linked to "Liege". However, I also get results in which "Liege" is merely the beginning of another word (here German ones as well) that has nothing to do with the place "Lüttich", e.g.

fedenanni commented 2 years ago

Should be fixed in 5134a4c - to be tested on the dev interface after it has been updated
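
For readers following along, here is a minimal sketch of the idea in the issue title (tokenize the text, then match entities against whole tokens). It is not the actual change in 5134a4c; the function names `tokenize`, `substring_match` and `token_match` are hypothetical and only illustrate why plain substring search matches "Liegenschaften" while token-based search does not.

```python
import re


def tokenize(text):
    # Lowercase and split into runs of word characters.
    # In Python 3, \w is Unicode-aware for str, so "Gebäuden" stays one token.
    return re.findall(r"\w+", text.lower())


def substring_match(entity, text):
    # Naive approach: also matches when the entity is only part of a longer word.
    return entity.lower() in text.lower()


def token_match(entity, text):
    # Match only when the entity's tokens appear as a run of whole tokens in the text.
    entity_tokens = tokenize(entity)
    tokens = tokenize(text)
    n = len(entity_tokens)
    if n == 0:
        return False
    return any(tokens[i:i + n] == entity_tokens for i in range(len(tokens) - n + 1))


# Record title taken from the issue title.
title = "Auflistung von Liegenschaften bzw. anderen Gebäuden"
print(substring_match("Liege", title))  # "Liege" is a prefix of "Liegenschaften"
print(token_match("Liege", title))      # no whole token equals "liege"
```

With whole-token matching, a record that really mentions the place (for instance a made-up title such as "Urkunden aus Liege, 1750") would still be found, while "Liegenschaften" is skipped.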

fedenanni commented 2 years ago

Confirming that the issue is now solved on the dev branch (and therefore on the dev interface); see the examples below.

[Screenshot 2022-05-02 at 11 06 45]
[Screenshot 2022-05-02 at 11 08 53]