esmero / webform_strawberryfield

Provides Webform integrations to feed a field of Strawberries. Mr. Wizard of WebOz
GNU Lesser General Public License v3.0

Use Levenshtein distance to sort LoD Controller results. Much more accurate, less fuzz #171

Open DiegoPino opened 8 months ago

DiegoPino commented 8 months ago

What?

We return results from remote LoD APIs in whatever order the API provides. But really, from the user/metadata cataloger perspective, we want to return things ordered by "how close" they are to the original input/search string.

E.g., the first term returned for the input "Music" in LoC Subjects is "MUSIC (Computers)", but the user searched for Music. That is where my little, little knowledge of data science comes into play: Levenshtein.

A simple sort-by fixes the issue
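
For illustration only, here is a rough Python sketch of that sort-by idea (the module itself is PHP, so this is not the actual change). Everything in it is an assumption made for the example: the `levenshtein` and `sort_by_closeness` helpers, the `label` key, the sample labels, and the choice to compare case-insensitively so that "MUSIC" and "Music" count as the same word.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def sort_by_closeness(query: str, results: list[dict]) -> list[dict]:
    """Order results by edit distance between the query and each label.

    Assumption: comparison is case-insensitive. sorted() is stable, so
    results with equal distance keep the order the API returned them in.
    """
    q = query.casefold()
    return sorted(results, key=lambda r: levenshtein(q, r["label"].casefold()))


# Hypothetical API response, in the order the remote API returned it.
api_results = [
    {"label": "MUSIC (Computers)"},
    {"label": "Music trade"},
    {"label": "Music"},
]

for item in sort_by_closeness("Music", api_results):
    print(item["label"])
# Music
# Music trade
# MUSIC (Computers)
```

Because the sort is stable, terms at the same distance still come back in the API's own relevance order, so the remote ranking only loses out when one label is genuinely closer to what was typed.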

See how much better this is when I use my brainz to add a line of code.

image

This also helps a lot with unsupervised LoD reconciliation at the AMI level, or when using ami_lod_reconcile(subject|lower|capitalize,'loc;subjects;thing','en',1) in a Twig template.

@alliomeria thoughts?