We will show how to create DeezyMatch models trained on Arabic and Latin name variations of places in modern-day Libya, which will enable us to find the best-matching entry in a gazetteer for a given query. Such models could be used to consolidate data about the names of heritage locations in Arabic-speaking countries, as in the Heritage Gazetteer of Libya. Currently, the high level of spelling variation in Arabic place names (across time and transcriptions) makes it difficult to consolidate data held in different archives and collections, which at present rely on exact string matching to find connections. We will show how DeezyMatch can be used to associate a heritage location with its variant names more easily, thus improving the accuracy of data and metadata and facilitating alignment with other knowledge bases such as Wikidata or GeoNames.
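To make the limitation of exact string matching concrete, here is a minimal sketch that contrasts it with ranked fuzzy lookup. It does not use DeezyMatch: it substitutes the character-overlap ratio from Python's standard `difflib` module as a simple stand-in for a learned similarity model. The toy gazetteer, the query spelling, and the helper names `exact_match` and `rank_candidates` are assumptions made for illustration only.

```python
from difflib import SequenceMatcher

# Hypothetical toy gazetteer for illustration; not data from the
# Heritage Gazetteer of Libya.
GAZETTEER = ["Tripoli", "Benghazi", "Misrata", "Ghadames", "Sabratha"]


def exact_match(query, entries):
    """Return only the entries that are identical to the query string."""
    return [entry for entry in entries if entry == query]


def rank_candidates(query, entries, top_n=3):
    """Rank entries by character-level similarity to the query.

    SequenceMatcher is a stand-in here; a trained DeezyMatch model
    would replace this scoring step with learned similarities.
    """
    scored = [
        (SequenceMatcher(None, query.lower(), entry.lower()).ratio(), entry)
        for entry in entries
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(entry, round(score, 2)) for score, entry in scored[:top_n]]


if __name__ == "__main__":
    query = "Misurata"  # a variant transliteration of "Misrata"
    print("Exact matches:", exact_match(query, GAZETTEER))
    print("Ranked candidates:", rank_candidates(query, GAZETTEER))
```

With this toy data, the exact lookup finds nothing for the variant spelling, while the ranked lookup places the intended entry first. A character-overlap ratio only captures surface similarity within one script, which is part of the motivation for training a model on real Arabic and Latin name variations instead.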
Prepare tutorial on using DeezyMatch with the Heritage Gazetteer of Libya: https://dh2022.adho.org/workshops-and-tutorials/wt-13