opendatakerala / map.opendatakerala.org

LSG Level Portal Based On OSM Data for Map Kerala Campaign

ZWJ missing from name ? #42

Open subins2000 opened 2 years ago

subins2000 commented 2 years ago

Awesome work! I just checked it out and found that the place names on the website are missing the ZWJ, which makes them display incorrectly. I checked the OSM relation; the chil used there is ല + ് + ZWJ: https://www.openstreetmap.org/relation/11312298

Maybe all chil in OSM names should be migrated to atomic chil? Doesn't the current use of ZWJ in OSM names affect searching? Or should we just fix the script that generated this webpage?
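To make the searching concern concrete, here is a minimal sketch that prints the code points of the three forms involved: the atomic chil, the ZWJ sequence found in OSM, and what is left when a script strips the ZWJ. The function names are illustrative only and are not taken from the portal's code. As far as I know, standard Unicode normalization (NFC) does not unify these forms, so a plain string comparison treats them as different names.

```python
import unicodedata

LA, VIRAMA, ZWJ = "\u0d32", "\u0d4d", "\u200d"


def chillu_l_forms():
    """Return the three encodings of chillu-L discussed in this issue."""
    return {
        "atomic": "\u0d7d",               # MALAYALAM LETTER CHILLU L
        "legacy_zwj": LA + VIRAMA + ZWJ,  # the sequence used in the OSM name
        "zwj_stripped": LA + VIRAMA,      # what remains if the ZWJ is dropped
    }


def code_points(text):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


for name, form in chillu_l_forms().items():
    # NFC leaves each form unchanged, so they never compare equal.
    nfc = unicodedata.normalize("NFC", form)
    print(name, code_points(form), "| NFC:", code_points(nfc))
```

Any search that compares raw strings would therefore need its own chil normalization step on both the query and the stored name.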

[Screenshots: Screenshot_20211031_214936, Screenshot_20211031_215001, Screenshot_20211031_215437]

manojkmohan commented 2 years ago

Yes, this is a known issue with Wikidata. We need to check all attributes by running a bot. It is a data entry issue. It's better in OSM, I think. @naveenpf @jinoytommanjaly

manojkmohan commented 2 years ago

Oh, OSM also has the same issue! Most of these names came via Wikidata. Wikidata has no normalization script running like the one on ml.wikipedia.
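For illustration, a normalization pass of this kind could look roughly like the sketch below. This is not the script used by ml.wikipedia; it is an assumption-laden example that maps each legacy sequence (consonant + ് + ZWJ) to the atomic chil code points U+0D7A to U+0D7F. The names `ATOMIC_CHILLU` and `to_atomic_chillu` are hypothetical.

```python
import re

VIRAMA = "\u0d4d"
ZWJ = "\u200d"

# Base consonant of the legacy sequence -> atomic chillu code point.
ATOMIC_CHILLU = {
    "\u0d23": "\u0d7a",  # NNA -> CHILLU NN
    "\u0d28": "\u0d7b",  # NA  -> CHILLU N
    "\u0d30": "\u0d7c",  # RA  -> CHILLU RR
    "\u0d32": "\u0d7d",  # LA  -> CHILLU L
    "\u0d33": "\u0d7e",  # LLA -> CHILLU LL
    "\u0d15": "\u0d7f",  # KA  -> CHILLU K
}

# Matches one of the base consonants followed by virama and ZWJ.
_LEGACY = re.compile("([" + "".join(ATOMIC_CHILLU) + "])" + VIRAMA + ZWJ)


def to_atomic_chillu(text: str) -> str:
    """Replace legacy ZWJ chillu sequences with atomic chillu characters."""
    return _LEGACY.sub(lambda m: ATOMIC_CHILLU[m.group(1)], text)
```

Note that this only helps while the ZWJ is still present. Once the ZWJ has been stripped, a bare consonant + ് is also a legitimate sequence, so it cannot be converted automatically without risking wrong names.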

jinoytommanjaly commented 2 years ago

Most of the panchayat labels were imported into Wikidata from Malayalam Wikipedia long ago, by bots or by users, so most of them have ZWJ characters in the label. In Wikidata, every entity can have one label, and several other names can be added as aliases. So we can use the names without ZWJ characters in the label field and the names with ZWJ characters as aliases.

I have created a Google Sheet that lists the panchayat labels from Wikidata that don't match the labels in Wikipedia: https://docs.google.com/spreadsheets/d/1U8cCNhUx7u_nKOvuLbuwvTjpOW-2GAnS/edit#gid=1579957521
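A comparison like this can be sketched in a few lines. The code below is not how the sheet was produced; it assumes the labels from both sources have already been exported into two dictionaries keyed by the same item ID, and the function name `mismatched_labels` and the IDs in the usage example are made up.

```python
ZWJ = "\u200d"


def mismatched_labels(wikidata, wikipedia):
    """List items whose Wikidata label differs from the Wikipedia label.

    Both arguments are dicts mapping an item ID to a label string.
    Each result row is (item_id, wikidata_label, wikipedia_label, has_zwj),
    where has_zwj says whether the Wikidata label contains a ZWJ.
    """
    rows = []
    for item_id in sorted(wikidata.keys() & wikipedia.keys()):
        wd_label = wikidata[item_id]
        wp_label = wikipedia[item_id]
        if wd_label != wp_label:
            rows.append((item_id, wd_label, wp_label, ZWJ in wd_label))
    return rows


# Hypothetical sample data: the first item differs only in chil encoding.
sample_wd = {"item-1": "\u0d32\u0d4d\u200d", "item-2": "\u0d15"}
sample_wp = {"item-1": "\u0d7d", "item-2": "\u0d15"}
for row in mismatched_labels(sample_wd, sample_wp):
    print(row)
```

The `has_zwj` flag makes it easy to separate chil encoding differences from genuine spelling differences when reviewing the list.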