FIAF / modelling-workshops

Modelling Workshops

Label translations. #17

Closed paulduchesne closed 1 year ago

paulduchesne commented 1 year ago

All entities need label translations for the three officially supported FIAF languages (English, French and Spanish). A small script to report the fill rate per language would be very handy.

paulduchesne commented 1 year ago

Small Python script to check label coverage by language:

import rdflib

def label_language(entity, lang):
    ''' Return the entity if it has exactly one label in the selected language, else None. '''

    # Only literals carry a language tag, so guard against non-literal label values.
    match = len([o for s, p, o in graph.triples((entity, rdflib.RDFS.label, None)) if getattr(o, 'language', None) == lang])
    return entity if match == 1 else None

# Load the ontology straight from the repository.
graph = rdflib.Graph()
graph.parse('https://raw.githubusercontent.com/FIAF/modelling-workshops/main/ontology.ttl', format='ttl')

# Collect every class and property declared in the ontology.
entities = [s for s, p, o in graph.triples((None, rdflib.RDF.type, rdflib.OWL.Class))]
entities += [s for s, p, o in graph.triples((None, rdflib.RDF.type, rdflib.OWL.ObjectProperty))]
entities += [s for s, p, o in graph.triples((None, rdflib.RDF.type, rdflib.OWL.DatatypeProperty))]

# Report the label fill rate for each supported language.
for l in ['en', 'es', 'fr']:
    tr = [label_language(x, l) for x in entities]
    tr = len([x for x in tr if x is not None])
    print(f'{tr} of {len(entities)} entities with {l} translated label: {round((tr/len(entities))*100, 2)}%')

Current result:

945 of 1053 entities with en translated label: 89.74%
6 of 1053 entities with es translated label: 0.57%
6 of 1053 entities with fr translated label: 0.57%

paulduchesne commented 1 year ago

Many of the French label translations are now in place. Three large groups remain to tackle programmatically: activity, country and language.

paulduchesne commented 1 year ago

English and French translations now complete, Spanish at around 50%.

paulduchesne commented 1 year ago

Label translations now complete for three languages.