chin-rcip / CRITERIA

Cidoc cRm In Triples mERmaid dIagrAms
MIT License

IndexError with ontology type #6

Closed ncarboni closed 4 years ago

ncarboni commented 4 years ago

I am testing the script and it works great with type instances. However, I have problems using it with the type "ontology". If I run the command python criteria.py ontology turtle_name.ttl names.mmd, this is what I get:

Traceback (most recent call last):
  File "criteria.py", line 257, in <module>
    main(args.Type, args.rdf, args.mmd)
  File "criteria.py", line 241, in main
    ontology(rdf, mmd)
  File "criteria.py", line 220, in ontology
    clType = re.findall('\["crm:.*"]:::.*', stmt)[0] # get the class part
IndexError: list index out of range

the content of turtle_name.ttl is:

@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix frbr: <http://iflastandards.info/ns/fr/frbr/frbroo/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://www.srdm.org/actor/E21> a crm:E21_Person ;
    crm:P1_is_identified_by <https://www.srdm.org/name/fie_10_1>,
        <https://www.srdm.org/name/fie_1_1>,
        <https://www.srdm.org/name/fie_5_1> ;
    crm:P2_has_type <https://www.srdm.org/type/fie_17_1> .

<https://www.srdm.org/name/fie_17_1> a crm:E55_Type .

<http://vocab.getty.edu/page/aat/300404670> a crm:E55_Type ;
    rdfs:label "preferred terms" .

<http://www.srdm.org/type/fie_2_1> a crm:E55_Type .

<https://www.srdm.org/actor/fie_15_2> a crm:E39_Actor .

<https://www.srdm.org/actor/fie_3_2> a crm:E39_Actor .

<https://www.srdm.org/conceptual_object/fie_16_1> a crm:E73_Information_Object .

<https://www.srdm.org/conceptual_object/fie_4_1> a crm:E73_Information_Object .

<https://www.srdm.org/event/fie_15_1> a crm:E13_Attribute_Assignment ;
    crm:P140_assigned_attribute_to <https://www.srdm.org/event/fie_13_1> ;
    crm:P14_carried_out_by <https://www.srdm.org/actor/fie_15_2> ;
    crm:P16_used_specific_object <https://www.srdm.org/conceptual_object/fie_16_1> .

<https://www.srdm.org/event/fie_3_1> a crm:E15_Type_Assignment ;
    crm:P14_carried_out_by <https://www.srdm.org/actor/fie_3_2> ;
    crm:P16_used_specific_object <https://www.srdm.org/conceptual_object/fie_4_1> .

<https://www.srdm.org/name/fie_10_1> a crm:E33_E41_Linguistic_Appellation ;
    frbr:R64i_was_name_used_by <https://www.srdm.org/event/fie_13_1> ;
    crm:P141i_was_assigned_by <https://www.srdm.org/event/fie_15_1> ;
    crm:P190_has_symbolic_content "content" ;
    crm:p2_has_type <https://www.srdm.org/type/fie_11_1> ;
    crm:p72_has_language <https://www.srdm.org/name/fie_12_1> .

<https://www.srdm.org/name/fie_12_1> a crm:E56_Language .

<https://www.srdm.org/name/fie_1_1> a crm:E42_Identifier ;
    crm:P190_has_symbolic_content "content" ;
    crm:P2_has_type <http://www.srdm.org/type/fie_2_1> ;
    crm:P37i_was_assigned_by <https://www.srdm.org/event/fie_3_1> .

<https://www.srdm.org/name/fie_5_1> a crm:E33_E41_Linguistic_Appellation ;
    crm:P190_has_symbolic_content "content" ;
    crm:p106_is_composed_of <https://www.srdm.org/name/fie_7_1> ;
    crm:p2_has_type <http://vocab.getty.edu/page/aat/300404670> ;
    crm:p72_has_language <https://www.srdm.org/name/fie_6_1> .

<https://www.srdm.org/name/fie_6_1> a crm:E56_Language .

<https://www.srdm.org/name/fie_7_1> a crm:E33_E41_Linguistic_Appellation ;
    rdfs:label "content" ;
    crm:p2_has_type <https://www.srdm.org/type/fie_8_1> .

<https://www.srdm.org/time_span/fie_13_2> a crm:E52_Time-Span ;
    crm:P82a_begin_of_the_begin ""^^xsd:dateTime ;
    crm:P82b_end_of_the_end ""^^xsd:dateTime .

<https://www.srdm.org/type/fie_11_1> a crm:E55_Type .

<https://www.srdm.org/type/fie_8_1> a crm:E55_Type .

<https://www.srdm.org/event/fie_13_1> a frbr:F52_Name_Use_Activity ;
    crm:P4_has_time-span <https://www.srdm.org/time_span/fie_13_2> .

Any ideas what the problem is?

TrangDg commented 4 years ago

Thank you @ncarboni for raising the issue. I just fixed the IndexError by editing the regular expression that finds the classes. The previous expression only matched classes with the crm: prefix, so it missed classes with other prefixes, such as frbroo.
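To illustrate the failure mode, here is a minimal sketch, not the actual patch. It assumes Mermaid node statements shaped like the one the pattern in the traceback targets; the sample statements, the node class name classNode, the helper class_part, and the replacement pattern are all hypothetical.

```python
import re

# Pattern taken from line 220 of the traceback: only matches the crm: prefix.
OLD_PATTERN = r'\["crm:.*"]:::.*'
# Hypothetical prefix-agnostic variant (an assumption, not the committed fix).
NEW_PATTERN = r'\["[\w-]+:.*"]:::.*'

def class_part(stmt, pattern):
    """Return the class part of a Mermaid statement, or None if absent."""
    matches = re.findall(pattern, stmt)
    # Indexing [0] on an empty list is what raised the IndexError.
    return matches[0] if matches else None

# Hypothetical statements; the real generated Mermaid lines may differ.
crm_stmt = 'n1["crm:E21_Person"]:::classNode'
frbroo_stmt = 'n2["frbroo:F52_Name_Use_Activity"]:::classNode'

print(class_part(crm_stmt, OLD_PATTERN))
print(class_part(frbroo_stmt, OLD_PATTERN))
print(class_part(frbroo_stmt, NEW_PATTERN))
```

Returning None instead of indexing blindly also lets the caller report which statement failed to match, rather than stopping with a bare IndexError.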

However, there are a few things you need to edit on your end:

  • Make sure the classes you use in your Turtle file match those defined in the ontologies' RDF files. For instance:

    • E33_E41_Linguistic_Appellation isn't defined in the CIDOC CRM RDF file in ./src/ontologies.
    • E15 should be E15_Identifier_Assignment, not E15_Type_Assignment, in your .ttl; otherwise you would need to edit the CIDOC CRM RDF file instead.
  • I suspect that https://www.srdm.org/name/fie_17_1 should be https://www.srdm.org/type/fie_17_1.
  • In the script, the prefix for <http://iflastandards.info/ns/fr/frbr/frbroo/> is hardcoded as frbroo rather than frbr, so you should also update this in your .ttl. I'm planning to eventually fix this so that the script uses the prefixes defined in the input Turtle file.
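As a rough idea of how prefixes could be read from the input instead of being hardcoded, here is a minimal sketch using only the standard library. It is not how criteria.py works today: read_prefixes and shorten are hypothetical helpers, and the regular expression only handles simple @prefix lines, so it is not a full Turtle parser.

```python
import re

# Matches simple Turtle declarations such as: @prefix crm: <http://...> .
PREFIX_LINE = re.compile(r'^\s*@prefix\s+([\w-]*):\s*<([^>]*)>\s*\.', re.MULTILINE)

def read_prefixes(turtle_text):
    """Return {namespace IRI: prefix} for every @prefix line in the text."""
    return {iri: prefix for prefix, iri in PREFIX_LINE.findall(turtle_text)}

def shorten(iri, prefixes):
    """Rewrite a full IRI as prefix:local, preferring the longest namespace."""
    for namespace in sorted(prefixes, key=len, reverse=True):
        if iri.startswith(namespace):
            return f"{prefixes[namespace]}:{iri[len(namespace):]}"
    return iri  # no declared namespace matches; leave the IRI untouched

sample = """\
@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix frbr: <http://iflastandards.info/ns/fr/frbr/frbroo/> .
"""
prefixes = read_prefixes(sample)
print(shorten("http://iflastandards.info/ns/fr/frbr/frbroo/F52_Name_Use_Activity", prefixes))
```

With this approach the class would be labelled with whatever prefix the input file declares (frbr here), so the .ttl would not need to be edited to match the script.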

ncarboni commented 4 years ago

Thank you! I only checked whether the syntax was OK, and indeed I did not notice the errors. I will have to go through all the Turtle files and rework them a bit, but that aside, it works perfectly now! :-) Thanks!