RMLio / yarrrml-parser

A YARRRML parser library and CLI in Javascript
MIT License

`Not a valid (absolute) IRI` when passing rules to `rmlmapper -s turtle` #204

Open diegoquintanav opened 8 months ago

diegoquintanav commented 8 months ago

Issue type: :bug: Bug

The code is currently at https://github.com/diegoquintanav/pinochet-analyze-50/

I have the following dataset:

individual_id,group_id,start_date_daily,end_date_daily,start_date_monthly,end_date_monthly,last_name,first_name,minor,age,male,occupation,occupation_detail,victim_affiliation,victim_affiliation_detail,violence,method,interrogation,torture,mistreatment,targeted,press,war_tribunal,number_previous_arrests,perpetrator_affiliation,perpetrator_affiliation_detail,nationality,page,place,location,latitude,longitude,exact_coordinates,location_n,geometry,location_id,event_id
1,1,1973-09-12,1973-09-12,1973-09-01,1973-09-01,Corredera Reyes,Mercedes del Pilar,True,,False,School Student,high school,,,Killed,Gun,,,,,False,False,,,,Chilean,159,,Calle Gran Avenida,-33.501343,-70.65424,False,1,0101000020E610000000000020DFA951C0000000002CC040C0,995,1
1,1,1973-09-12,1973-09-12,1973-09-01,1973-09-01,Corredera Reyes,Mercedes del Pilar,True,,False,School Student,high school,,,Killed,Gun,,,,,False,False,,,,Chilean,159,,Medical Legal Institute (by the Barros Luco Hospital),-33.484123,-70.64641,True,2,0101000020E6100000000000C05EA951C0000000C0F7BD40C0,429,2
2,2,1973-09-11,1973-09-12,1973-09-01,1973-09-01,Torres Torres,Benito Heriberto,False,57.0,True,Blue Collar,plumbing installer,,,Killed,Gun,,,,,False,False,,Regime,Policemen,Chilean,159-60,,Santiago,-33.44889,-70.66927,False,1,0101000020E610000000000060D5AA51C00000004075B940C0,2,3
2,2,1973-09-11,1973-09-12,1973-09-01,1973-09-01,Torres Torres,Benito Heriberto,False,57.0,True,Blue Collar,plumbing installer,,,Killed,Gun,,,,,False,False,,Regime,Policemen,Chilean,159-60,,Towards the 26th police station,-33.447845,-70.73953,False,2,0101000020E61000000000008054AF51C00000000053B940C0,1190,4
2,2,1973-09-11,1973-09-12,1973-09-01,1973-09-01,Torres Torres,Benito Heriberto,False,57.0,True,Blue Collar,plumbing installer,,,Killed,Gun,,,,,False,False,,Regime,Policemen,Chilean,159-60,,found in Las Barrancas ,-33.44222,-70.75389,False,3,0101000020E6100000000000C03FB051C0000000A09AB840C0,1215,5
3,3,1973-09-12,1973-09-12,1973-09-01,1973-09-01,Lira Morales,Juan Manuel,False,23.0,True,White Collar,office worker,,,Killed,Gun,False,False,,,False,False,,Regime,Military ,Chilean,160,,La Legua shantytown,-33.48722,-70.63556,False,1,0101000020E610000000000000ADA851C0000000405DBE40C0,29,6
3,3,1973-09-12,1973-09-12,1973-09-01,1973-09-01,Lira Morales,Juan Manuel,False,23.0,True,White Collar,office worker,,,Killed,Gun,False,False,,,False,False,,Regime,Military ,Chilean,160,,Barros Luco Hospital,-33.484123,-70.64641,True,2,0101000020E6100000000000C05EA951C0000000C0F7BD40C0,80,7
4,4,1973-09-12,1973-09-14,1973-09-01,1973-09-01,Fontela Alonso,Alberto Mariano,False,26.0,True,Blue Collar,small fisherman,,,Disappearance,,,,,,False,False,,Regime,Military (Tacna Regiment) ,Chilean,160,,Tacna Regiment,-33.596478,-70.704575,True,1,0101000020E6100000000000C017AD51C00000006059CC40C0,401,8
5,5,1973-09-12,1973-09-12,1973-09-01,1973-09-01,Quintilliano Cardozo,Tulio Roberto,False,29.0,True,Blue Collar,engineer,Opposition,Communist party,Disappearance,,True,,,,False,False,,Regime,Military ,Chilean,160-61,,Military Academy,-33.411545,-70.584206,True,1,0101000020E6100000000000A063A551C000000080ADB440C0,296,9

And this mapping:

---
prefixes:
    grel: "http://users.ugent.be/~bjdmeest/function/grel.ttl#"
    rdfs: "http://www.w3.org/2000/01/rdf-schema#"
    schema: "https://schema.org/"
    d2s: "https://w3id.org/d2s/"
    e: "http://myontology.com/"
    dbo: "https://dbpedia.org/ontology/"
    rettig: "https://pinochet-rettig.301621.xyz/ontology#"

mappings:
    victims:
        graph: rettig:Victims
        sources:
            - ["data/stg_pinochet__base.csv~csv"]
        s: $(individual_id)
        po:
            - [a, "rettig:Victim"]
            - ["rettig:firstName", $(first_name)]
            - ["rettig:lastName", $(last_name)]
            - ["rettig:age", $(age), "xsd:integer"]
    locations:
        graph: rettig:Locations
        sources:
            - ["data/stg_pinochet__base.csv~csv"]
        s: $(location_id)
        po:
            - [a, "rettig:Location"]
            - ["rettig:latitude", $(latitude)]
            - ["rettig:longitude", $(longitude)]
            - ["rettig:isExactCoordinates", $(exact_coordinates)]
            - ["rettig:locationOrder", $(location_n)]
    events:
        graph: rettig:Events
        sources:
            - ["data/stg_pinochet__base.csv~csv"]
        s: $(event_id)
        po:
            - [a, "rettig:Event"]
            - ["rettig:startDate", $(start_date_monthly)]
            - ["rettig:endDate", $(end_date_monthly)]
            - p: rettig:hasPersonInvolved
              o:
                  mapping: victims
                  condition:
                      function: equal
                      parameters:
                          - [str1, $(event_id), s]
                          - [str2, $(individual_id), o]
            - p: rettig:hasLocation
              o:
                  mapping: locations
                  condition:
                      function: equal
                      parameters:
                          - [str1, $(event_id), s]
                          - [str2, $(location_id), o]

When I generate the RML rules with

yarrrml-parser -i mapping.yarrrml.yml -o rules.rml.ttl

and then run the mapper on those rules to produce Turtle output with

docker run --rm -v $(pwd):/data rmlmapper -m rules.rml.ttl -s turtle > mapped.rml.ttl

I get this error:

08:52:14.295 [main] ERROR be.ugent.rml.cli.Main               .run(416) - Not a valid (absolute) IRI: null1

I think this happens because empty values in the source CSV lead to empty triples, as described in https://github.com/RMLio/rmlmapper-java/issues/74. The age column is empty (NULL in the original data) in the first two rows.

I noticed that if I serialize to N-Quads instead, with

docker run --rm -v $(pwd):/data rmlmapper -m rules.rml.ttl -s nquads

the mapping succeeds and the triple for the empty age value is skipped:

<null1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://pinochet-rettig.301621.xyz/ontology#Victim> <https://pinochet-rettig.301621.xyz/ontology#Victims>.
<null1> <https://pinochet-rettig.301621.xyz/ontology#firstName> "Mercedes del Pilar" <https://pinochet-rettig.301621.xyz/ontology#Victims>.
<null1> <https://pinochet-rettig.301621.xyz/ontology#lastName> "Corredera Reyes" <https://pinochet-rettig.301621.xyz/ontology#Victims>.
<null1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://pinochet-rettig.301621.xyz/ontology#Victim> <https://pinochet-rettig.301621.xyz/ontology#Victims>.
<null1> <https://pinochet-rettig.301621.xyz/ontology#firstName> "Mercedes del Pilar" <https://pinochet-rettig.301621.xyz/ontology#Victims>.
<null1> <https://pinochet-rettig.301621.xyz/ontology#lastName> "Corredera Reyes" <https://pinochet-rettig.301621.xyz/ontology#Victims>.
<null2> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://pinochet-rettig.301621.xyz/ontology#Victim> <https://pinochet-rettig.301621.xyz/ontology#Victims>.
<null2> <https://pinochet-rettig.301621.xyz/ontology#firstName> "Benito Heriberto" <https://pinochet-rettig.301621.xyz/ontology#Victims>.
<null2> <https://pinochet-rettig.301621.xyz/ontology#lastName> "Torres Torres" <https://pinochet-rettig.301621.xyz/ontology#Victims>.
<null2> <https://pinochet-rettig.301621.xyz/ontology#age> "57.0"^^<http://www.w3.org/2001/XMLSchema#integer> <https://pinochet-rettig.301621.xyz/ontology#Victims>.
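
A possible reading of this output, offered as an assumption rather than a confirmed diagnosis: the subjects are serialized as `<null1>` and `<null2>`, which would be consistent with the subject `s: $(individual_id)` producing a relative IRI that is resolved against a base IRI that was never set. If that is the case, the empty age values may not be the cause, since the N-Quads output already omits those triples. A minimal sketch of the `victims` mapping with an absolute subject IRI could look like this; the `victim_` local-name pattern is hypothetical and not part of the original mapping:

```yaml
prefixes:
    rettig: "https://pinochet-rettig.301621.xyz/ontology#"

mappings:
    victims:
        graph: rettig:Victims
        sources:
            - ["data/stg_pinochet__base.csv~csv"]
        # Assumption: prepending a declared prefix makes the subject an
        # absolute IRI, e.g. https://pinochet-rettig.301621.xyz/ontology#victim_1
        # ("victim_" is a hypothetical naming choice).
        s: rettig:victim_$(individual_id)
        po:
            - [a, "rettig:Victim"]
            - ["rettig:firstName", $(first_name)]
            - ["rettig:lastName", $(last_name)]
            - ["rettig:age", $(age), "xsd:integer"]
```

The same kind of change would apply to the subjects of the `locations` and `events` mappings. This sketch has not been run against the dataset above.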

Description

See above

Steps

See above

Environment

yarrrml-parser --version
1.6.1

node --version
v18.15.0