Open justin2004 opened 3 years ago
it looks like this example: https://github.com/RMLio/RML-Processor/tree/master/src/test/resources/example5 would do what i am trying to do but i can't get it to run with this new repository.
03:14:40.975 [main] ERROR be.ugent.rml.cli.Main .main(179) - Unable to parse mapping rules as Turtle. Does the file exist and is it valid Turtle?
Hi Justin,
The reason joining with rr:child "items[*].id" does not work is that the reference "items[*].id" will return multiple values (the id of each item). A join condition expects a reference that returns only one value.
You can, however, use the "items[*].id" reference without a join condition to get the required result. Namely, you can use it directly to create the objects of the schema:contains predicate:
:TriplesMapItems rr:predicateObjectMap [
    rr:predicate schema:contains;
    rr:objectMap [ rr:template "http://characters.com/items/{items[*].id}" ]
].
Which gives the desired output:
<http://characters.com/posessions/char/0/possessions/> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Collection>.
<http://characters.com/posessions/char/0/possessions/> <http://schema.org/contains> <http://characters.com/items/10>.
<http://characters.com/posessions/char/0/possessions/> <http://schema.org/contains> <http://characters.com/items/11>.
<http://characters.com/posessions/char/1/possessions/> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Collection>.
<http://characters.com/posessions/char/1/possessions/> <http://schema.org/contains> <http://characters.com/items/12>.
<http://characters.com/posessions/char/1/possessions/> <http://schema.org/contains> <http://characters.com/items/13>.
<http://characters.com/posessions/char/1/possessions/> <http://schema.org/contains> <http://characters.com/items/14>.
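To illustrate why the template works where the join does not, here is a minimal Python sketch. It is not RML and not how the mapper is implemented; it only mimics the idea that a multi-valued reference inside a template yields one object per value. The `DATA` literal is an assumption reconstructed from the output above (the real input is in the attachment), and the subject template and the name `contains_triples` are hypothetical.

```python
# Hypothetical input, reconstructed from the output above.
DATA = {
    "characters": [
        {"id": "0", "items": [{"id": "10"}, {"id": "11"}]},
        {"id": "1", "items": [{"id": "12"}, {"id": "13"}, {"id": "14"}]},
    ]
}

# Assumed templates, inferred from the IRIs in the output above.
SUBJECT = "http://characters.com/posessions/char/{}/possessions/"
ITEM = "http://characters.com/items/{}"
CONTAINS = "http://schema.org/contains"


def contains_triples(data):
    """Yield one schema:contains triple per value returned by items[*].id."""
    for character in data["characters"]:
        subject = SUBJECT.format(character["id"])
        # items[*].id is multi-valued: one id per nested item.
        item_ids = [item["id"] for item in character["items"]]
        # The template is expanded once per value, so each id becomes its own object.
        for item_id in item_ids:
            yield (subject, CONTAINS, ITEM.format(item_id))


for s, p, o in contains_triples(DATA):
    print(f"<{s}> <{p}> <{o}>.")
```

A join condition, by contrast, compares a single child value against a single parent value, which is why the same multi-valued reference cannot be used there.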
The full mappings I used as well as their input and output are attached:
hi @thomas-delva , thanks! that does work as long as i have a unique identifier for every object in a child array.
but if my input was:
{
  "characters": [
    {
      "id": "0",
      "firstname": "Ash",
      "items": [
        {"name": "gloves", "weight": 340},
        {"name": "sword", "weight": 44400}
      ]
    },
    {
      "id": "1",
      "firstname": "Misty",
      "items": [
        {"name": "gloves", "weight": 340},
        {"name": "mittens", "weight": 300},
        {"name": "hat", "weight": 800}
      ]
    }
  ]
}
and if i want this output:
<http://characters.com/posessions/char/0/possessions/> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Collection>.
<http://characters.com/posessions/char/0/possessions/> <http://schema.org/contains> _:b0 .
<http://characters.com/posessions/char/0/possessions/> <http://schema.org/contains> _:b1 .
<http://characters.com/posessions/char/1/possessions/> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Collection>.
<http://characters.com/posessions/char/1/possessions/> <http://schema.org/contains> _:b2 .
<http://characters.com/posessions/char/1/possessions/> <http://schema.org/contains> _:b3 .
<http://characters.com/posessions/char/1/possessions/> <http://schema.org/contains> _:b4 .
_:b0 <http://schema.org/weight> "340".
_:b0 <http://schema.org/name> "gloves".
_:b1 <http://schema.org/weight> "44400".
_:b1 <http://schema.org/name> "sword".
_:b2 <http://schema.org/weight> "340".
_:b2 <http://schema.org/name> "gloves".
_:b3 <http://schema.org/weight> "300".
_:b3 <http://schema.org/name> "mittens".
_:b4 <http://schema.org/weight> "800".
_:b4 <http://schema.org/name> "hat".
how can i achieve that?
The case without a single identifier field for nested items in the child array is currently not possible with RML and JSONPath.
We will update this issue once a solution becomes available within RML.
but it seems like this old example is doing something like that.
it picks out the appropriate objects (the nested children) without referencing the template URI: https://github.com/RMLio/RML-Processor/blob/47644026c41f8a7da3a80a63907bb3040402805b/src/test/resources/example5/museum-model.rml.ttl#L75
the old repository allowed
rml:iterator "$.[*].Sitter";
even though Sitter is an array.
but with this new repo it seems like you must:
rml:iterator "$.[*].Sitter[*]";
but then you get a cartesian product output which isn't desirable:
<http://ex.com/Neil%20Armstrong> <http://www.w3.org/2000/01/rdf-schema#label> "Neil Armstrong".
<http://ex.com/Buzz%20Aldrin> <http://www.w3.org/2000/01/rdf-schema#label> "Buzz Aldrin".
<http://ex.com/Michael%20Collins> <http://www.w3.org/2000/01/rdf-schema#label> "Michael Collins".
<http://ex.com/Neil%20Armstrong> <http://www.w3.org/2000/01/rdf-schema#label> "Neil Armstrong".
<http://ex.com/Henry%20Larcom%20Abbot> <http://www.w3.org/2000/01/rdf-schema#label> "Henry Larcom Abbot".
<http://ex.com/NPG_70_36> <http://example.org/example/P62_depicts> <http://ex.com/Neil%20Armstrong>.
<http://ex.com/NPG_70_36> <http://example.org/example/P62_depicts> <http://ex.com/Buzz%20Aldrin>.
<http://ex.com/NPG_70_36> <http://example.org/example/P62_depicts> <http://ex.com/Michael%20Collins>.
<http://ex.com/NPG_70_36> <http://example.org/example/P62_depicts> <http://ex.com/Neil%20Armstrong>.
<http://ex.com/NPG_70_36> <http://example.org/example/P62_depicts> <http://ex.com/Henry%20Larcom%20Abbot>.
<http://ex.com/NPG_70_36> <http://example.org/example/P102_has_title> "Apollo 11 Crew".
<http://ex.com/NPG_70_36> <http://example.org/example/P48_has_preferred_identifier> "NPG_70_36".
<http://ex.com/S_NPG_2010_51> <http://example.org/example/P62_depicts> <http://ex.com/Neil%20Armstrong>.
<http://ex.com/S_NPG_2010_51> <http://example.org/example/P62_depicts> <http://ex.com/Buzz%20Aldrin>.
<http://ex.com/S_NPG_2010_51> <http://example.org/example/P62_depicts> <http://ex.com/Michael%20Collins>.
<http://ex.com/S_NPG_2010_51> <http://example.org/example/P62_depicts> <http://ex.com/Neil%20Armstrong>.
<http://ex.com/S_NPG_2010_51> <http://example.org/example/P62_depicts> <http://ex.com/Henry%20Larcom%20Abbot>.
<http://ex.com/S_NPG_2010_51> <http://example.org/example/P102_has_title> "Neil Armstrong".
<http://ex.com/S_NPG_2010_51> <http://example.org/example/P48_has_preferred_identifier> "S_NPG_2010_51".
<http://ex.com/NPG_92_127> <http://example.org/example/P62_depicts> <http://ex.com/Neil%20Armstrong>.
<http://ex.com/NPG_92_127> <http://example.org/example/P62_depicts> <http://ex.com/Buzz%20Aldrin>.
<http://ex.com/NPG_92_127> <http://example.org/example/P62_depicts> <http://ex.com/Michael%20Collins>.
<http://ex.com/NPG_92_127> <http://example.org/example/P62_depicts> <http://ex.com/Neil%20Armstrong>.
<http://ex.com/NPG_92_127> <http://example.org/example/P62_depicts> <http://ex.com/Henry%20Larcom%20Abbot>.
<http://ex.com/NPG_92_127> <http://example.org/example/P102_has_title> "Henry Larcom Abbot".
<http://ex.com/NPG_92_127> <http://example.org/example/P48_has_preferred_identifier> "NPG_92_127".
i used this stripped down mapping:
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ql: <http://semweb.mmlab.be/ns/ql#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://example.org/rules/> .
@prefix ex: <http://example.org/example/> .
@prefix schema: <http://schema.org/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
:SitterMapping a rr:TriplesMap ;
    rml:logicalSource [
        rml:source "src/test/resources/example5/museum.json";
        rml:referenceFormulation ql:JSONPath;
        rml:iterator "$.[*].Sitter[*]";
    ];
    rr:subjectMap [
        rr:template "http://ex.com/{Name}";
    ];
    rr:predicateObjectMap [
        rr:predicate rdfs:label;
        rr:objectMap [
            rml:reference "Name"
        ]
    ].

:ArtworkMapping a rr:TriplesMap ;
    rml:logicalSource [
        rml:source "src/test/resources/example5/museum.json";
        rml:referenceFormulation ql:JSONPath;
        rml:iterator "$.[*]"
    ];
    rr:subjectMap [
        rr:template "http://ex.com/{Ref}";
    ];
    rr:predicateObjectMap [
        rr:predicate ex:P102_has_title;
        rr:objectMap [
            rml:reference "Title"
        ]
    ];
    rr:predicateObjectMap [
        rr:predicate ex:P48_has_preferred_identifier;
        rr:objectMap [
            rml:reference "Ref"
        ]
    ];
    rr:predicateObjectMap [
        rr:predicate ex:P62_depicts;
        rr:objectMap [
            rr:parentTriplesMap :SitterMapping;
        ];
    ].
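The cross product can be reproduced outside RML with a small Python sketch. It is only an illustration of what the output above is consistent with, not a description of the mapper's internals. The `MUSEUM` literal is a hypothetical stand-in for museum.json, reconstructed from the triples above (the real file is not shown here), and both function names are made up for this example.

```python
# Hypothetical stand-in for museum.json, reconstructed from the output above.
MUSEUM = [
    {"Ref": "NPG_70_36", "Title": "Apollo 11 Crew",
     "Sitter": [{"Name": "Neil Armstrong"},
                {"Name": "Buzz Aldrin"},
                {"Name": "Michael Collins"}]},
    {"Ref": "S_NPG_2010_51", "Title": "Neil Armstrong",
     "Sitter": [{"Name": "Neil Armstrong"}]},
    {"Ref": "NPG_92_127", "Title": "Henry Larcom Abbot",
     "Sitter": [{"Name": "Henry Larcom Abbot"}]},
]


def unconditioned_links(artworks):
    """Link every artwork to every sitter, as the output above shows."""
    # "$.[*].Sitter[*]" flattens all sitters into one list; the owning artwork is lost.
    all_sitters = [s["Name"] for a in artworks for s in a["Sitter"]]
    # With no join condition, nothing ties a sitter back to its artwork.
    return [(a["Ref"], name) for a in artworks for name in all_sitters]


def nested_links(artworks):
    """Link each artwork only to the sitters nested inside it (the desired result)."""
    return [(a["Ref"], s["Name"]) for a in artworks for s in a["Sitter"]]


print(len(unconditioned_links(MUSEUM)))  # 15: 3 artworks x 5 flattened sitters
print(len(nested_links(MUSEUM)))         # 5: one link per nested sitter
```

The second function is what the old `"$.[*].Sitter"` iterator effectively gave: the sitter stays attached to the artwork it was nested in.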
if it would help to see an implementation reference: JSON2RDF is able to do this conversion in a "lossless [manner] with the exception of array ordering and some datatype round-tripping."
it produced these triples:
[ <https://example.com/#characters>
[ <https://example.com/#firstname>
"Misty" ;
<https://example.com/#id> "1" ;
<https://example.com/#items> [ <https://example.com/#name> "hat" ;
<https://example.com/#weight> "800"^^<http://www.w3.org/2001/XMLSchema#int>
] ;
<https://example.com/#items> [ <https://example.com/#name> "mittens" ;
<https://example.com/#weight> "300"^^<http://www.w3.org/2001/XMLSchema#int>
] ;
<https://example.com/#items> [ <https://example.com/#name> "gloves" ;
<https://example.com/#weight> "340"^^<http://www.w3.org/2001/XMLSchema#int>
]
] ;
<https://example.com/#characters>
[ <https://example.com/#firstname>
"Ash" ;
<https://example.com/#id> "0" ;
<https://example.com/#items> [ <https://example.com/#name> "sword" ;
<https://example.com/#weight> "44400"^^<http://www.w3.org/2001/XMLSchema#int>
] ;
<https://example.com/#items> [ <https://example.com/#name> "gloves" ;
<https://example.com/#weight> "340"^^<http://www.w3.org/2001/XMLSchema#int>
]
]
] .
note the resultant triples respect the json ancestry so it is clear that misty's gloves and ash's gloves aren't necessarily the same thing.
though i don't think that repo uses jsonpath; it uses JsonParser instead, so maybe it isn't useful as a reference.
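For anyone who wants to see the ancestry-preserving idea in isolation, here is a rough Python sketch of a recursive walk that mints one blank node per JSON object. It is not JSON2RDF's actual implementation and not RML. The function name `json_to_triples` and the blank-node labels are assumptions made for this example, every literal is emitted as a plain string (no datatypes), and arrays nested directly inside arrays are not handled.

```python
import itertools
import json

BASE = "https://example.com/#"  # placeholder namespace, matching the output above

INPUT = {
    "characters": [
        {"id": "0", "firstname": "Ash",
         "items": [{"name": "gloves", "weight": 340},
                   {"name": "sword", "weight": 44400}]},
        {"id": "1", "firstname": "Misty",
         "items": [{"name": "gloves", "weight": 340},
                   {"name": "mittens", "weight": 300},
                   {"name": "hat", "weight": 800}]},
    ]
}


def json_to_triples(obj, counter, triples):
    """Mint a blank node for the dict `obj`, append its triples, return the node label."""
    node = f"_:b{next(counter)}"
    for key, value in obj.items():
        predicate = f"<{BASE}{key}>"
        # Arrays become repeated predicates, so array ordering is lost.
        values = value if isinstance(value, list) else [value]
        for v in values:
            if isinstance(v, dict):
                # A nested object gets its own blank node, linked from its parent.
                child = json_to_triples(v, counter, triples)
                triples.append((node, predicate, child))
            else:
                # Scalars are written as plain string literals in this sketch.
                triples.append((node, predicate, json.dumps(str(v))))
    return node


triples = []
root = json_to_triples(INPUT, itertools.count(), triples)
for s, p, o in triples:
    print(s, p, o, ".")
```

Because each nested object gets a fresh node, the two "gloves" items end up as two different subjects, each reachable only from its own character.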
Hi Justin,
You correctly noticed that the current behavior deviates from the older example. The issue with handling nested source data has been identified before by the broader RML community. We raised this issue as a challenge in the community group and we expect that solutions will be proposed by the end of the month. Once we have our solution, we will notify you to check.
thanks for the links, @thomas-delva .
in the meantime i'm going to use bob's approach described here: http://www.bobdc.com/blog/partialschemas/
though i would prefer to use RML.
if anyone else has a similar json -> triples need: https://github.com/justin2004/rml-testing
note that i am not concerned about using correct properties yet. i am just concerned with structure.
characters.json:
rml:
which produces the triples i expect:
but notice i had to hardcode to the 0th item:
i can manually iterate like:
and
and it works as expected.
but if i want to do them all at once:
then i lose the ?s http://schema.org/contains ?o matching triples: