RMLio / RML-Mapper

Generate High Quality Linked Data from multiple originally (semi-)structured data (legacy)
http://RML.io

problems with remote rml:source #4

Closed seralf closed 8 years ago

seralf commented 8 years ago

Hi

I'm testing the examples in the RML-Processor repo before starting actual development. With JSON files downloaded locally, everything runs well. However, if I try to use the same data directly from a URL, the URL is treated as if it were a local path and used to construct a file.

For example I'm getting this error:

672 [main] ERROR LocalFileProcessor  - IO Exception: java.io.FileNotFoundException: /MY_PATH/http:/ewi.mmlab.be/cd/raw/ingelmunster/persoon.json (File o directory non esistente)
672 [main] DEBUG StdRMLEngine  - Null input data derived from http://ewi.mmlab.be/cd/raw/ingelmunster/persoon.json
Generating RDF triples for file:/MY_PATH/ex01/ex01_mapping.rml.ttl#PersonMappingCSV

using the mapping:

<#PersonMappingJSON>
    rml:logicalSource [
        #rml:source "input/persoon.json" ;
        rml:source "http://ewi.mmlab.be/cd/raw/ingelmunster/persoon.json" ;
        rml:referenceFormulation ql:JSONPath ;
        rml:iterator "$.[*]" 
    ] ;

    rr:subjectMap [
        rr:template "http://example.org/Person/{Naam}_{Voornaam}";
        rr:class ex:Person 
    ] ;
.

and the command line:

$ java -jar $rml_dir/RML-Processor-0.3.jar -m testing.rml.ttl -o EXPORT/test_output.nt

I've tried to explain the problem briefly, so I hope the explanation is clear.

Thank you in advance for any suggestion.
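
For readers hitting the same error: the `http:/` with a single slash in the log above is what one would expect if the URL were joined onto the working directory and then normalized as a file path. The Python sketch below only reproduces that string. It is not RML-Processor code, and `resolve_as_local_path` and `is_remote` are hypothetical helper names introduced for illustration.

```python
import posixpath
from urllib.parse import urlparse


def resolve_as_local_path(base_dir: str, source: str) -> str:
    """Treat `source` as a relative file name under `base_dir` (hypothetical helper)."""
    # Joining keeps the URL text; normalizing collapses "//" into "/".
    return posixpath.normpath(posixpath.join(base_dir, source))


def is_remote(source: str) -> bool:
    """Return True when `source` carries an http(s) scheme rather than a file path."""
    return urlparse(source).scheme in ("http", "https")


url = "http://ewi.mmlab.be/cd/raw/ingelmunster/persoon.json"
print(resolve_as_local_path("/MY_PATH", url))
print(is_remote(url), is_remote("input/persoon.json"))
```

A check along the lines of `is_remote` is one way a tool could tell the two kinds of source apart before touching the file system. As the reply below explains, RML instead expects the access method to be described explicitly in the mapping.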

andimou commented 8 years ago

RML expects "proper" source descriptions, where "proper" means using the vocabularies that are meant to advertise how to access a data source. See http://rml.io/RMLdataRetrieval.html, the corresponding publication (http://dl.acm.org/citation.cfm?id=2814873), or the unit tests from example 9 onwards.

seralf commented 8 years ago

Hi, thank you for the reply. Nice work with the source; I had completely missed the point here, sorry.

I'm now trying to construct a really small example based on a proper Hydra description of an existing API. At the moment I'm stuck on something that should at least be syntactically correct, but actually gives me no triples in the result :-) So I'm probably still missing something.

I'll paste a small snippet here, so that we have something shared that is easier to comment on.

@prefix rr:     <http://www.w3.org/ns/r2rml#>.
@prefix rml:    <http://semweb.mmlab.be/ns/rml#> .
@prefix ql:     <http://semweb.mmlab.be/ns/ql#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#>.
@prefix adms:   <http://www.w3.org/ns/adms#>.
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix vcard:  <http://www.w3.org/2006/vcard/ns#> .
@prefix dcterms:<http://purl.org/dc/terms/> .
@prefix schema: <http://schema.org/>.
@prefix person: <http://www.w3.org/ns/person#>.
@prefix rov:    <http://www.w3.org/ns/regorg#>.
@prefix locn:   <http://www.w3.org/ns/locn#>.
@prefix org:    <http://www.w3.org/ns/org#>.
@prefix foaf:   <http://foaf.org/>.
@prefix ex:    <http://example.org/data/>.
@prefix hydra: <http://www.w3.org/ns/hydra/core#> .
@prefix rml: <http://semweb.mmlab.be/rml#> .
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .

<#API_template_source> a hydra:IriTemplate ;
    hydra:template "https://biblio.ugent.be/publication/{?id}?format={?format}";
    hydra:mapping [
        a hydra:TemplateMapping ;
        hydra:variable "id";
        hydra:required false
    ],
    [
        a hydra:TemplateMapping ;
        hydra:variable "format";
        hydra:required true
    ];
.

<#TriplesItem>

    rml:logicalSource [
        rml:source <#API_template_source> ;
        rml:referenceFormulation ql:JSONPath ;
        rml:iterator "$.[*]"
    ] ;

    rr:subjectMap [
        rr:template "http://example.org/Article/{id}";
        rr:class ex:Article
    ] ;

    rr:predicateObjectMap [
        rr:predicate foaf:account;
        rr:objectMap [
            rml:bindCondition [
                # maybe ? rml:constant "6956293" ;
                rml:reference "_id" ;
                rml:condition "id"
            ],
            [ rr:constant "json" ; rml:condition "format" ]
        ] ;
    ] ;
.

I've also tried a mapping with no variables at all, starting from the entire list (https://biblio.ugent.be/publication?format=json). Again there were no syntax errors, but no results either.

Can you give me some suggestions on that? Once I've understood how to start gathering data from a JSON list, I think I should be able to create multiple mappings for the various resources accordingly. My aim is to create a working example that I can then use as a starting point for our API.

Thank you in advance!

Alfredo
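
One thing worth checking in the snippet above is the template string itself. Hydra IRI templates default to RFC 6570 syntax, where `{?id}` is a form-style query expansion that already emits `?id=...`. The Python sketch below implements only the two RFC 6570 operators needed here (simple `{var}` and query `{?var}`) to show what the posted template would expand to under that reading. Whether RML-Mapper interprets the template this way is not confirmed anywhere in this thread, and `ALTERNATIVE_TEMPLATE` is a hypothetical rewrite, not a tested fix.

```python
import re
from urllib.parse import quote

# Matches {name}, {a,b}, {?name} and {?a,b}; other RFC 6570 operators are not handled.
_EXPRESSION = re.compile(r"\{(\??)([A-Za-z0-9_]+(?:,[A-Za-z0-9_]+)*)\}")


def expand(template: str, values: dict) -> str:
    """Expand a small subset of RFC 6570: simple and form-style query expressions."""

    def replace(match: re.Match) -> str:
        is_query = match.group(1) == "?"
        names = match.group(2).split(",")
        # Undefined variables are skipped, as RFC 6570 prescribes.
        defined = [(n, str(values[n])) for n in names if values.get(n) is not None]
        if is_query:
            if not defined:
                return ""
            return "?" + "&".join(f"{n}={quote(v, safe='')}" for n, v in defined)
        return ",".join(quote(v, safe="") for _, v in defined)

    return _EXPRESSION.sub(replace, template)


POSTED_TEMPLATE = "https://biblio.ugent.be/publication/{?id}?format={?format}"
ALTERNATIVE_TEMPLATE = "https://biblio.ugent.be/publication/{id}{?format}"  # hypothetical

values = {"id": "6956293", "format": "json"}
print(expand(POSTED_TEMPLATE, values))
print(expand(ALTERNATIVE_TEMPLATE, values))
```

Under these assumptions the posted template yields a URL with two `?` separators and a repeated `format=`, while the alternative yields `https://biblio.ugent.be/publication/6956293?format=json`.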



seralf commented 8 years ago

I'll reply to myself with a small update/correction, in case it's useful to others too.

The problem seems to have been an error in the path selection criteria (the rml:iterator expression). For example, the following works for me for retrieving information from the list:

# prefixes ...

<#Input> a hydra:IriTemplate ;
    hydra:template "https://biblio.ugent.be/publication?format=json" .

<#PaperAuthor>
    rml:logicalSource [ 
        rml:source <#Input>;
        rml:referenceFormulation ql:JSONPath;
        rml:iterator "$.hits[*].author[*]";
    ] ;

    rr:subjectMap [
        rr:template "http://www.ex.com/paper/{_id}"; 
    ] ;

    rr:predicateObjectMap [
        rr:predicate ex:author ;
        rr:objectMap [ 
            rml:reference "name" ;
        ]
    ] ;
    .

Starting from that, I will explore mapping specific sub-paths using parametrization.
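
To make the iterator change concrete, here is a small Python sketch that mimics by hand what the two iterators select. The `SAMPLE` document is a hypothetical shape inferred only from the working iterator `$.hits[*].author[*]`; the real API response is not shown in this thread. It also assumes that `$.[*]` selects the children of the root, as `$[*]` does in JSONPath.

```python
from typing import Any, Iterator, List

# Hypothetical response shape, inferred from the iterator "$.hits[*].author[*]".
SAMPLE = {
    "total": 2,
    "hits": [
        {"_id": "p1", "author": [{"name": "A. Author"}, {"name": "B. Writer"}]},
        {"_id": "p2", "author": [{"name": "C. Scholar"}]},
    ],
}


def children(node: Any) -> Iterator[Any]:
    """Yield what a JSONPath wildcard `[*]` selects directly under `node`."""
    if isinstance(node, dict):
        yield from node.values()
    elif isinstance(node, list):
        yield from node


def root_items(doc: Any) -> List[Any]:
    """Items visited by an iterator over the root's children (the first attempt)."""
    return list(children(doc))


def author_items(doc: dict) -> List[dict]:
    """Items visited by "$.hits[*].author[*]" (the working mapping)."""
    return [
        author
        for hit in children(doc.get("hits", []))
        for author in children(hit.get("author", []))
    ]


print(root_items(SAMPLE))
print([author["name"] for author in author_items(SAMPLE)])
```

With this shape, iterating over the root's children visits a number and a list rather than individual records, so references such as `_id` or `name` have nothing to resolve against. The deeper iterator visits one author object per iteration, so `name` resolves.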