zazuko / xrm

A friendly language for mappings to RDF
MIT License

Support for using TemplateValuedTerm for literals (rr:termType) #30

Closed mchlrch closed 4 years ago

mchlrch commented 4 years ago

By default, rr:template generates IRIs [1]. If a literal is to be created instead, then the term type [2] has to be set.

[1] https://www.w3.org/TR/r2rml/#from-template
[2] https://www.w3.org/TR/r2rml/#termtype

DSL: Currently, the DSL doesn't support choosing the termType.

Generator: Currently, the generator doesn't write rr:termType, so the result relies on the default behavior described in the spec [2].

Example R2RML output for producing a literal based on a template:

rr:predicateObjectMap [
    rr:predicate skos:notation ;
    rr:objectMap [
        rr:template "{CODE}" ;
        rr:termType rr:Literal;
    ];
]
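
The default behavior the generator currently relies on can be sketched as follows. This is a minimal Python illustration of the rule in the term type section of the spec [2], not code from the generator; the function and parameter names are made up for the example.

```python
from typing import Optional


def default_term_type(is_object_map: bool,
                      column_based: bool = False,
                      has_language: bool = False,
                      has_datatype: bool = False) -> str:
    """Term type that applies when a term map has no rr:termType property.

    Per R2RML: rr:Literal if the term map is an object map and is
    column-based, or has rr:language, or has rr:datatype; otherwise rr:IRI.
    """
    if is_object_map and (column_based or has_language or has_datatype):
        return "rr:Literal"
    return "rr:IRI"


def effective_term_type(explicit: Optional[str], **term_map_facts) -> str:
    """An explicit rr:termType wins; otherwise fall back to the default."""
    if explicit is not None:
        return explicit
    return default_term_type(**term_map_facts)


# A template-valued object map without rr:termType yields an IRI ...
print(effective_term_type(None, is_object_map=True))
# ... so rr:termType rr:Literal has to be written to get a literal.
print(effective_term_type("rr:Literal", is_object_map=True))
```

A template-valued object map never falls into the literal branch unless it also carries a language tag or datatype, which is why the explicit term type is needed for the skos:notation example.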

nicky508 commented 4 years ago

Implemented term types for subject maps and predicate-object maps, for both referenced and templated value maps:

Referenced value map: schema.value from vector_dbo with datatype xsd.int as Literal;

Templated value map: bot.containsZone template "http://data.monroein.firegraph.store/data/fd/objects/{0}/extFeature/{1}" with pin_18 gid as IRI;

Subjectmap: subject template "http://data.monroein.firegraph.store/data/fd/objects/{0}" with pin_18 as BlankNode;

It is a bit unclear whether a subject map can have a BlankNode term type without also having a template, so implementations differ in how they handle it. The RMLio implementation allows a subject map with only the term type BlankNode (as does our own implementation). CARML requires a template in addition to the BlankNode term type, because it uses the template to create the blank node.

mchlrch commented 4 years ago

@nicky508 Is the implementation of this feature finished?

> It is a bit unclear whether a subjectmap could only have a blanknode termtype without having the template. So it varies among implementation how it is implemented.

Is there a problem with any of the implementations if there is a template and blanknode termtype? Is it a problem that the DSL currently requires a template?

mchlrch commented 4 years ago

@nicky508 I saw you added two enums in the grammar. Why do we need both of them?

enum TermTypeEnum:
 Unspecified="unspecified" | Literal='Literal' | IRI='IRI' | BlankNode='BlankNode';

enum TermType returns TermTypeEnum:
 Literal='Literal' | IRI='IRI' | BlankNode='BlankNode';

nicky508 commented 4 years ago

> @nicky508 Is the implementation of this feature finished?
>
> > It is a bit unclear whether a subjectmap could only have a blanknode termtype without having the template. So it varies among implementation how it is implemented.
>
> Is there a problem with any of the implementations if there is a template and blanknode termtype? Is it a problem that the DSL currently requires a template?

No, I guess it will work in any implementation. But keep in mind that the behaviour could differ between implementations.

nicky508 commented 4 years ago

> @nicky508 I saw you added two enums in the grammar. Why do we need both of them?
>
> enum TermTypeEnum:
>  Unspecified="unspecified" | Literal='Literal' | IRI='IRI' | BlankNode='BlankNode';
>
> enum TermType returns TermTypeEnum:
>  Literal='Literal' | IRI='IRI' | BlankNode='BlankNode';

Yes, I added two enums. The problem with enums is that when no value is given in the mapping, the attribute always returns the first literal of the enum. After some research I found this construction, which makes sure that unspecified is returned when generating: http://blog.dietmar-stoll.de/2013/11/default-enum-literals-for-xtext.html
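
The effect of the two-enum construction can be illustrated with a small Python analogy. This is not Xtext or EMF code; it only assumes, as described above, that an unset enum attribute reports the first declared literal. The class and function names are hypothetical.

```python
from enum import Enum


class TermType(Enum):
    # Declared first, so it is what an unset attribute reports.
    UNSPECIFIED = "unspecified"
    LITERAL = "Literal"
    IRI = "IRI"
    BLANK_NODE = "BlankNode"


def first_literal(enum_cls):
    """Stand-in for the default of an unset enum attribute."""
    return next(iter(enum_cls))  # Enum iteration follows declaration order


class ValueMap:
    """Hypothetical model object; term_type is None when 'as ...' is absent."""

    def __init__(self, term_type=None):
        self.term_type = term_type if term_type is not None else first_literal(TermType)


def term_type_line(value_map):
    """Return the rr:termType line, or None when nothing was specified."""
    if value_map.term_type is TermType.UNSPECIFIED:
        return None
    return f"rr:termType rr:{value_map.term_type.value} ;"


print(term_type_line(ValueMap()))                  # no 'as ...' in the mapping
print(term_type_line(ValueMap(TermType.LITERAL)))  # 'as Literal'
```

Without the leading sentinel, the first real literal would be reported for every mapping that omits the term type, and the generator could not tell an explicit choice from an omitted one.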

mchlrch commented 4 years ago

I integrated this feature into master. In the grammar I used the TermTypeRef wrapper-object pattern instead of the two enums. This is cleaner in the grammar, and the generator templates also turn out more idiomatic.

Sample DSL, using as Literal:

map Sector from easygov_sectors {
    subject template "https://permits.zazukoians.org/sectors/{0}" with CODE;

    types bdb.Sector skos.Concept;

    properties
        skos.notation template "{0}" with CODE as Literal;
}

Sample R2RML output:

<#Sector>
    rr:logicalTable [ rr:tableName "easygov-sectors.csv" ];

    rr:subjectMap [
        rr:template "https://permits.zazukoians.org/sectors/{CODE}";
        rr:class bdb:Sector ;
        rr:class skos:Concept ;
    ];

    rr:predicateObjectMap [
        rr:predicate skos:notation ;
        rr:objectMap [
            rr:template "{CODE}" ;
            rr:termType rr:Literal ;
        ];
    ]
.
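
The step from the DSL template to the R2RML template in the samples above, where the positional placeholder {0} becomes the column placeholder {CODE}, can be sketched like this. It is a minimal Python illustration, not the actual generator; the function name is hypothetical, and escaping of literal braces or backslashes as required by R2RML is deliberately left out.

```python
import re
from typing import Sequence


def to_r2rml_template(template: str, columns: Sequence[str]) -> str:
    """Replace positional placeholders {0}, {1}, ... by {COLUMN} placeholders.

    `columns` holds the names listed after 'with' in the DSL, in order.
    An index without a matching column raises IndexError.
    """
    def substitute(match):
        index = int(match.group(1))
        return "{" + columns[index] + "}"

    return re.sub(r"\{(\d+)\}", substitute, template)


print(to_r2rml_template("https://permits.zazukoians.org/sectors/{0}", ["CODE"]))
print(to_r2rml_template("{0}", ["CODE"]))
```

The term type is independent of this substitution: the same template string can yield an IRI or, with as Literal, a literal.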

mchlrch commented 4 years ago

I'm leaving this issue open. According to the spec:

The value must be an IRI and must be one of the following options:

  • If the term map is a subject map: rr:IRI or rr:BlankNode

The DSL currently allows assigning as Literal to the subject. To keep the generated output compliant with the spec, this should be rejected by validation and excluded from the content-assist proposals.
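
One way to keep validation and proposals in sync is to derive both from a single table of allowed term types. The following is a minimal Python sketch under that assumption, not the Xtext validator; the table follows the term type rules of the R2RML spec, while the function names and the message text are hypothetical.

```python
# Allowed rr:termType values per kind of term map, following R2RML.
ALLOWED_TERM_TYPES = {
    "subject": ("IRI", "BlankNode"),
    "predicate": ("IRI",),
    "object": ("IRI", "BlankNode", "Literal"),
    "graph": ("IRI",),
}


def proposals(position):
    """Term types that code assist should offer after 'as'."""
    return list(ALLOWED_TERM_TYPES[position])


def validate(position, term_type):
    """Return an error message, or None if the term type is allowed."""
    allowed = ALLOWED_TERM_TYPES[position]
    if term_type in allowed:
        return None
    return (f"Term type {term_type} is not valid on a {position} map; "
            f"use {' or '.join(allowed)}.")


print(validate("subject", "Literal"))
print(proposals("subject"))
```

Because both functions read the same table, a term type that is never proposed for a position is also always rejected there.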

nnamtug commented 4 years ago

I am not 100% sure I have fully understood the open point.

Assumption: validation should flag the use of a Literal term type in a subject map (therefore csv). If that is the case, it has been implemented in the meantime; please see:

https://github.com/zazuko/rdf-mapping-dsl/blob/bda63d6d3eb6e42164ad09f69b00973311679a9f/runtime-EclipseXtext/editor-test/editortest-csv.xrm#L27

https://github.com/zazuko/rdf-mapping-dsl/blob/bda63d6d3eb6e42164ad09f69b00973311679a9f/runtime-EclipseXtext/editor-test/editortest-csv.xrm#L29

For now, the validation raises a warning, not an error. The seco-bdb data set still contains such declarations (see also the warnings in the inner Eclipse):

https://github.com/zazuko/rdf-mapping-dsl/blob/bda63d6d3eb6e42164ad09f69b00973311679a9f/runtime-EclipseXtext/seco-bdb/professions-mapping.xrm#L14

https://github.com/zazuko/rdf-mapping-dsl/blob/bda63d6d3eb6e42164ad09f69b00973311679a9f/runtime-EclipseXtext/seco-bdb/sectors-mapping.xrm#L15

@mchlrch Please check - does the existing validation cover the requirement?

mchlrch commented 4 years ago

The open point is that as Literal on the subject should always be invalid. Only the term types BlankNode and IRI are valid for the subject.

In the following mapping, validation should raise an error on Literal.

output rml

map EmployeeMapping from EMPLOYEE {
    subject template "http://airport.example.com/{0}" with id as Literal;

And code assist should not propose Literal on (x), but only BlankNode and IRI:

output rml

map EmployeeMapping from EMPLOYEE {
    subject template "http://airport.example.com/{0}" with id as (x);

mchlrch commented 4 years ago

I tested this and it works as expected, thanks!