RMLio / yarrrml-parser

A YARRRML parser library and CLI in Javascript
MIT License

warn in case language: is given and datatype: is given but DIFFERENT FROM rdf:langString (and don't warn in case language: is given and datatype: is given and EQUAL TO rdf:langString) #170

Open mvanbrab opened 2 years ago

mvanbrab commented 2 years ago

Issue type: :unicorn: Feature

Description

In my use case the YARRRML file is generated automatically and always provides a datatype. For that purpose, my earlier feature request #160 was already implemented in v1.3.5. The new implementation gives a warning for that case:

Datatype http://www.w3.org/1999/02/22-rdf-syntax-ns#langString is ignored when combined with language tag (nl).

That warning is rather useless and appears very frequently in my use case.

On the other hand, when another datatype (such as xsd:string) is specified in combination with a language tag, there is no warning. The behaviour of the parser in this case is to ignore the language tag and keep the datatype, which is OK. However, in this case, a warning like this one would be appreciated:

Language tag (nl) is ignored when combined with datatype http://www.w3.org/2001/XMLSchema#string.

Summary: add second warning, remove first one.
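
A minimal sketch of the requested check, not the parser's actual code. The function name `languageDatatypeWarning` and the constant `RDF_LANGSTRING` are hypothetical, and the sketch assumes the datatype has already been expanded to a full IRI:

```javascript
// Hypothetical sketch; this is not the actual yarrrml-parser implementation.
const RDF_LANGSTRING = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#langString';

/**
 * Returns a warning message when a language tag is combined with a datatype
 * other than rdf:langString, and null in every other case.
 * Assumption: `datatype` is an already expanded IRI, not a prefixed name.
 */
function languageDatatypeWarning(language, datatype) {
  // Nothing to report unless both keys are present.
  if (!language || !datatype) return null;
  // rdf:langString together with a language tag is redundant but harmless.
  if (datatype === RDF_LANGSTRING) return null;
  // Any other datatype wins, so the language tag is dropped.
  return `Language tag (${language}) is ignored when combined with datatype ${datatype}.`;
}

console.log(languageDatatypeWarning('nl', 'http://www.w3.org/2001/XMLSchema#string'));
```

With the example data further down, the `ex:allowed` object would produce no message and the `ex:not-allowed` object would produce the proposed warning.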

Why it is useful

Useful when debugging missing language tags in mapper output, in the use case of automatically generated YARRRML files.

Existing features it breaks

None.

Example data

YARRRML input file:

prefixes:
  ex: http://www.example.com/

sources:
  src:
    access: issue.csv
    referenceFormulation: csv
    encoding: utf-8

mappings:
  issue:
    sources:
      - src
    s: ex:$(id)
    po:
      - p: ex:allowed
        o:
          value: this is a good combination of language and datatype
          language: nl
          datatype: rdf:langString
      - p: ex:not-allowed
        o:
          value: this is a bad combination of language and datatype
          language: nl
          datatype: xsd:string

RML output file (is OK):

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fnml: <http://semweb.mmlab.be/ns/fnml#>.
@prefix fno: <https://w3id.org/function/ontology#>.
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#>.
@prefix void: <http://rdfs.org/ns/void#>.
@prefix dc: <http://purl.org/dc/terms/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix : <http://mapping.example.com/>.
@prefix ex: <http://www.example.com/>.

:rules_000 a void:Dataset.
:source_000 a rml:LogicalSource;
    rdfs:label "src";
    rml:source "issue.csv";
    rml:referenceFormulation ql:CSV.
:rules_000 void:exampleResource :map_issue_000.
:map_issue_000 rml:logicalSource :source_000;
    a rr:TriplesMap;
    rdfs:label "issue".
:s_000 a rr:SubjectMap.
:map_issue_000 rr:subjectMap :s_000.
:s_000 rr:template "http://www.example.com/{id}".
:pom_000 a rr:PredicateObjectMap.
:map_issue_000 rr:predicateObjectMap :pom_000.
:pm_000 a rr:PredicateMap.
:pom_000 rr:predicateMap :pm_000.
:pm_000 rr:constant ex:allowed.
:pom_000 rr:objectMap :om_000.
:om_000 a rr:ObjectMap;
    rr:constant "this is a good combination of language and datatype";
    rr:termType rr:Literal;
    rml:languageMap :language_000.
:language_000 rr:constant "nl".
:pom_001 a rr:PredicateObjectMap.
:map_issue_000 rr:predicateObjectMap :pom_001.
:pm_001 a rr:PredicateMap.
:pom_001 rr:predicateMap :pm_001.
:pm_001 rr:constant ex:not-allowed.
:pom_001 rr:objectMap :om_001.
:om_001 a rr:ObjectMap;
    rr:constant "this is a bad combination of language and datatype";
    rr:termType rr:Literal;
    rr:datatype <http://www.w3.org/2001/XMLSchema#string>.

pheyvaer commented 2 years ago

For me it's fine to add a new warning, but I would not remove the current one because the fact that it's useless to you personally is not a valid reason to remove it.

mvanbrab commented 2 years ago

It's useless to everybody, right?

pheyvaer commented 2 years ago

How do you know that?

mvanbrab commented 2 years ago

Can you explain the added value of that warning anyway? That situation doesn't lead anyone to unexpected mapping results; it's just telling you that you're over-specifying. If the new warning, which signals a situation that does lead to unexpected mapping results, is added, it will probably go unnoticed among all these other warnings.

pheyvaer commented 2 years ago

It explains to the user that they are doing something redundant, while they might think that they need to specify the datatype when they specify the language. I would be fine with logging this as "info" instead of "warning".

mvanbrab commented 2 years ago

That would be a good compromise indeed. To be effective, it would need a command line switch that sets the logging level. No such switch is available at the moment.
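
A rough sketch of how such level filtering could work. It is purely illustrative: `createLogger`, the level names and the threshold argument are hypothetical and do not correspond to any existing yarrrml-parser option.

```javascript
// Hypothetical sketch; yarrrml-parser does not currently offer such a switch.
// Lower number = more severe.
const LEVELS = new Map([['error', 0], ['warn', 1], ['info', 2]]);

/**
 * Creates a log function that only emits messages whose level is at least
 * as severe as `threshold`. `sink` receives the formatted line.
 */
function createLogger(threshold = 'warn', sink = console.error) {
  if (!LEVELS.has(threshold)) {
    throw new Error(`Unknown log level: ${threshold}`);
  }
  return function log(level, message) {
    // Unknown levels are silently dropped.
    if (!LEVELS.has(level)) return false;
    if (LEVELS.get(level) > LEVELS.get(threshold)) return false;
    sink(`${level}: ${message}`);
    return true;
  };
}

// With the default threshold, the redundant-datatype note (info) is hidden
// and the ignored-language-tag message (warn) is still shown.
const log = createLogger('warn');
log('info', 'Datatype rdf:langString is redundant when combined with a language tag.');
log('warn', 'Language tag (nl) is ignored when combined with datatype xsd:string.');
```

Under this scheme the redundant case stays available to anyone who raises the threshold to `info`, while the default output only shows messages that affect the mapping result.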