RMLio / rmlmapper-java

The RMLMapper executes RML rules to generate high quality Linked Data from multiple originally (semi-)structured data sources
http://rml.io
MIT License

Turtle: Missing Whitespace before Semicolon in Predicate Lists? #188

Closed tobiasschweizer closed 1 year ago

tobiasschweizer commented 1 year ago

Hi there,

I ran into a problem loading RDF serialised as Turtle (parsing exception). I figured that the problem was missing whitespace before a semicolon in a predicate list.

Example input (test.xml):

<result source="corda" type="relatedResult">

    <availableLanguages readOnly="true">en</availableLanguages>

    <rcn>4321</rcn>

    <id>1243</id>

    <title>Some title</title>

    <details>

        <authors><author>A Bergström</author>
        </authors>

        <journalTitle>Some Journal</journalTitle>

        <journalNumber>31/12</journalNumber>

        <publisher>Some Academic Publishers</publisher>

        <publishedYear>2016</publishedYear>

        <publishedPages>1243-1264</publishedPages>

    </details>

</result>

Mapping (mapping.ttl):

PREFIX rr: <http://www.w3.org/ns/r2rml#>
PREFIX rml: <http://semweb.mmlab.be/ns/rml#>
PREFIX ql: <http://semweb.mmlab.be/ns/ql#>
PREFIX carml: <http://carml.taxonic.com/carml/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX premis: <http://id.loc.gov/vocabulary/preservation/>
PREFIX schema: <http://schema.org/>
@base <http://example.com/ns#>.

<#LogicalSourceArticle> a rml:BaseSource ;
    rml:source "test.xml";
    rml:referenceFormulation ql:XPath ;
    rml:iterator "/result[@type='relatedResult']" .

<#article> a rr:TriplesMap ;
    rml:logicalSource <#LogicalSourceArticle> ;

    rr:subjectMap [
        rr:template "https://data.connectome.ch/publication/{id}" ;
        rr:class schema:ScholarlyArticle ;
    ] ;

    rr:predicateObjectMap [
        rr:predicate schema:name ;
        rr:objectMap [
            rml:reference "title" ;
        ];
    ] ;

    rr:predicateObjectMap [
        rr:predicate schema:identifier ;
        rr:objectMap [
            rml:reference "id" ;
        ];
    ] .

Output:

@prefix schema: <http://schema.org/> .

<https://data.connectome.ch/publication/1243> a schema:ScholarlyArticle; # no whitespace before semicolon
  schema:identifier "1243"; # no whitespace before semicolon
  schema:name "Some title" . # whitespace before `.`

Command: java -jar rmlmapper-6.0.0-r363-all.jar -m mapping.ttl -s turtle

I noticed that each `.` is preceded by whitespace while the semicolons (predicate lists) are not. However, in the examples in the docs the semicolons are preceded by whitespace as well:

<http://example.org/#spiderman> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/#green-goblin> ;
    <http://xmlns.com/foaf/0.1/name> "Spiderman" .

Should the Turtle serialiser put a whitespace char before the semicolons? Thanks for your feedback.

bjdmeest commented 1 year ago

Normally, whether or not there is whitespace before a semicolon shouldn't matter for valid Turtle. We use RDF4J for the serialization, so I'm afraid that's a bit out of our control.

tobiasschweizer commented 1 year ago

No worries, I think it is rather an issue with the parser, which should be able to deal with that.
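
For readers stuck with a downstream parser that cannot be fixed, one possible stopgap is to post-process the serialized Turtle and insert a space before each semicolon that separates predicates. The following is only a minimal sketch, not part of RMLMapper or RDF4J: the function name `space_before_semicolons` is hypothetical, and it assumes plain Turtle without RDF-star `<<` ... `>>` terms. It skips IRIs, string literals, comments, and backslash escapes so that semicolons inside them are left untouched.

```python
def space_before_semicolons(turtle: str) -> str:
    """Insert a space before each ';' that is outside IRIs, strings and comments.

    Hypothetical helper for illustration only; it is a small scanner,
    not a full Turtle parser.
    """
    out = []
    i, n = 0, len(turtle)
    while i < n:
        ch = turtle[i]
        if ch == "<":
            # IRI reference: copy verbatim through the closing '>'.
            j = turtle.find(">", i)
            j = n if j == -1 else j + 1
            out.append(turtle[i:j])
            i = j
        elif ch == "#":
            # Comment: copy verbatim up to (not including) the line break.
            j = turtle.find("\n", i)
            j = n if j == -1 else j
            out.append(turtle[i:j])
            i = j
        elif ch in "\"'":
            # String literal, short ("...") or long ("""..."""), either quote style.
            quote = ch * 3 if turtle.startswith(ch * 3, i) else ch
            j = i + len(quote)
            while j < n and not turtle.startswith(quote, j):
                # Skip an escaped character as a pair.
                j += 2 if turtle[j] == "\\" else 1
            j = min(n, j + len(quote))
            out.append(turtle[i:j])
            i = j
        elif ch == "\\":
            # Escape inside a prefixed name (e.g. '\;'): copy both characters.
            out.append(turtle[i:i + 2])
            i += 2
        elif ch == ";":
            # Predicate-list separator: add a space unless one is already there.
            if out and not out[-1][-1].isspace():
                out.append(" ")
            out.append(";")
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Applied to the output above, `schema:identifier "1243";` becomes `schema:identifier "1243" ;`, while a literal such as `"Some; title"` is left as it is. Running the function on already-spaced Turtle returns it unchanged.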