ckan / ckanext-dcat

RDF turtle not being parsed when containing return codes #63

Closed: ghost closed this issue 8 years ago

ghost commented 8 years ago

I'm having problems processing DCAT data (see the example DCAT data source). The data contains a "description" property whose value includes a carriage return, and that line break causes processing to fail. Using this file with the carriage returns taken out works perfectly.

amercader commented 8 years ago

This is the underlying RDFLib parser complaining about bad syntax:

Traceback (most recent call last):
  File "ckanext/dcat/processors.py", line 367, in <module>
    parser.parse(contents, _format=args.format)
  File "ckanext/dcat/processors.py", line 146, in parse
    raise RDFParserException(e)
__main__.RDFParserException: at line 15 of <>:
Bad syntax (newline found in string literal) at ^ in:
"...ems: Status, Trends, Pressures, and Conservation Priorities.^
BioFresh is an EU-funded international project that aims to..."

If you want to include line breaks in your ttl files you need to use long literals, i.e. three double quotes (""") around the value, as shown here on dct:description:

@prefix adms: <http://www.w3.org/ns/adms#> .
@prefix schema: <http://schema.org/> .
...

<http://data.freshwaterbiodiversity.eu/ipt#Catalog>
 a dcat:Catalog ;
dct:title "BioFresh" ;
dct:description """Central IPT installation for the BioFresh consortium, initiated under the EU FP7 project BioFresh: Biodiversity of Freshwater Ecosystems: Status, Trends, Pressures, and Conservation Priorities.^M
BioFresh is an EU-funded international project that aims to build a global information platform for scientists and ecosystem managers with access to all available databases describing the distribution, status and trends of global freshwater biodiversity. BioFresh integrates the freshwater biodiversity competencies and expertise of 19 research institutions.""" ;

...

See Section 2.1 of the Turtle specification: https://www.w3.org/TeamSubmission/turtle/
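
If the ttl file is generated by a script, the quoting rule above can be applied automatically. The following is a minimal sketch using only the Python standard library; `turtle_literal` is a hypothetical helper written for illustration, not part of ckanext-dcat or RDFLib, and it assumes plain string values without language tags or datatypes.

```python
def turtle_literal(text: str) -> str:
    """Return `text` quoted as a Turtle string literal.

    Hypothetical helper: single-line values become a short literal ("..."),
    values containing a line break become a long literal (three double quotes).
    """
    # Escape backslashes first, then double quotes, so nothing inside the
    # value can terminate the literal early.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    if "\n" in text or "\r" in text:
        # Raw line breaks are only allowed inside long literals.
        return '"""' + escaped + '"""'
    return '"' + escaped + '"'


# Example value with a Windows-style line break, as in the failing file.
description = "First paragraph.\r\nSecond paragraph."
print("dct:description " + turtle_literal(description) + " ;")
```

An alternative is to keep a short literal and replace the line breaks with the `\n` and `\r` escape sequences. Building the graph in RDFLib and letting it serialize the Turtle output also avoids hand-written quoting altogether.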