jpmccu / sadi

Automatically exported from code.google.com/p/sadi

sadi.py fails to parse valid Turtle #12

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
A Turtle file [1] that rapper parses without complaint fails to load when sent 
to sadi.py. After quite a few attempts to satisfy sadi.py (and rdflib), I've 
fallen back to dumbing the input down to RDF/XML.

== rapper happy ==

rapper -g -c add-metadata-materials/sample-inputs/congresspeople-tagged-government.ttl
rapper: Guessed parser name 'turtle'
rapper: Parsing returned 45 triples

== rdflib sad ==

curl -H "Content-Type: text/turtle" -d @add-metadata-materials/sample-inputs/congresspeople-tagged-government.ttl http://localhost:9090/add-metadata

rdflib.plugins.parsers.notation3.BadSyntax: at line 1 of <>:
Bad syntax (EOF found when expected verb in property list) at ^ in:
"...atafaqs#> .<http://dsi.lod-cloud.net/dataset/congresspeople>^ 
#<http://thedatahub.org/dataset/congresspeople>    a datafa..."

[1] https://github.com/timrdf/DataFAQs/blob/master/services/sadi/ckan/add-metadata-materials/sample-inputs/congresspeople-tagged-government.ttl

[2] https://github.com/timrdf/DataFAQs/blob/master/services/sadi/ckan/add-metadata.rpy

test of turtle file:
---------------------
@prefix dcterms:  <http://purl.org/dc/terms/>           .
@prefix foaf:     <http://xmlns.com/foaf/0.1/>          .
@prefix sioc:     <http://rdfs.org/sioc/ns#>            .
@prefix ov:       <http://open.vocab.org/terms/>        .
@prefix void:     <http://rdfs.org/ns/void#>            .
@prefix moat:     <http://moat-project.org/ns#>         .
@prefix datafaqs: <http://purl.org/twc/vocab/datafaqs#> .

<http://dsi.lod-cloud.net/dataset/congresspeople> 
#<http://thedatahub.org/dataset/congresspeople>
    a datafaqs:CKANDataset ;
   datafaqs:namespace <http://logd.tw.rpi.edu/source/contactingthecongress/dataset/directory-for-the-112th-congress>;
   foaf:isPrimaryTopicOf <http://dsi.lod-cloud.net/dataset/congresspeople>;
   dcterms:identifier "f4c2a8bb-6580-4919-98aa-617feb766b6c";

   ov:shortName       "congresspeople";
   a ov:DigitalAsset;

   a datafaqs:TaggedCKANDataset;
   moat:taggedWithTag <http://lod-cloud.net/tag/government>;
   a sioc:Item;

   void:vocabulary
                   <http://www.w3.org/2002/07/owl#>,
                   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>,
                   <http://www.w3.org/2000/01/rdf-schema#>,
                   <http://purl.org/dc/terms/>,
                   <http://xmlns.com/foaf/0.1/>,
                   <http://www.w3.org/2000/10/swap/pim/contact#>,
                   <http://dbpedia.org/property/>,
                   <http://dbpedia.org/ontology/>,
                   <http://rdfs.org/ns/void#>,
                   <http://open.vocab.org/terms/>,
                   <http://purl.org/vocab/vann/>,
                   <http://usefulinc.com/ns/doap#>,
                   <http://purl.org/NET/scovo#>,
                   <http://purl.org/twc/vocab/conversion/>,
                   <http://inference-web.org/2.0/pml-provenance.owl#>,
                   <http://inference-web.org/2.0/pml-justification.owl#>,
                   <http://logd.tw.rpi.edu/source/contactingthecongress/dataset/directory-for-the-112th-congress/vocab/>,
                   <http://logd.tw.rpi.edu/source/contactingthecongress/dataset/directory-for-the-112th-congress/vocab/enhancement/1/>;
   a void:Dataset;

   # links:dbpedia 67
   void:subset [
      a void:Linkset;
      void:target <http://dsi.lod-cloud.net/dataset/congresspeople>, 
                  <http://thedatahub.org/dataset/dbpedia> ;
      # http://dbpedia.org/resource
      void:triples 67;
   ];

   # links:geonames-semantic-web 50
   void:subset [
      a void:Linkset;
      void:target <http://dsi.lod-cloud.net/dataset/congresspeople>, 
                  <http://thedatahub.org/dataset/geonames-semantic-web> ;
      # http://sws.geonames.org
      void:triples 50;
   ];

   # links:govtrack 56
   void:subset [
      a void:Linkset;
      void:target <http://dsi.lod-cloud.net/dataset/congresspeople>, 
                  <http://thedatahub.org/dataset/govtrack> ;
      # http://www.rdfabout.com/rdf/usgov
      void:triples 56;
   ];

   dcterms:isPartOf <http://ckan.net/group/datafaqs>;
.

<http://ckan.net/group/datafaqs> a datafaqs:CKANGroup .

Original issue reported on code.google.com by tim...@gmail.com on 21 Dec 2011 at 5:02

GoogleCodeExporter commented 9 years ago
Setting the product tag to Product-PythonAPI instead of Product-JavaAPI as this 
is a Python issue and not a Java one.

Original comment by elmccar...@gmail.com on 21 Dec 2011 at 5:11

GoogleCodeExporter commented 9 years ago
We've narrowed it down to the "indented comments" such as:

   # links:dbpedia 67
   void:subset [
      a void:Linkset;
      void:target <http://dsi.lod-cloud.net/dataset/congresspeople>,
                  <http://thedatahub.org/dataset/dbpedia> ;
      # http://dbpedia.org/resource
      void:triples 67;
   ];

(It could also be a newline problem: if the newlines are being dropped, the rest of the file after the first # becomes a comment...)

Original comment by tim...@gmail.com on 21 Dec 2011 at 5:53
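
A rough way to check the newline theory without rdflib is sketched below. It is only an illustration, not the sadi.py or rdflib code path: strip_newlines and strip_turtle_comments are hypothetical helpers, the comment stripper is deliberately simplified (no escapes, no triple-quoted strings), and SAMPLE is a cut-down stand-in for the real file. If the newlines go missing, the first # outside an IRI swallows everything after the subject, which would match the "EOF found when expected verb" error at line 1.

```python
def strip_newlines(text: str) -> str:
    """Drop CR and LF, as a request body would look if its newlines were lost."""
    return text.replace("\r", "").replace("\n", "")


def strip_turtle_comments(text: str) -> str:
    """Remove '#' comments, leaving '#' inside <IRIs> and "strings" alone.

    Simplified on purpose: ignores escapes and triple-quoted strings.
    """
    out = []
    in_iri = in_string = in_comment = False
    for ch in text:
        if in_comment:
            # A comment runs to the end of the line (or of the input).
            if ch == "\n":
                in_comment = False
                out.append(ch)
            continue
        if in_iri:
            in_iri = ch != ">"
        elif in_string:
            in_string = ch != '"'
        elif ch == "<":
            in_iri = True
        elif ch == '"':
            in_string = True
        elif ch == "#":
            in_comment = True
            continue
        out.append(ch)
    return "".join(out)


# Cut-down stand-in for the failing file (assumption: same comment layout).
SAMPLE = """@prefix void: <http://rdfs.org/ns/void#> .
<http://dsi.lod-cloud.net/dataset/congresspeople>
#<http://thedatahub.org/dataset/congresspeople>
    a void:Dataset ;
    # links:dbpedia 67
    void:triples 67 .
"""

intact = strip_turtle_comments(SAMPLE)
flattened = strip_turtle_comments(strip_newlines(SAMPLE))

print(intact)
print(flattened)
```

With newlines intact the triples survive comment removal; in the flattened version only the prefix declaration and the bare subject remain, so a parser would hit end of input while still expecting a verb.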

GoogleCodeExporter commented 9 years ago
According to http://stackoverflow.com/a/7241617/438254, curl with -d tends to 
strip newlines from the data it sends. Having narrowed the failure down to 
that, I've added positive and negative tests for it to tests.py.

Original comment by mccus...@gmail.com on 26 Dec 2013 at 7:31
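
For anyone hitting the same thing: on the curl side, --data-binary @file sends the file as-is, where -d @file strips the newlines. From Python, a minimal sketch of a POST that keeps the newlines might look like the following. It assumes the service URL from the report above; build_turtle_post is a hypothetical helper and the payload is a made-up example, not the real input file.

```python
import urllib.request


def build_turtle_post(url: str, turtle_bytes: bytes) -> urllib.request.Request:
    """Build a POST whose body is the Turtle document, byte for byte."""
    return urllib.request.Request(
        url,
        data=turtle_bytes,
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )


# Made-up payload with a comment line; in practice read the .ttl file in
# binary mode, e.g. open(path, "rb").read(), so nothing is rewritten.
payload = b"<http://example.org/s>\n# a comment\n    a <http://example.org/C> .\n"

request = build_turtle_post("http://localhost:9090/add-metadata", payload)

# Sending is left out so the sketch runs without a server:
# response = urllib.request.urlopen(request)
print(request.get_method(), request.full_url, len(request.data))
```

Because the body is passed as bytes, the newlines that terminate the # comments reach the parser untouched.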