freme-project / e-Link

Apache License 2.0

Pipelining e-Entity + e-Link, the second service fails #69

Status: Closed (andish closed 7 years ago)

andish commented 8 years ago

Request:

POST /current/pipelining/chain HTTP/1.1
Host: api.freme-project.eu
Content-Type: application/json

[
  {
    "method": "POST",
    "endpoint": "http://api.freme-project.eu/current/e-entity/freme-ner/documents",
    "parameters": {
      "language": "en",
      "dataset": "dbpedia"
    },
    "headers": {
      "content-type": "text/plain",
      "accept": "text/turtle"
    },
    "body": "France and Germany came up with a plan"
  },
  {
    "method": "POST",
    "endpoint": "http://api.freme-project.eu/current/e-link/documents/",
    "parameters": {
      "templateid": "2"
    },
    "headers": {
      "content-type": "text/turtle",
      "accept": "text/turtle"
    }
  }
]

Response:

{
  "exception": "eu.freme.common.exception.BadRequestException",
  "path": "/e-link/documents/",
  "message": "It seems your SPARQL template is not correctly defined.",
  "error": "Bad Request",
  "status": 400,
  "timestamp": 1462266259397
}

This error is specific to the input text; with some other inputs there is no error.
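Since the failure depends on the input text, it can help to build the same two-step pipeline for different sentences and compare the results. Below is a minimal sketch that only constructs the request shown above; the helper names `build_pipeline` and `build_request` are made up for illustration and are not part of FREME.

```python
import json
import urllib.request

# Base URL taken from the request above.
API_BASE = "http://api.freme-project.eu/current"


def build_pipeline(text, language="en", dataset="dbpedia", template_id="2"):
    """Return the e-Entity -> e-Link pipeline as a list of step dicts."""
    return [
        {
            "method": "POST",
            "endpoint": f"{API_BASE}/e-entity/freme-ner/documents",
            "parameters": {"language": language, "dataset": dataset},
            "headers": {"content-type": "text/plain", "accept": "text/turtle"},
            "body": text,
        },
        {
            # The second step has no body: it consumes the output of step one.
            "method": "POST",
            "endpoint": f"{API_BASE}/e-link/documents/",
            "parameters": {"templateid": template_id},
            "headers": {"content-type": "text/turtle", "accept": "text/turtle"},
        },
    ]


def build_request(text):
    """Wrap the pipeline in a urllib Request (hypothetical helper).

    Nothing is sent until the caller passes it to urllib.request.urlopen().
    """
    payload = json.dumps(build_pipeline(text)).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/pipelining/chain",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Calling `urllib.request.urlopen(build_request("France and Germany came up with a plan"))` would then send the request from this report; swapping in another sentence makes it easy to see which inputs trigger the 400.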

jnehring commented 8 years ago

@ArneBinder please try to reproduce this bug and find out what's wrong.

ArneBinder commented 8 years ago

The problem seems to lie in com.hp.hpl.jena.query. When executing the following e-link request:

curl -X POST -H "Content-Type: text/turtle" -H "Accept: text/turtle" -d '@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix nif:   <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .

<http://freme-project.eu/#char=0,38>
        a               nif:RFC5147String , nif:Context , nif:String ;
        nif:beginIndex  "0"^^xsd:int , "0"^^xsd:nonNegativeInteger ;
        nif:endIndex    "38"^^xsd:int , "38"^^xsd:nonNegativeInteger ;
        nif:isString    "France and Germany came up with a plan"^^xsd:string .

<http://freme-project.eu/#char=0,6>
        a                     nif:RFC5147String , nif:Word , nif:String , nif:Phrase ;
        nif:anchorOf          "France"^^xsd:string ;
        nif:beginIndex        "0"^^xsd:int ;
        nif:endIndex          "6"^^xsd:int ;
        nif:referenceContext  <http://freme-project.eu/#char=0,38> ;
        itsrdf:taClassRef     <http://dbpedia.org/ontology/PopulatedPlace> , <http://dbpedia.org/ontology/Place> , <http://dbpedia.org/ontology/Location> , <http://dbpedia.org/ontology/Country> , <http://nerd.eurecom.fr/ontology#Location> ;
        itsrdf:taConfidence   "0.8533452900894164"^^xsd:double ;
        itsrdf:taIdentRef     <http://dbpedia.org/resource/France> .

<http://freme-project.eu/#char=11,18>
        a                     nif:Word , nif:RFC5147String , nif:Phrase , nif:String ;
        nif:anchorOf          "Germany"^^xsd:string ;
        nif:beginIndex        "11"^^xsd:int ;
        nif:endIndex          "18"^^xsd:int ;
        nif:referenceContext  <http://freme-project.eu/#char=0,38> ;
        itsrdf:taClassRef     <http://dbpedia.org/ontology/Place> , <http://nerd.eurecom.fr/ontology#Location> , <http://dbpedia.org/ontology/Country> , <http://dbpedia.org/ontology/PopulatedPlace> , <http://dbpedia.org/ontology/Location> ;
        itsrdf:taConfidence   "0.8074708147491738"^^xsd:double ;
        itsrdf:taIdentRef     <http://dbpedia.org/resource/Germany> .' "http://localhost:8080/e-link/documents?templateid=2"

With a small modification to how the exception/error is passed on, I get:

{
  "exception": "eu.freme.common.exception.BadRequestException",
  "path": "/e-link/documents",
  "message": "Could not process the enrichment result from the endpoint=http://live.dbpedia.org/sparql executing the query=PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX dbpedia-owl: <http://dbpedia.org/ontology/> PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> CONSTRUCT { ?event <http://dbpedia.org/ontology/place> <http://dbpedia.org/resource/Germany> . } WHERE { ?event <http://dbpedia.org/ontology/place> <http://dbpedia.org/resource/Germany> .  } LIMIT 10. Error message: [line: 3, col: 19] Unknown char: –(8211;0x2013)",
  "error": "Bad Request",
  "status": 400,
  "timestamp": 1462814026872
}

Calling the referenced dbpedia sparql endpoint with the query mentioned in the exception gives this result:

@prefix dbo:    <http://dbpedia.org/ontology/> .
@prefix dbr:    <http://dbpedia.org/resource/> .
dbr:Battle_of_Jena–Auerstedt  dbo:place   dbr:Germany .
dbr:Battle_of_Maxen dbo:place   dbr:Germany .
dbr:Battle_of_Meissen   dbo:place   dbr:Germany .
dbr:Battle_of_Minden    dbo:place   dbr:Germany .
dbr:Battle_of_Torgau    dbo:place   dbr:Germany .
dbr:Battle_of_Friedlingen   dbo:place   dbr:Germany .
dbr:Saxon_Wars  dbo:place   dbr:Germany .
<http://dbpedia.org/resource/Battle_of_Berlin_(RAF_campaign)>   dbo:place   dbr:Germany .
dbr:Siege_of_Weinsberg  dbo:place   dbr:Germany .
<http://dbpedia.org/resource/Battle_of_the_Heligoland_Bight_(1939)> dbo:place   dbr:Germany .

Reading the – character (U+2013) of the URI dbr:Battle_of_Jena–Auerstedt in the method com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP:execModel(Model model) throws the exception.
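To illustrate why only that one line is a problem: the Turtle 1.1 grammar restricts which characters may appear in the local part of a prefixed name, and U+2013 is not among them, while a full IRI in angle brackets may contain it. The sketch below is a simplified check based on the PN_CHARS_BASE and PN_CHARS productions; it ignores percent and backslash escapes, and the function names are hypothetical, not Jena API.

```python
# Code point ranges from the Turtle 1.1 production PN_CHARS_BASE.
_PN_CHARS_BASE = [
    (0x41, 0x5A), (0x61, 0x7A), (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]
# Additional ranges allowed by PN_CHARS (digits, middle dot, combining marks).
_PN_CHARS_EXTRA = [(0x30, 0x39), (0xB7, 0xB7), (0x300, 0x36F), (0x203F, 0x2040)]
# Single characters allowed inside a local name without escaping.
_PLAIN_OK = "_-.:"


def chars_outside_pn_chars(local_name):
    """Return the characters that may not appear unescaped in a local name."""
    def allowed(ch):
        cp = ord(ch)
        return ch in _PLAIN_OK or any(
            lo <= cp <= hi for lo, hi in _PN_CHARS_BASE + _PN_CHARS_EXTRA
        )
    return [ch for ch in local_name if not allowed(ch)]


def to_turtle_term(iri, prefix, namespace):
    """Abbreviate to prefix:local only when clearly safe, else write <iri>.

    Deliberately conservative: this is not a full implementation of PN_LOCAL.
    """
    local = iri[len(namespace):] if iri.startswith(namespace) else ""
    safe = (
        local
        and not chars_outside_pn_chars(local)
        and local[0] not in "-."
        and local[-1] != "."
    )
    return f"{prefix}:{local}" if safe else f"<{iri}>"
```

With this check, `Battle_of_Jena–Auerstedt` is flagged because of the – character and `Battle_of_Berlin_(RAF_campaign)` because of the parentheses. That matches the result above, where the parenthesised names are written as full IRIs but the – name is abbreviated with the dbr: prefix.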

jnehring commented 8 years ago

@m1ci the bug originates from e-link. Please take a look at the bug.

m1ci commented 8 years ago

The issue is in Jena 2.11.2, which treats the – character as invalid. We should consider upgrading Jena. I found the Jena dependency in FREMECommon: https://github.com/freme-project/FREMECommon/blob/master/pom.xml @jnehring Where else is Jena set as a dependency?

jnehring commented 8 years ago

I think the Jena dependency is only defined in FREMECommon. I will update it to the latest version.

m1ci commented 8 years ago

Please upgrade to Jena 2.13.0. This Jena version is supported by Linked Data Fragments.

jnehring commented 8 years ago

@ArneBinder please update FREMECommon Jena dependency to Jena 2.13.0 and check if the bug is fixed by that.

jnehring commented 8 years ago

We tried the upgrade to Jena 2.13.0 and it did not solve the problem. We cannot upgrade to higher versions of Jena because the LDF client is not compatible with higher Jena versions.

So we cannot fix this bug.

andish commented 8 years ago

See 6.4 Escape Sequences at https://www.w3.org/TR/turtle/#reserved

It seems that in Turtle, the local part of a prefixed name (as in dbr:Battle_of_Jena–Auerstedt) cannot contain the – character.

I queried that DBpedia endpoint: it returns more than 13,000 resource IRIs containing the – character. So I guess that this error will come up regularly.

So one solution would be to ask the endpoint for another RDF syntax, e.g. N-Triples (in case e-Link currently requests N3/Turtle from endpoints).
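A rough sketch of what that could look like at the HTTP level, using the endpoint and CONSTRUCT query from the error message above. It assumes the endpoint honours an Accept header of application/n-triples, which I have not verified, and `build_construct_request` is a made-up helper, not how e-Link actually issues its queries.

```python
import urllib.parse
import urllib.request

# Endpoint and query as quoted in the error message earlier in this thread.
ENDPOINT = "http://live.dbpedia.org/sparql"
QUERY = (
    "CONSTRUCT { ?event <http://dbpedia.org/ontology/place> "
    "<http://dbpedia.org/resource/Germany> . } "
    "WHERE { ?event <http://dbpedia.org/ontology/place> "
    "<http://dbpedia.org/resource/Germany> . } LIMIT 10"
)


def build_construct_request(endpoint, query, accept="application/n-triples"):
    """Build a SPARQL-protocol GET request asking for N-Triples (hypothetical helper).

    N-Triples has no prefixed names, so every IRI arrives in angle brackets.
    Assumption: the endpoint supports this media type via content negotiation.
    """
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": accept})
```

Passing `build_construct_request(ENDPOINT, QUERY)` to `urllib.request.urlopen()` would send the query. If the endpoint honours the header, the result no longer relies on prefix abbreviation, so the parser should not have to read dbr:Battle_of_Jena–Auerstedt as a prefixed name.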

jnehring commented 8 years ago

Great idea! Thanks @andish for the research and the proposal. We will discuss this at the next developers call.

borriellom commented 7 years ago

Any development on this issue?

jnehring commented 7 years ago

No new developments. I put this issue on the agenda of the next developers call.

jnehring commented 7 years ago

On today's dev call we agreed that @sandroacoelho and @m1ci will work on the issue.

m1ci commented 7 years ago

The issue is fixed, @andish @borriellom please check!

borriellom commented 7 years ago

It's fixed. Thank you!

jnehring commented 7 years ago

Fixed on freme-live through the release