freme-project / technical-discussion

This repository is used for technical discussions.

Specification of e-Link #8

Closed jnehring closed 9 years ago

jnehring commented 9 years ago

This ticket specifies the e-link service.

jnehring commented 9 years ago

e-Link should give users the ability to fetch information from ontologies given an entity URL. The following approaches have been discussed:

andish commented 9 years ago

We must also ensure that new e-services can be added at any time, without depending on any technological backbone. I would prefer each enrichment e-service to be a black box with a standardized API interface, with the broker format both in and out.

Isn't the e-Link service similar to the terminology service at the conceptual level, since format conversion is now taken out of this service?

It must also (just an idea):
1) take data from the posted NIF and process it,
2) search for additional data and add links referencing external sources and datasets (it would use the SPARQL endpoint to search for links),
2.1) use predefined predicates,
2.2) use predicates passed as parameters (?),
3) return the data in NIF format.

fsasaki commented 9 years ago

To add to what Jan wrote about the abstract patterns, here is an example.

1) Abstract query pattern called "query information about persons born in a given location, after a specific year"

2) Pattern:

PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT ?name ?birth ?death ?person WHERE {
    ?person dbo:birthPlace @@@location@@@ .
    ?person dbo:birthDate ?birth .
    ?person foaf:name ?name .
    ?person dbo:deathDate ?death .
    FILTER (?birth > "@@@iso-date@@@"^^xsd:date) .
}
ORDER BY ?name

3) Parameters for the pattern:
location, e.g. http://dbpedia.org/page/Berlin
birth date: ISO date, e.g. 1900-01-01

The difference to what we discussed before is that we can pass not only predefined predicates but everything which may appear in a query.
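
To make the substitution step concrete, here is a minimal Python sketch of filling such a pattern. The function name `fill_template` and the rule that placeholder names consist of letters, digits, `_` and `-` are assumptions made for illustration; nothing in this thread fixes them, and the sketch is not the e-Link implementation.

```python
import re

# Assumed placeholder syntax: @@@name@@@, as used in the pattern above.
PLACEHOLDER = re.compile(r"@@@([A-Za-z0-9_-]+)@@@")


def fill_template(template: str, params: dict[str, str]) -> str:
    """Replace every @@@name@@@ placeholder with the matching parameter value."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing template parameter: {name}")
        return params[name]

    return PLACEHOLDER.sub(substitute, template)


template = """SELECT ?name ?birth ?person WHERE {
    ?person dbo:birthPlace @@@location@@@ .
    ?person dbo:birthDate ?birth .
    ?person foaf:name ?name .
    FILTER (?birth > "@@@iso-date@@@"^^xsd:date) .
}
ORDER BY ?name"""

query = fill_template(template, {
    "location": "<http://dbpedia.org/resource/Berlin>",
    "iso-date": "1900-01-01",
})
print(query)
```

Since parameter values are pasted into the query text, a real service would probably need to validate them (for example, check that `location` is a well-formed URI) before substitution.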

fsasaki commented 9 years ago

To move this discussion forward: I added the "submitting parameters" option to a tool that I created in the LIDER project, see http://www.w3.org/People/fsasaki/mlod4con/ and http://www.w3.org/People/fsasaki/mlod4con/#parameters . The "data wrangler" would set up a JSON configuration file that maps SPARQL queries to markup output: http://www.w3.org/People/fsasaki/mlod4con/mlod4consettings.json . I don't propose to use this tool to implement e-Link; it has several shortcomings (e.g. only one SPARQL query to one endpoint is possible, and non-linked data sources cannot be queried). But the approach of hiding the queries from the API developer or end user may be interesting for FREME.

jnehring commented 9 years ago

The idea of abstract patterns can also realise our idea of submitting a subject and a list of predicates to e-Link and getting a list of all matching objects in return. We then have to provide a pattern for each number of predicates / objects. The patterns will look like this (example of a pattern for two predicates):

SELECT ?object1 ?object2 WHERE {
    OPTIONAL{ @@@subject@@@ @@@predicate1@@@ ?object1 }.
    OPTIONAL{ @@@subject@@@ @@@predicate2@@@ ?object2 }
}

That means we have to provide e.g. 20 of these patterns, and users of FREME cannot use these patterns to retrieve more than 20 objects with one API call to e-Link.
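
As an alternative sketch (not something agreed in this thread), the fixed-size patterns could be generated on the fly for any number of predicates instead of being stored one by one. The function name `build_object_query` is hypothetical, and the subject and predicates are assumed to be plain URIs.

```python
def build_object_query(subject: str, predicates: list[str]) -> str:
    """Build one SELECT query with an OPTIONAL clause per predicate."""
    if not predicates:
        raise ValueError("at least one predicate is required")
    # One result variable per predicate: ?object1, ?object2, ...
    variables = [f"?object{i}" for i in range(1, len(predicates) + 1)]
    clauses = [
        f"    OPTIONAL {{ <{subject}> <{predicate}> {variable} }}"
        for predicate, variable in zip(predicates, variables)
    ]
    return (
        "SELECT " + " ".join(variables) + " WHERE {\n"
        + " .\n".join(clauses)
        + "\n}"
    )


query = build_object_query(
    "http://dbpedia.org/resource/Berlin",
    [
        "http://dbpedia.org/ontology/country",
        "http://dbpedia.org/ontology/populationTotal",
    ],
)
print(query)
```

With two predicates this produces the same shape as the two-predicate pattern above, so the 20-pattern limit would become a configuration choice rather than a set of stored templates.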

jnehring commented 9 years ago

In the e-Link call today we discussed the query templates. There are still some open questions, see minutes of the call.

fsasaki commented 9 years ago

hi all, could you summarize the open questions here? I am interested to see how the proposed solution handles a question like "Give me all events around a city within a distance of 100 miles" (the 100 miles part), i.e. whether it allows providing two parameters (URI for the city and distance).

jnehring commented 9 years ago

The problem with the "abstract patterns" discussed above is that they are not compatible with NIF / the outputs of e-Entity. So we discussed more complex abstract patterns like

select ?event where {
?event rdf:type <http://dbpedia.org/ontology/Event> .
?event <http://dbpedia.org/ontology/place> @@@entity of class <http://dbpedia.org/ontology/Place>@@@ .
}

So we can feed a series of NIF annotations into the template; the template engine extracts all suitable annotations, fills in the template and generates a series of SPARQL queries. That has some advantages (easy to use, compatibility) and drawbacks (rather complicated, has ambiguities).

This raises questions like how to define the variables in such templates and whether this solution is better than the simple templates.
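
A rough Python sketch of that template engine step, under stated assumptions: annotations are represented as plain dictionaries whose keys `taIdentRef` and `taClassRef` mirror the itsrdf properties, the placeholder syntax is exactly the one in the example above, and the function name `expand_template` is made up for illustration.

```python
import re

# Class-typed placeholder, e.g. @@@entity of class <http://dbpedia.org/ontology/Place>@@@
CLASS_PLACEHOLDER = re.compile(r"@@@entity of class <([^>]+)>@@@")


def expand_template(template: str, annotations: list[dict[str, str]]) -> list[str]:
    """Generate one SPARQL query per annotation whose class matches the placeholder."""
    match = CLASS_PLACEHOLDER.search(template)
    if match is None:
        raise ValueError("template has no class-typed placeholder")
    wanted_class = match.group(1)
    queries = []
    for annotation in annotations:
        # Exact class match only; subclass reasoning is deliberately left out.
        if annotation["taClassRef"] == wanted_class:
            entity = f"<{annotation['taIdentRef']}>"
            queries.append(CLASS_PLACEHOLDER.sub(lambda _m: entity, template))
    return queries


template = """select ?event where {
?event rdf:type <http://dbpedia.org/ontology/Event> .
?event <http://dbpedia.org/ontology/place> @@@entity of class <http://dbpedia.org/ontology/Place>@@@ .
}"""

annotations = [
    {"taIdentRef": "http://dbpedia.org/resource/Amsterdam",
     "taClassRef": "http://dbpedia.org/ontology/Place"},
    {"taIdentRef": "http://dbpedia.org/resource/Rembrandt",
     "taClassRef": "http://dbpedia.org/ontology/Person"},
]

queries = expand_template(template, annotations)
print(len(queries))
print(queries[0])
```

The exact-match comparison illustrates one of the ambiguities mentioned above: an entity annotated as `dbo:City` would not be picked up by a template asking for `dbo:Place` unless the engine also knows the class hierarchy.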

fsasaki commented 9 years ago

Thanks for the explanation, Jan. Maybe, to evaluate the template from the BC point of view, one could show how this would work with the 4 examples from https://docs.google.com/document/d/1cqYRuBWM0ItNIKYT9IPNCno9axHH9wKFzSWacJMxEZY/edit ? A user who does not know linked data would benefit if she could fill in the templates below. It is OK to have a more complex underlying mechanism; my question is to see examples of how that mechanism would be used for the examples below.
1) All events in … (variable: a location)
2) Who was born in … (variable: a location)
3) Who was born close to … (two variables: location, radius to specify "close")
4) Museums close to … (again two variables)
Best, Felix

m1ci commented 9 years ago

Here is a showcase with one example: the client submits a NIF document with entity annotations, and e-Link enriches the content based on a previously defined template. Assumption: a template is defined in e-Link which enriches each entity of type city with information about "museums close to a city". Template definition:

SELECT ?museum
WHERE {
    @@@location@@@ geo:geometry ?citygeo .
    ?museum rdf:type <http://schema.org/Museum> .
    ?museum geo:geometry ?museumgeo .
    FILTER (bif:st_intersects(?museumgeo, ?citygeo, 10))
} LIMIT @@@limit@@@

Input document:

@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# NIF document containing entity annotations
<http://example.org/document/1#char=0,24>
    a nif:String , nif:Context , nif:RFC5147String ;
    nif:isString "We talk about Amsterdam."^^xsd:string ;
    nif:beginIndex "0"^^xsd:nonNegativeInteger ;
    nif:endIndex "24"^^xsd:nonNegativeInteger .
# Entity mention linked with its DBpedia URI and ontology class.
<http://example.org/document/1#char=14,23>
    a nif:String , nif:RFC5147String , nif:Word , nif:Phrase ;
    nif:referenceContext <http://example.org/document/1#char=0,24> ;
    nif:anchorOf "Amsterdam" ;
    nif:beginIndex "14" ;
    nif:endIndex "23" ;
    itsrdf:taIdentRef  <http://dbpedia.org/resource/Amsterdam> ;
    itsrdf:taClassRef  <http://dbpedia.org/ontology/City> .

The client submits the document to the e-Link service with the following request:

POST /e-link/?template-id=museums_around_city&limit=5

e-Link accepts the document, enriches each entity of type city using the pre-defined template, and returns the enriched content (in NIF).

@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# NIF document containing entity annotations
<http://example.org/document/1#char=0,24>
    a nif:String , nif:Context , nif:RFC5147String ;
    nif:isString "We talk about Amsterdam."^^xsd:string ;
    nif:beginIndex "0"^^xsd:nonNegativeInteger ;
    nif:endIndex "24"^^xsd:nonNegativeInteger .
# Entity mention linked with its DBpedia URI and ontology class.
<http://example.org/document/1#char=14,23>
    a nif:String , nif:RFC5147String , nif:Word , nif:Phrase ;
    nif:referenceContext <http://example.org/document/1#char=0,24> ;
    nif:anchorOf "Amsterdam" ;
    nif:beginIndex "14" ;
    nif:endIndex "23" ;
    itsrdf:taIdentRef  <http://dbpedia.org/resource/Amsterdam> ;
    itsrdf:taClassRef  <http://dbpedia.org/ontology/City> .
# Links added by e-Link
<http://dbpedia.org/resource/Hash,_Marihuana_&_Hemp_Museum> foaf:based_near <http://dbpedia.org/resource/Amsterdam> .
<http://dbpedia.org/resource/Rijksmuseum> foaf:based_near <http://dbpedia.org/resource/Amsterdam> .
<http://dbpedia.org/resource/Cobra_Museum> foaf:based_near <http://dbpedia.org/resource/Amsterdam> .
<http://dbpedia.org/resource/Van_Gogh_Museum> foaf:based_near <http://dbpedia.org/resource/Amsterdam> .
<http://dbpedia.org/resource/NEMO_(museum)> foaf:based_near <http://dbpedia.org/resource/Amsterdam> .

fsasaki commented 9 years ago

Looks good. In the design of e-Entity, is it possible to document the templates and to use a JSON blob like the ones below when invoking e-Entity? E.g. in this way:

query description:
{
  "template-name" : "museum-query",
  "template-description" : "query information related to museums",
  "query-parameters" : [
    {
      "location" : "URI identifying the museum for which we want information",
      "limit" : "limit of results, given as integer value"
    }
  ]
}

parameters being used:
{
  "template-name" : "museum-query",
  "query-parameters" : [
    {
      "location" : "http://dbpedia.org/resource/Leipzig",
      "limit" : "20"
    }
  ]
}
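
To illustrate how the two blobs could work together, here is a small Python sketch that checks a parameter blob against the documented description. The function name `check_parameters` and the idea of returning a list of problems are assumptions for illustration only; the JSON keys are taken from the blobs above.

```python
import json

description = json.loads("""
{
  "template-name": "museum-query",
  "template-description": "query information related to museums",
  "query-parameters": [
    {"location": "URI identifying the museum for which we want information",
     "limit": "limit of results, given as integer value"}
  ]
}
""")

request = json.loads("""
{
  "template-name": "museum-query",
  "query-parameters": [
    {"location": "http://dbpedia.org/resource/Leipzig", "limit": "20"}
  ]
}
""")


def check_parameters(description: dict, request: dict) -> list[str]:
    """Return a list of problems; an empty list means the request fits the description."""
    problems = []
    if description["template-name"] != request["template-name"]:
        problems.append("template name mismatch")
    expected = set(description["query-parameters"][0])
    supplied = set(request["query-parameters"][0])
    problems += [f"missing parameter: {name}" for name in sorted(expected - supplied)]
    problems += [f"unknown parameter: {name}" for name in sorted(supplied - expected)]
    return problems


print(check_parameters(description, request))
```

The same description blob could double as human-readable documentation of the template, which is the point of the question above.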

m1ci commented 9 years ago

You mean "e-Link"? Yes, it is possible that we also submit the request as part of the body. However, having this information as part of the query parameters is more developer-friendly: developers can easily test the service by including this information directly in the URL. Let's see ...

Regarding the second part - getting information only about one entity - a more RESTful way to implement this is as follows:

POST /e-link/entities/{entity-id}/?template-id={template-id}&limit={limit}

Concrete request:

POST /e-link/entities/http://dbpedia.org/resource/Leipzig/?limit=10&template-id=museum-query

fsasaki commented 9 years ago

Ah, yes - I meant e-Link, sorry. And agree about the query parameter, good point.
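
One practical detail for the entity-specific route proposed above: an entity URI placed inside a path segment would normally need to be percent-encoded, otherwise its own slashes and colon are read as part of the path. A minimal Python sketch, assuming the route layout from the previous comment and a hypothetical helper name `entity_request_path`:

```python
from urllib.parse import quote, urlencode


def entity_request_path(entity_uri: str, template_id: str, limit: int) -> str:
    """Build the request path with the entity URI percent-encoded as one segment."""
    # safe="" forces "/" and ":" inside the entity URI to be encoded as well.
    encoded_entity = quote(entity_uri, safe="")
    query = urlencode({"template-id": template_id, "limit": limit})
    return f"/e-link/entities/{encoded_entity}/?{query}"


path = entity_request_path("http://dbpedia.org/resource/Leipzig", "museum-query", 10)
print(path)
# /e-link/entities/http%3A%2F%2Fdbpedia.org%2Fresource%2FLeipzig/?template-id=museum-query&limit=10
```

Some servers and proxies reject or rewrite encoded slashes in paths, so passing the entity URI as a query parameter may turn out to be the more robust option; that would need to be checked against the actual deployment.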

fsasaki commented 9 years ago

See http://zbw.eu/labs/en/blog/publishing-sparql-queries-live for a similar approach to e-Link.

philinthecloud commented 9 years ago

How would the NIF document with entities enriched look in json-ld?

m1ci commented 9 years ago

In Turtle:

@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://example.org/document/1#char=0,24>
    a nif:String , nif:Context , nif:RFC5147String ;
    nif:isString "We talk about Amsterdam.";
    nif:beginIndex "0";
    nif:endIndex "24".

<http://example.org/document/1#char=14,23>
    a nif:String , nif:RFC5147String , nif:Word, nif:Phrase ;
    nif:referenceContext <http://example.org/document/1#char=0,24> ;
    nif:anchorOf "Amsterdam";
    nif:beginIndex "14" ;
    nif:endIndex "23" ;
    itsrdf:taIdentRef  <http://dbpedia.org/resource/Amsterdam> ;
    itsrdf:taClassRef  <http://dbpedia.org/ontology/City> .
<http://dbpedia.org/resource/Hash,_Marihuana_&_Hemp_Museum> foaf:based_near <http://dbpedia.org/resource/Amsterdam> .
<http://dbpedia.org/resource/Rijksmuseum> foaf:based_near <http://dbpedia.org/resource/Amsterdam> .
<http://dbpedia.org/resource/Cobra_Museum> foaf:based_near <http://dbpedia.org/resource/Amsterdam> .
<http://dbpedia.org/resource/Van_Gogh_Museum> foaf:based_near <http://dbpedia.org/resource/Amsterdam> .
<http://dbpedia.org/resource/NEMO_(museum)> foaf:based_near <http://dbpedia.org/resource/Amsterdam> .

In JSON-LD:

{
  "@context": {
    "dbpedia": "http://dbpedia.org/resource/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "xsd": "http://www.w3.org/2001/XMLSchema#"
  },
  "@graph": [
    {
      "@id": "dbpedia:NEMO_(museum)",
      "foaf:based_near": {
        "@id": "dbpedia:Amsterdam"
      }
    },
    {
      "@id": "dbpedia:Cobra_Museum",
      "foaf:based_near": {
        "@id": "dbpedia:Amsterdam"
      }
    },
    {
      "@id": "dbpedia:Rijksmuseum",
      "foaf:based_near": {
        "@id": "dbpedia:Amsterdam"
      }
    },
    {
      "@id": "dbpedia:Hash,_Marihuana_&_Hemp_Museum",
      "foaf:based_near": {
        "@id": "dbpedia:Amsterdam"
      }
    },
    {
      "@id": "http://example.org/document/1#char=0,24",
      "@type": [
        "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#String",
        "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#RFC5147String",
        "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#Context"
      ],
      "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#beginIndex": "0",
      "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#endIndex": "24",
      "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#isString": "We talk about Amsterdam."
    },
    {
      "@id": "dbpedia:Van_Gogh_Museum",
      "foaf:based_near": {
        "@id": "dbpedia:Amsterdam"
      }
    },
    {
      "@id": "http://example.org/document/1#char=14,23",
      "@type": [
        "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#Word",
        "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#Phrase",
        "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#RFC5147String",
        "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#String"
      ],
      "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#anchorOf": "Amsterdam",
      "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#beginIndex": "14",
      "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#endIndex": "23",
      "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#referenceContext": {
        "@id": "http://example.org/document/1#char=0,24"
      },
      "http://www.w3.org/2005/11/its/rdf#taClassRef": {
        "@id": "http://dbpedia.org/ontology/City"
      },
      "http://www.w3.org/2005/11/its/rdf#taIdentRef": {
        "@id": "dbpedia:Amsterdam"
      }
    }
  ]
}

pheyvaer commented 9 years ago

Regarding the templates, we use The DataTank. One of the features of the TDT is that you can connect your sparql endpoint to the TDT and query that endpoint using templates you define in TDT. Might be something to look into.

fsasaki commented 9 years ago

In the first prototype e-Link will just use SPARQL endpoints to fetch data. This does not resolve the usual linked data problem of endpoint availability. When would e-Link implement linked data fragments or a different approach that helps to tackle the issue? Asking now since, being here at a W3C meeting, I get a lot of feedback on the need to tackle this issue from people outside FREME.

fsasaki commented 9 years ago

A question on templates in e-Link: https://github.com/freme-project/e-Services/blob/master/e-link/src/main/java/eu/freme/eservices/elink/Template.java defines the Template class, and https://github.com/freme-project/e-Services/blob/master/e-link/src/main/java/eu/freme/eservices/elink/DataEnricher.java uses the class. Will it be possible to store and import templates, e.g. as JSON blobs? Of course the client invoking e-Link can use a JSON representation to populate the instance of Template. Maybe it would be nice to allow importing such a JSON representation directly, via a method in DataEnricher?

m1ci commented 9 years ago

Will it be possible to store and import templates, e.g. as JSON blobs? Maybe it would be nice to allow importing such a JSON representation directly, via a method in DataEnricher?

Of course! This is something I had in mind. Will add this feature on the TODO list.

When would e-Link implement linked data fragments or a different approach that helps to tackle the issue?

Well, the "Linked data fragments" (LDF) technology also offers a SPARQL endpoint. Currently, there is a stable client for JavaScript and a less stable one for Java. We will indeed integrate LDF in e-Link. Thanks for reminding us!

pheyvaer commented 9 years ago

FYI, to set up LDF you need

fsasaki commented 9 years ago

@pheyvaer , you pointed to the server & client for nodejs. In your view should linked data fragments be integrated in e-Link relying on nodejs or relying on the java version? Is the java version stable enough for e-Link? (See @m1ci comment on stability of nodejs vs. java implementation)

pheyvaer commented 9 years ago

The Java version is stable; however, at the moment the only input supported is HDT and the output is Turtle.

Whether to choose Node.js or Java depends on the implementation. If you want to have as much Java code as possible, then you could go for the Java version. However, in the end the user can choose: they just need an LDF server, and whether they use the Java version or Node.js doesn't matter. The LDF client can be in any language they want, so client and server don't need to be in the same language. However, you could maybe provide/suggest the Java version by default.

fsasaki commented 9 years ago

Thanks, @pheyvaer. The user of e-Link probably does not care whether data access is done via Java or JavaScript, a pure SPARQL endpoint or linked data fragments, etc. So any solution which hides these settings from the user would help. Probably one would need a settings file in e-Link that allows setting up the data sources and access approaches in such a way that the user does not have to deal with them.
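
A hedged sketch of what such a settings file and its lookup could look like, written in Python for brevity. The layout (`datasets`, `templates`, `access`, `url`), the dataset names and the function name `resolve_data_source` are all hypothetical; the LDF URL is a placeholder, not a real server.

```python
import json

# Hypothetical settings: which dataset a template uses and how that dataset is accessed.
SETTINGS = json.loads("""
{
  "datasets": {
    "dbpedia": {"access": "sparql", "url": "http://dbpedia.org/sparql"},
    "dbpedia-ldf": {"access": "ldf", "url": "http://example.org/ldf/dbpedia"}
  },
  "templates": {
    "museums_around_city": "dbpedia"
  }
}
""")


def resolve_data_source(template_id: str, settings: dict) -> tuple[str, str]:
    """Return (access approach, URL) for a template, so callers never choose it themselves."""
    dataset_name = settings["templates"][template_id]
    dataset = settings["datasets"][dataset_name]
    return dataset["access"], dataset["url"]


print(resolve_data_source("museums_around_city", SETTINGS))
```

Switching a template from the SPARQL endpoint to a linked data fragments server would then only mean changing `"museums_around_city": "dbpedia"` to `"dbpedia-ldf"` in the settings, with no change to the request the user sends.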

m1ci commented 9 years ago

specification for the 2nd prototype moved to #25