anuzzolese / oke-challenge


Way how to query the tools with Gerbil #22

Open jplu opened 9 years ago

jplu commented 9 years ago

Hi,

We were wondering how the participating tools will be queried through Gerbil. Will Gerbil send one query containing a batch of sentences and wait for a single result, or will it send one query per sentence and wait for one result per sentence?

Thanks.

MichaelRoeder commented 9 years ago

Hi Julien,

currently, GERBIL is programmed to send a single sentence per request. It then waits for the annotator's response before sending the next sentence.

Cheers, Michael

jplu commented 9 years ago

Thanks Michael.

rtroncy commented 9 years ago

@MichaelRoeder, we are finalizing the API of our system for the OKE challenge, and we are still a bit confused regarding how each system will receive the data to process so that GERBIL can compute our performance. It would tremendously help if you could provide an example!

The OKE organizers will most likely have a single NIF file containing the test dataset. What will happen next? Gerbil will parse this NIF dataset, split it into sentences, and call our API with some parameters. We need to tell you how to call the system, and you need to let us know precisely what you will send so that we know what to parse (and what to return!). @jplu, can you please re-open this issue?

anuzzolese commented 9 years ago

Hi all, @rtroncy,

you will receive an email ASAP about how to configure the REST API in order to make your system accessible via GERBIL.

I am going to report here the main points that @MichaelRoeder described about the integration of the annotators in GERBIL. @MichaelRoeder, please correct me if I am wrong.

Basically, we are configuring two experiments in GERBIL for evaluating the annotators. GERBIL interacts with the annotators through a REST API. It sends a NIF-compliant Turtle document containing one or more sentences to be annotated and expects back a NIF-compliant Turtle document containing the annotations produced by the annotator.

The following is a cURL example of a request that GERBIL performs to a single annotator.

curl -H "Content-Type:application/x-turtle" -H "Accept:application/x-turtle" -d "Here Goes The NIF-compliant turtle" URI_OF_THE_ANNOTATOR
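For anyone who prefers to exercise an annotator from a script, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not GERBIL's own client code: the function names `build_annotation_request` and `annotate`, the timeout, and the UTF-8 encoding of the body are assumptions made for illustration. Only the POST method and the two `application/x-turtle` headers come from the cURL example above.

```python
import urllib.request

# Media type taken from the cURL example above.
TURTLE = "application/x-turtle"


def build_annotation_request(annotator_url: str, nif_turtle: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one NIF Turtle document.

    Hypothetical helper: the name and the UTF-8 encoding are assumptions.
    """
    return urllib.request.Request(
        annotator_url,
        data=nif_turtle.encode("utf-8"),
        headers={"Content-Type": TURTLE, "Accept": TURTLE},
        method="POST",
    )


def annotate(annotator_url: str, nif_turtle: str, timeout: float = 30.0) -> str:
    """Send the document and return the annotator's Turtle response as text."""
    request = build_annotation_request(annotator_url, nif_turtle)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `annotate("URI_OF_THE_ANNOTATOR", turtle_text)` would then play the role of the cURL command, with the placeholder replaced by the real endpoint.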

rtroncy commented 9 years ago

Thanks @anuzzolese, this helps. You wrote that GERBIL will send a NIF-compliant Turtle document containing one or more sentences, while @MichaelRoeder wrote in this comment that it will be a single sentence. I assume the latter is correct.

MichaelRoeder commented 9 years ago

Hi all,

Andrea already described the main points. I would just like to add an example and the URL of a GERBIL instance that you can use for testing. If you have an annotator for task 1, you will receive POST requests, each containing a single NIF document as it is defined in the dataset. As most (or nearly all?) documents contain only a single sentence, a request will contain something like this:

<http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=0,146>
        a                     nif:RFC5147String , nif:String , nif:Context ;
        nif:beginIndex        "0"^^xsd:nonNegativeInteger ;
        nif:endIndex          "146"^^xsd:nonNegativeInteger ;
        nif:isString          "Florence May Harding studied at a school in Sydney, and with Douglas Robert Dundas , but in effect had no formal training in either botany or art."@en .

(I removed the prefix definitions to keep the example short.)

GERBIL expects the response of the annotator to contain a single NIF document together with the annotations:

<http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=0,146>
        a                     nif:RFC5147String , nif:String , nif:Context ;
        nif:beginIndex        "0"^^xsd:nonNegativeInteger ;
        nif:endIndex          "146"^^xsd:nonNegativeInteger ;
        nif:isString          "Florence May Harding studied at a school in Sydney, and with Douglas Robert Dundas , but in effect had no formal training in either botany or art."@en .

oke:Florence_May_Harding
     a                        owl:Individual, dul:Person ;
     rdfs:label               "Florence May Harding"@en ;
     owl:sameAs               dbpedia:Florence_May_Harding .

<http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=0,20>
        a                     nif:RFC5147String , nif:String ;
        nif:anchorOf          "Florence May Harding"@en ;
        nif:beginIndex        "0"^^xsd:nonNegativeInteger ;
        nif:endIndex          "20"^^xsd:nonNegativeInteger ;
        nif:referenceContext  <http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=0,146> ;
        itsrdf:taIdentRef     oke:Florence_May_Harding .

oke:School_1
     a                        owl:Individual, dul:Organization ;
     rdfs:label               "a school"@en .

<http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=34,40>
        a                     nif:RFC5147String , nif:String ;
        nif:anchorOf          "school"@en ;
        nif:beginIndex        "34"^^xsd:nonNegativeInteger ;
        nif:endIndex          "40"^^xsd:nonNegativeInteger ;
        nif:referenceContext  <http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=0,146> ;
        itsrdf:taIdentRef     oke:National_Art_School .

oke:Sydney
     a                        owl:Individual, d0:Location ;
     rdfs:label               "Sydney"@en ;
     owl:sameAs               dbpedia:Sydney .

<http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=44,50>
        a                     nif:RFC5147String , nif:String ;
        nif:anchorOf          "Sydney"@en ;
        nif:beginIndex        "44"^^xsd:nonNegativeInteger ;
        nif:endIndex          "50"^^xsd:nonNegativeInteger ;
        nif:referenceContext  <http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=0,146> ;
        itsrdf:taIdentRef     oke:Sydney .

oke:Douglas_Robert_Dundas
     a                        owl:Individual, dul:Person ;
     rdfs:label               "Douglas Robert Dundas"@en .

<http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=61,82>
        a                     nif:RFC5147String , nif:String ;
        nif:anchorOf          "Douglas Robert Dundas"@en ;
        nif:beginIndex        "61"^^xsd:nonNegativeInteger ;
        nif:endIndex          "82"^^xsd:nonNegativeInteger ;
        nif:referenceContext  <http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=0,146> ;
        itsrdf:taIdentRef     oke:Douglas_Robert_Dundas .
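The character offsets in the response above follow directly from the sentence text: each mention's URI and its nif:beginIndex / nif:endIndex are the zero-based start and exclusive end of the mention in the nif:isString value. The following Python sketch shows one way to derive such a block. It is illustrative only: the helper name `mention_annotation`, the hard-coded `@en` language tag, and the use of Python string indices (which count code points) are assumptions, and prefix definitions are omitted as in the example above.

```python
SENTENCE = (
    "Florence May Harding studied at a school in Sydney, and with "
    "Douglas Robert Dundas , but in effect had no formal training "
    "in either botany or art."
)
BASE = "http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1"


def mention_annotation(base_uri: str, text: str, mention: str, entity: str,
                       search_from: int = 0) -> str:
    """Return a Turtle block for the first occurrence of `mention` in `text`.

    Hypothetical helper. Raises ValueError if the mention does not occur.
    """
    begin = text.index(mention, search_from)   # zero-based start offset
    end = begin + len(mention)                 # exclusive end offset
    # Escape characters that would break a Turtle string literal.
    literal = mention.replace("\\", "\\\\").replace('"', '\\"')
    context = f"<{base_uri}#char=0,{len(text)}>"
    return (
        f"<{base_uri}#char={begin},{end}>\n"
        f"        a                     nif:RFC5147String , nif:String ;\n"
        f'        nif:anchorOf          "{literal}"@en ;\n'
        f'        nif:beginIndex        "{begin}"^^xsd:nonNegativeInteger ;\n'
        f'        nif:endIndex          "{end}"^^xsd:nonNegativeInteger ;\n'
        f"        nif:referenceContext  {context} ;\n"
        f"        itsrdf:taIdentRef     {entity} .\n"
    )


print(mention_annotation(BASE, SENTENCE, "Sydney", "oke:Sydney"))
```

For the example sentence this yields the `#char=44,50` resource for "Sydney" and references the `#char=0,146` context, matching the offsets shown above.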

I set up a GERBIL instance on one of our servers, which you can use to test the communication between GERBIL and your annotator: http://139.18.2.164:1235/gerbil Simply open the configuration page, choose the task, skip the "Annotator" drop-down menu and enter a name and the URL of your annotator in the two fields below. Press the "Add another annotator" button and wait while GERBIL tests the connection to your annotator by sending a simple request. After that, you can choose one of the OKE example or GS sample datasets and start an experiment.

Note that this GERBIL instance does not contain the evaluation datasets, even if you can choose them in the front end ;-) Choosing them simply leads to an error since I don't have them.
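Before plugging in a real system, it can help to point the test instance at a stub endpoint that only checks the plumbing. Below is a minimal sketch of such a stub using Python's standard `http.server`. Everything in it is an assumption for illustration: the names `annotate_document`, `AnnotatorHandler` and `serve`, the port, and in particular the idea of echoing the received document back unchanged (that is, with zero annotations), which has not been checked against GERBIL's connection test.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

TURTLE = "application/x-turtle"


def annotate_document(nif_turtle: str) -> str:
    """Placeholder annotator: returns the document unchanged (no annotations).

    A real system would append its annotation resources to the document here.
    """
    return nif_turtle


class AnnotatorHandler(BaseHTTPRequestHandler):
    """Accepts one NIF Turtle document per POST and answers with Turtle."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        request_body = self.rfile.read(length).decode("utf-8")
        response_body = annotate_document(request_body).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", TURTLE)
        self.send_header("Content-Length", str(len(response_body)))
        self.end_headers()
        self.wfile.write(response_body)


def serve(host: str = "0.0.0.0", port: int = 8080) -> None:
    """Run the stub until interrupted; host and port are arbitrary choices."""
    HTTPServer((host, port), AnnotatorHandler).serve_forever()
```

Running `serve()` and entering the machine's public URL in the configuration page would let the connection test reach the stub, provided the host is reachable from the GERBIL server.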

Please let me know if you have problems or further questions.

Cheers, Michael

anuzzolese commented 9 years ago

@MichaelRoeder can you clarify what @rtroncy pointed out, i.e.,

> Thanks @anuzzolese, this helps. You wrote that GERBIL will send a NIF-compliant Turtle document containing one or more sentences, while @MichaelRoeder wrote in this comment that it will be a single sentence. I assume the latter is correct.

This is very important for the description of the evaluation I am going to circulate.

Thanks in advance.

MichaelRoeder commented 9 years ago

Hi all,

GERBIL sends exactly one single document per request.

Unfortunately, NIF does not define a class "document". Thus, we decided that for GERBIL a document is every resource that is an instance of nif:Context and carries its text as a property (nif:isString in the example above).

In practice, that means that the example dataset for task 1 contains 2 documents - http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-1#char=0,146 and http://www.ontologydesignpatterns.org/data/oke-challenge/task-1/sentence-2#char=0,192.

Does this answer the question, or did I misunderstand it?

anuzzolese commented 9 years ago

OK, this means that if I have 20 sentences then GERBIL sends 20 requests to each system to evaluate. Is this correct?

MichaelRoeder commented 9 years ago

What I tried to say is that it simply depends on the structure of your RDF data. If you define every sentence as its own nif:Context, as has been done in the example datasets, GERBIL will send 20 requests. If you have only one single context containing all 20 sentences, GERBIL will send only one request.
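The rule can be illustrated with a small Python sketch that counts how many requests a dataset would produce. This is not GERBIL's implementation: the dataset is modelled as plain (subject, predicate, object) string triples instead of parsed RDF, the function name `documents_to_send` is hypothetical, and the text property is assumed to be nif:isString as in the example documents.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def documents_to_send(triples: Iterable[Triple]) -> List[str]:
    """Return the subjects that count as documents, in first-seen order.

    A subject counts as a document if it is typed nif:Context and has a
    nif:isString value. One request would be sent per returned subject.
    """
    seen: Dict[str, Set[str]] = {}
    for subject, predicate, obj in triples:
        flags = seen.setdefault(subject, set())
        if predicate == "rdf:type" and obj == "nif:Context":
            flags.add("context")
        elif predicate == "nif:isString":
            flags.add("text")
    return [s for s, flags in seen.items() if flags == {"context", "text"}]


# 20 sentences, each defined as its own context: 20 requests.
per_sentence = []
for i in range(1, 21):
    uri = f"sentence-{i}"
    per_sentence.append((uri, "rdf:type", "nif:Context"))
    per_sentence.append((uri, "nif:isString", f"Text of sentence {i}."))

# One context holding all 20 sentences: 1 request.
single_context = [
    ("all-sentences", "rdf:type", "nif:Context"),
    ("all-sentences", "nif:isString", " ".join(f"Text of sentence {i}." for i in range(1, 21))),
]

print(len(documents_to_send(per_sentence)), len(documents_to_send(single_context)))
```

Annotation resources typed only as nif:String (such as the `#char=44,50` mention in the example response) are not counted, because they are not contexts.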

anuzzolese commented 9 years ago

Perfectly clear.

rtroncy commented 9 years ago

Yes, indeed, clear ... for you @anuzzolese :-) Since you're the only one who has seen the test dataset and how the RDF data has been packaged, can you confirm that it will be shaped like the training dataset and that, consequently, we will receive as many requests from GERBIL as there are sentences to process?

anuzzolese commented 9 years ago

> Yes, indeed, clear ... for you @anuzzolese

:-) The evaluation dataset is shaped like the training dataset in terms of the nature and the number of sentences. Your system will receive one request per sentence because we use a nif:Context to identify each sentence in the dataset.

Is this an issue for you @rtroncy ?

rtroncy commented 9 years ago

No problem then, we just had to know what to expect. Now, this is clear (hopefully for all)