fusepoolP3 / p3-dictionary-matcher-transformer

Dictionary Matcher is a P3 transformer for SKOS-based entity extraction.
Apache License 2.0

URIs in results do not match their dereferentiation #6

Open retog opened 9 years ago

retog commented 9 years ago

For example, with the query

$ curl -X POST -d "Frauds and Swindlings cause significant concerns with regards to Ethics." "http://sandbox.fusepool.info:8301/?taxonomy=http://data.nytimes.com/descriptors.rdf"

I get (excerpt):

<http://sandbox.fusepool.info:8301/ae85b2ab-4347-42a5-9758-edcb547dd635#annotation-body1>
      a       <http://vocab.fusepool.info/fam#LinkedEntity> ;
      <http://vocab.fusepool.info/fam#extracted-from>
              <http://sandbox.fusepool.info:8301/ae85b2ab-4347-42a5-9758-edcb547dd635> ;
      <http://vocab.fusepool.info/fam#selector>
              <http://sandbox.fusepool.info:8301/ae85b2ab-4347-42a5-9758-edcb547dd635#char=65,71> .

Dereferencing http://sandbox.fusepool.info:8301/ae85b2ab-4347-42a5-9758-edcb547dd635 does not return the text from which the annotations were extracted. Clearly it is not the transformer's job to store the posted text, but the transformer shouldn't make up HTTP URIs that dereference to something quite unrelated. Since no dereferenceable URI can be provided, the transformer should use blank nodes or, alternatively, non-dereferenceable URNs. The blank node approach is better, as it means that annotating the same text with the same taxonomy generates the same result (i.e. an isomorphic graph).
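To make the URN alternative concrete, here is a minimal sketch in Python. It is not the transformer's code. It assumes that a name-based UUID (version 5) derived from the posted text and the taxonomy URI would be an acceptable non-dereferenceable identifier. The names `NAMESPACE`, `content_urn` and `char_selector` are hypothetical and exist only for this illustration. Appending the `#char=65,71` fragment to a URN is also an assumption of the sketch, not something the transformer is known to do.

```python
import uuid

# Hypothetical fixed namespace for this sketch; any constant UUID would do.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "http://vocab.fusepool.info/fam#")


def content_urn(text: str, taxonomy: str) -> str:
    """Derive a stable urn:uuid from the posted text and the taxonomy URI.

    The same (text, taxonomy) pair always yields the same URN, and the
    result is not an HTTP URI, so nothing unrelated can be dereferenced.
    """
    key = taxonomy + "\n" + text
    return uuid.uuid5(NAMESPACE, key).urn


def char_selector(base: str, start: int, end: int) -> str:
    """Append a character-range fragment to a base identifier.

    Assumption: the fragment keeps the same form as in the HTTP URI case.
    """
    return f"{base}#char={start},{end}"


text = ("Frauds and Swindlings cause significant concerns "
        "with regards to Ethics.")
taxonomy = "http://data.nytimes.com/descriptors.rdf"

source = content_urn(text, taxonomy)
print(source)
print(char_selector(source, 65, 71))
print(text[65:71])  # the span the selector in the excerpt points at
```

Like blank nodes, identifiers derived this way would make repeated runs over the same text and taxonomy produce the same graph. Unlike blank nodes, they stay stable across merges with other graphs, which may or may not be desirable here.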

retog commented 9 years ago

@westei do you know how to use the #char fragment when using URNs?