freme-project / e-Entity

Apache License 2.0
FREME NER - Brackets converted to odd characters #57

Closed. borriellom closed this issue 8 years ago

borriellom commented 8 years ago

Sometimes an entity is spotted and its nif:anchorOf text contains odd characters. It only happens when there is a bracket next to the string: the "(" character is converted to "-LRB-" and the ")" character is converted to "-RRB-". The text in the context reference (nif:isString) is correct. See resource http://freme-project.eu/#char=113,126 in the example below.

HTTP request

curl -X POST --header "Content-Type: " "http://api-dev.freme-project.eu/current/e-entity/freme-ner/documents?input=They%20were%20common%20in%20parts%20of%20Ireland%20and%20the%20Scottish%20Highlands%20in%20the%2019th%20century%2C%20as%20well%20as%20in%20Somerset%20(see%20Punkie%20Night).%20&informat=text&outformat=turtle&language=en&dataset=dbpedia&mode=all"

HTTP response

@prefix dbpedia-fr: <http://fr.dbpedia.org/resource/> .
@prefix dbc:   <http://dbpedia.org/resource/Category:> .
@prefix dbpedia-es: <http://es.dbpedia.org/resource/> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix nif:   <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix dbpedia-de: <http://de.dbpedia.org/resource/> .
@prefix dbpedia-ru: <http://ru.dbpedia.org/resource/> .
@prefix freme-onto: <http://freme-project.eu/ns#> .
@prefix dbpedia-nl: <http://nl.dbpedia.org/resource/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix dbpedia-it: <http://it.dbpedia.org/resource/> .

<http://freme-project.eu/#char=45,63>
        a                     nif:Phrase , nif:String , nif:Word , nif:RFC5147String ;
        nif:anchorOf          "Scottish Highlands"^^xsd:string ;
        nif:beginIndex        "45"^^xsd:int ;
        nif:endIndex          "63"^^xsd:int ;
        nif:referenceContext  <http://freme-project.eu/#char=0,127> ;
        itsrdf:taClassRef     <http://nerd.eurecom.fr/ontology#Location> ;
        itsrdf:taConfidence   "0.7879317198596104"^^xsd:double ;
        itsrdf:taIdentRef     dbpedia:Scottish_Highlands .

<http://freme-project.eu/#char=29,36>
        a                     nif:RFC5147String , nif:Phrase , nif:String , nif:Word ;
        nif:anchorOf          "Ireland"^^xsd:string ;
        nif:beginIndex        "29"^^xsd:int ;
        nif:endIndex          "36"^^xsd:int ;
        nif:referenceContext  <http://freme-project.eu/#char=0,127> ;
        itsrdf:taClassRef     <http://nerd.eurecom.fr/ontology#Location> ;
        itsrdf:taConfidence   "0.8247855650576696"^^xsd:double ;
        itsrdf:taIdentRef     dbpedia:Ireland .

<http://freme-project.eu/#char=113,126>
        a                     nif:String , nif:Word , nif:Phrase , nif:RFC5147String ;
        nif:anchorOf          "Punkie Night -RRB-"^^xsd:string ;
        nif:beginIndex        "113"^^xsd:int ;
        nif:endIndex          "126"^^xsd:int ;
        nif:referenceContext  <http://freme-project.eu/#char=0,127> ;
        itsrdf:taClassRef     <http://www.w3.org/2002/07/owl#Thing> ;
        itsrdf:taConfidence   "0.8559797394750305"^^xsd:double .

<http://freme-project.eu/#char=0,127>
        a               nif:String , nif:Context , nif:RFC5147String ;
        nif:beginIndex  "0"^^xsd:int ;
        nif:endIndex    "127"^^xsd:int ;
        nif:isString    "They were common in parts of Ireland and the Scottish Highlands in the 19th century, as well as in Somerset (see Punkie Night)."^^xsd:string .

<http://freme-project.eu/#char=99,107>
        a                     nif:String , nif:Word , nif:RFC5147String , nif:Phrase ;
        nif:anchorOf          "Somerset"^^xsd:string ;
        nif:beginIndex        "99"^^xsd:int ;
        nif:endIndex          "107"^^xsd:int ;
        nif:referenceContext  <http://freme-project.eu/#char=0,127> ;
        itsrdf:taClassRef     <http://nerd.eurecom.fr/ontology#Location> ;
        itsrdf:taConfidence   "0.8250164350531494"^^xsd:double ;
        itsrdf:taIdentRef     dbpedia:Somerset .
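
The mismatch can be confirmed mechanically by slicing the nif:isString text with each annotation's nif:beginIndex and nif:endIndex and comparing the result with nif:anchorOf. The following is only a minimal sketch: the function name `find_mismatches` is hypothetical, the offsets and strings are copied by hand from the output above rather than parsed from the Turtle, and it assumes the offsets can be used directly as Python string indices (which holds for this ASCII-only sentence).

```python
# Context string (nif:isString) copied from the Turtle output above.
CONTEXT = (
    "They were common in parts of Ireland and the Scottish Highlands "
    "in the 19th century, as well as in Somerset (see Punkie Night)."
)

# (beginIndex, endIndex, anchorOf) triples copied from the Turtle output above.
ANNOTATIONS = [
    (29, 36, "Ireland"),
    (45, 63, "Scottish Highlands"),
    (99, 107, "Somerset"),
    (113, 126, "Punkie Night -RRB-"),
]


def find_mismatches(context, annotations):
    """Return (begin, end, anchor, expected) for each anchor that differs
    from the slice of the context selected by its offsets."""
    mismatches = []
    for begin, end, anchor in annotations:
        expected = context[begin:end]
        if anchor != expected:
            mismatches.append((begin, end, anchor, expected))
    return mismatches


for begin, end, anchor, expected in find_mismatches(CONTEXT, ANNOTATIONS):
    print(f"char={begin},{end}: anchorOf={anchor!r}, context has {expected!r}")
```

For the data above, only char=113,126 is reported: the offsets select "Punkie Night)" in the context, while anchorOf holds "Punkie Night -RRB-".
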
m1ci commented 8 years ago

Thanks for reporting. @nilesh-c, please check what's wrong.

m1ci commented 8 years ago

Fixed. Please check:

curl -X POST --header "Content-Type: " "http://api-dev.freme-project.eu/current/e-entity/freme-ner/documents?input=They%20were%20common%20in%20parts%20of%20Ireland%20and%20the%20Scottish%20Highlands%20in%20the%2019th%20century%2C%20as%20well%20as%20in%20Somerset%20(see%20Punkie%20Night).%20&informat=text&outformat=turtle&language=en&dataset=dbpedia&mode=all" -v
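
For anyone re-running the check from a script, the same request URL can be built with fully percent-encoded query parameters, so that the brackets and the comma in the input do not depend on shell quoting. This is only a sketch: the endpoint and parameter values are taken from the curl command above, the helper name `build_request_url` is hypothetical, and the code only constructs the URL without sending the request.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the curl command above.
BASE_URL = "http://api-dev.freme-project.eu/current/e-entity/freme-ner/documents"

# Input text from the curl command above, including its trailing space.
TEXT = (
    "They were common in parts of Ireland and the Scottish Highlands "
    "in the 19th century, as well as in Somerset (see Punkie Night). "
)


def build_request_url(text):
    """Build the e-Entity request URL with percent-encoded query parameters."""
    params = {
        "input": text,
        "informat": "text",
        "outformat": "turtle",
        "language": "en",
        "dataset": "dbpedia",
        "mode": "all",
    }
    # quote_via=quote encodes spaces as %20 (not "+") and brackets as %28/%29.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


print(build_request_url(TEXT))
```
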
borriellom commented 8 years ago

It's fixed. Well done! I'm closing the issue.