freme-project / freme-ner

Apache License 2.0

Single quote spotted as entity #167

Open x-fran opened 7 years ago

x-fran commented 7 years ago

This is the content file:
freme.txt

cURL:

curl -X POST --header 'Content-Type: text/plain' --header 'Accept: text/turtle' -d @freme.txt 'https://api.freme-project.eu/current/e-entity/freme-ner/documents?language=en&dataset=dbpedia&mode=all&nif-version=2.1' >> freme_out.txt
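
For anyone reproducing this outside the shell, here is a rough Python sketch of the same request. The endpoint, query parameters, and headers are copied from the cURL command above; the function names and the UTF-8 encoding of the body are assumptions. One difference to be aware of: `curl -d @file` strips carriage returns and newlines from the file before sending, while this sketch sends the text unchanged, so the offsets in the response may differ between the two.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the cURL command in this issue.
BASE_URL = "https://api.freme-project.eu/current/e-entity/freme-ner/documents"


def build_request(text: str) -> Request:
    """Build the POST request from the cURL command without sending it."""
    params = {
        "language": "en",
        "dataset": "dbpedia",
        "mode": "all",
        "nif-version": "2.1",
    }
    return Request(
        f"{BASE_URL}?{urlencode(params)}",
        data=text.encode("utf-8"),  # assumption: the service accepts UTF-8
        headers={"Content-Type": "text/plain", "Accept": "text/turtle"},
        method="POST",
    )


def annotate(text: str) -> str:
    """Send the text to FREME NER and return the Turtle response as a string."""
    with urlopen(build_request(text)) as response:
        return response.read().decode("utf-8")
```

Calling `annotate(open("freme.txt", encoding="utf-8").read())` would then return the Turtle output that the cURL command appends to `freme_out.txt`.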

And this is the problem:


<http://freme-project.eu/#offset_2068_2069>
        a                     nif:OffsetBasedString , nif:Phrase ;
        nif:anchorOf          "‘"^^xsd:string ;
        nif:annotationUnit    [ a                       nif:EntityOccurrence ;
                                nif:taMsClassRef        <http://dbpedia.org/ontology/Country> ;
                                itsrdf:taAnnotatorsRef  <http://freme-project.eu/tools/freme-ner> ;
                                itsrdf:taClassRef       <http://dbpedia.org/ontology/Location> , <http://dbpedia.org/ontology/Place> , <http://www.w3.org/2002/07/owl#Thing> , <http://dbpedia.org/ontology/Country> , <http://dbpedia.org/ontology/PopulatedPlace> ;
                                itsrdf:taConfidence     "0.3296539006884792"^^xsd:double ;
                                itsrdf:taIdentRef       <http://dbpedia.org/resource/United_States>

m1ci commented 7 years ago

This happens because the content is unclean. FREME NER expects "clean" content: text made up of regular sentences (to some extent).
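
As a possible workaround on the client side, the text could be normalized before it is sent. The sketch below is only one guess at what "clean" might mean here: it maps typographic quotes to ASCII, drops control characters, and collapses whitespace. The function name and the exact rules are assumptions, not something FREME NER documents, and it has not been verified that this avoids the spurious match on the quote. Note also that the offsets in the response will then refer to the cleaned text, not the original file.

```python
import re
import unicodedata

# Assumption: typographic quotes are replaced by their ASCII counterparts.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def clean_text(raw: str) -> str:
    """Return a normalized copy of raw, intended as input for an NER service."""
    text = raw.translate(QUOTE_MAP)
    # Drop control characters, but keep newlines and tabs for the next steps.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim every line and drop the ones that end up empty.
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

For example, `clean_text("\u2018Hello\u2019   world")` gives `'Hello' world`.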

m1ci commented 7 years ago

Leaving the issue open for possible further development.