Closed jluisred closed 10 years ago
The bug is solved - no more duplicated entities in the response. Test request:
curl --insecure -X POST --data-binary @BERLIN-2013-04-04-21-45-53-04042145_Chapter.txt "https://entityclassifier.eu/thd/api/v2/extraction?lang=de&entity_type=all&provenance=thd,dbpedia&types_filter=dbo&knowledge_base=linkedHypernymsDataset&priority_entity_linking=true&apikey=xxxxxxxxxxxxxxxx&format=json"
Executing the following call against THD v.2:
curl --insecure -X POST --data-binary @BERLIN-2013-04-04-21-45-53-04042145_Chapter.txt "https://entityclassifier.eu/thd/api/v2/extraction?lang=de&entity_type=all&provenance=thd,dbpedia&types_filter=dbo&knowledge_base=linkedHypernymsDataset&priority_entity_linking=true&apikey=**************************&format=json"
The returned JSON contains many duplicated entities, like for example the first item in the response:
[ { "startOffset":568, "endOffset":573, "underlyingString":"Polen", "entityType":"named entity", "types":[] }, { "startOffset":568, "endOffset":573, "underlyingString":"Polen", "entityType":"named entity", "types":[] }, ...
The file posted is available here: https://dl.dropboxusercontent.com/u/4909358/BERLIN-2013-04-04-21-45-53-04042145_Chapter.txt