This repository contains the Named Entity Disambiguation tool based on DBpedia Spotlight. Provided that a DBpedia Spotlight REST server is running, the EHU-ned module takes KAF as input (containing <entities> elements) and performs Named Entity Disambiguation for your language of choice. Developed by the IXA NLP Group (ixa.si.ehu.es).
ixa-pipe-ned raises an exception if the input file contains an 'emoticon' XML character. Strangely enough, the error reports a different character (�) than the one in the input (😣).
$ java -jar $MDIR/ixa-pipe-ned/target/ixa-pipe-ned-1.1.6.jar -p 2060 < /tmp/test.naf > /tmp/test2.naf
INFO 2016-07-17 15:43:31,131 main [DBpediaSpotlightClient] - Querying API.
[Fatal Error] :2:287: Character reference "�" is an invalid XML character.
Disambiguation failed:
java.lang.NullPointerException
at ixa.pipe.ned.Annotate.disambiguate2KAF(Annotate.java:295)
at ixa.pipe.ned.Annotate.XMLSpot2KAF(Annotate.java:282)
at ixa.pipe.ned.Annotate.disambiguateNEsToKAF(Annotate.java:70)
at ixa.pipe.ned.CLI.parseCLI(CLI.java:97)
at ixa.pipe.ned.CLI.main(CLI.java:28)
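One possible explanation for the mismatched character, which the trace above does not confirm: Java strings are UTF-16, so 😣 (U+1F623) is stored as two surrogate code units. If some component writes each unit as its own numeric character reference, the XML parser sees a lone surrogate, which XML 1.0 does not allow as a character. The sketch below only demonstrates those two facts; the class and method names are illustrative and not part of ixa-pipe-ned.

```java
final class SurrogateDemo {

    // Returns the UTF-16 code units of a string as space-separated hex.
    static String utf16Units(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // True if the code point matches the XML 1.0 "Char" production.
    static boolean isValidXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    public static void main(String[] args) {
        String emoji = new String(Character.toChars(0x1F623)); // 😣
        System.out.println(utf16Units(emoji));          // the two surrogates
        System.out.println(isValidXml10Char(0x1F623));  // whole code point
        System.out.println(isValidXml10Char(0xD83D));   // high surrogate alone
    }
}
```

The full code point is a legal XML character, while either surrogate on its own is not, which would be consistent with the "invalid XML character" message.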
Input file: https://gist.github.com/vanatteveldt/11a99358916711a9afa62132d7db5e85. Manually replacing the smileys with "X" works around the issue.
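The manual workaround could be automated with a small pre-filter run on the NAF file before it is piped into ixa-pipe-ned. The following is a minimal sketch under the assumption that the problematic characters are exactly those outside the Basic Multilingual Plane (above U+FFFF, which covers most emoji); `SupplementaryCharFilter` and `replaceSupplementary` are hypothetical names, and `readAllBytes` requires Java 9 or later.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class SupplementaryCharFilter {

    // Replaces every code point above U+FFFF with the given placeholder
    // and leaves all other characters untouched.
    static String replaceSupplementary(String text, String placeholder) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            if (Character.isSupplementaryCodePoint(cp)) {
                out.append(placeholder);
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // 2 for a surrogate pair, else 1
        }
        return out.toString();
    }

    // Reads UTF-8 from stdin, filters it, and writes UTF-8 to stdout.
    public static void main(String[] args) throws IOException {
        String input = new String(System.in.readAllBytes(), StandardCharsets.UTF_8);
        byte[] output = replaceSupplementary(input, "X").getBytes(StandardCharsets.UTF_8);
        System.out.write(output, 0, output.length);
        System.out.flush();
    }
}
```

Note that replacing one emoji (two UTF-16 units) with a single "X" changes the text length, so any character offsets stored in the NAF file may no longer line up with the filtered text.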