dbpedia-spotlight / wikipedia-stats-extractor

Raw Wikipedia counts for entity linking

JSON-WikiPedia links issue in images and galleries #5

Open nmadhire opened 9 years ago

nmadhire commented 9 years ago

Example

In the Anarchism article, "William Godwin" is annotated 5 times, including once on an image, but the article text contains "William Godwin" only 4 times because articletext doesn't include image content. This leads to a discrepancy in the surface form counts: an annotated count of 5 against a total count of 4.

Currently, these links are included in the links element of the JSON output with span (0,0).

We should come up with an approach to actually eliminate these.
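
One possible starting point, as a rough sketch rather than a proposed fix: drop every link whose span is (0,0) before counting surface forms. The field names `start`, `end`, and `anchor`, the sample record, and the helper `drop_zero_span_links` are assumptions made for illustration; the actual JSON-WikiPedia output may name or nest the span differently.

```python
def has_zero_span(link):
    """True when the link carries the placeholder span (0, 0).

    Assumes hypothetical "start"/"end" keys; adjust to the real schema.
    """
    return link.get("start") == 0 and link.get("end") == 0


def drop_zero_span_links(article):
    """Return a copy of the article without links whose span is (0, 0)."""
    kept = [link for link in article.get("links", []) if not has_zero_span(link)]
    return {**article, "links": kept}


# Made-up record for illustration only, not real extractor output.
article = {
    "title": "Anarchism",
    "links": [
        {"anchor": "William Godwin", "start": 120, "end": 134},
        {"anchor": "William Godwin", "start": 0, "end": 0},  # e.g. from an image
    ],
}

cleaned = drop_zero_span_links(article)
```

This would also discard a genuine link that happens to sit at offset 0 with zero length, which seems unlikely for a non-empty anchor but is worth checking against real output.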