dbpedia / extraction-framework

The software used to extract structured data from Wikipedia

questionable subject IRI generation for sr chapter (no usage of default local name for category prefix) #573

Open JJ-Author opened 5 years ago

JJ-Author commented 5 years ago

See http://mappings.dbpedia.org/server/extraction/sr/extract?revid=19189789&format=trix&extractors=custom

The extraction framework outputs the following IRI for this resource: http://sr.dbpedia.org/resource/Project_talk:Администраторска_табла

However, the actual namespace (and Wikipedia article) name is https://sr.wikipedia.org/wiki/Разговор_о_Википедији:Администраторска_табла

Through http://sr.wikipedia.org/wiki/Project_talk:Администраторска_табла you still get redirected to the full Cyrillic name, so it is not critical.
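To make the difference concrete, here is a minimal sketch of the two ways the subject IRI could be built. The namespace table and the `subject_iri` helper are hypothetical and not part of the extraction framework; only the two namespace names and the `http://sr.dbpedia.org/resource/` base come from the URLs above (namespace ID 5 is MediaWiki's standard "Project talk" namespace).

```python
# Hypothetical namespace table for the sr chapter. The two names are taken from
# the URLs in this report; the table layout itself is an illustrative assumption.
SR_NAMESPACES = {
    5: {"canonical": "Project_talk", "local": "Разговор_о_Википедији"},
}


def subject_iri(ns_id, title, use_local=True,
                base="http://sr.dbpedia.org/resource/"):
    """Build a subject IRI from a namespace ID and a page title.

    use_local=True picks the localized namespace name,
    use_local=False picks the canonical (English) one.
    """
    names = SR_NAMESPACES[ns_id]
    prefix = names["local"] if use_local else names["canonical"]
    # MediaWiki-style titles use underscores instead of spaces.
    return f"{base}{prefix}:{title.replace(' ', '_')}"


title = "Администраторска табла"
print(subject_iri(5, title, use_local=False))  # the form currently emitted
print(subject_iri(5, title, use_local=True))   # the form matching the wiki page name
```

With `use_local=False` the helper reproduces the IRI from the extraction output; with `use_local=True` it yields the prefix that matches the Wikipedia page name.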

jimkont commented 5 years ago

Not 100% sure, but maybe you need to re-run the settings generation to get this fixed. IIRC this updates all the Wikipedia-related configuration.

see https://github.com/dbpedia/extraction-framework/wiki/Extraction-Instructions#generate-settings

m1ci commented 4 years ago

@Vehnem are these settings used by marvin-config?

m1ci commented 4 years ago

Maybe a test would be nice to have, to catch this type of problem in the future. cc @Vehnem
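A rough sketch of what such a check could look like, written as plain pytest-style functions. Everything here is an assumption for illustration: `canonical_prefix` is a hypothetical helper, not an existing framework function, and the prefix set only contains the one name seen in this issue. A real test would presumably live in the framework's own Scala test suite and read the generated namespace settings.

```python
# Canonical (English) namespace names that should not appear in sr subject IRIs
# when a localized name exists. Only the name from this issue is listed here;
# further entries would have to come from the generated settings.
CANONICAL_PREFIXES = {"Project_talk"}

SR_BASE = "http://sr.dbpedia.org/resource/"


def canonical_prefix(iri, base=SR_BASE):
    """Return the canonical namespace prefix used in the IRI, or None."""
    if not iri.startswith(base):
        return None
    local_name = iri[len(base):]
    # Split "Prefix:Title" at the first colon; sep is empty if there is none.
    prefix, sep, _ = local_name.partition(":")
    return prefix if sep and prefix in CANONICAL_PREFIXES else None


def test_reported_iri_is_flagged():
    iri = SR_BASE + "Project_talk:Администраторска_табла"
    assert canonical_prefix(iri) == "Project_talk"


def test_localized_iri_passes():
    iri = SR_BASE + "Разговор_о_Википедији:Администраторска_табла"
    assert canonical_prefix(iri) is None
```

Run against the extractor output for a handful of known pages per language, a check like this would fail as soon as a canonical prefix leaks into a subject IRI.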