gaurav / extraction-framework

The software used to extract structured data from Wikipedia

Update MappingExtractor to use the Commons namespaces and try mapping a few templates #7

Closed gaurav closed 10 years ago

gaurav commented 10 years ago

Start with the most popular templates that might provide interesting metadata. The goal here is to prepare a dataset showcasing the sort of data we might be able to provide by the end of this project to the Commons community, who I hope to e-mail by the end of this week; so think about simple templates with rich metadata whose RDF can be immediately put to work.

gaurav commented 10 years ago

I've created a template at http://mappings.dbpedia.org/index.php/Mapping_commons:Authority_control (stole it from Mapping_en:Authority_control) and I'm trying to get the MappingExtractor to work now.

gaurav commented 10 years ago

Bear in mind that the MappingExtractor cannot handle multiple languages within the same template at the moment, so something like {{VN|en=in English|fr=in French}} cannot currently be handled automatically.
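To make the limitation concrete, here is a minimal sketch of what handling such a template would involve: splitting a flat call like {{VN|en=in English|fr=in French}} into one language-tagged value per parameter. This is not how the MappingExtractor works; `split_language_params` and `LANG_KEY` are hypothetical names, and the sketch assumes that every parameter whose key looks like a language code holds text in that language, with no nested templates or piped links inside the values.

```python
import re

# Hypothetical helper, NOT part of the DBpedia extraction framework.
# Assumption: a parameter key shaped like a language code ("en", "fr",
# "pt-br") marks a value written in that language.
LANG_KEY = re.compile(r"^[a-z]{2,3}(-[a-z0-9]+)*$")


def split_language_params(wikitext):
    """Return (template_name, {lang: text}) for one flat template call.

    Nested templates and piped links inside values are not handled.
    """
    inner = wikitext.strip()
    if not (inner.startswith("{{") and inner.endswith("}}")):
        raise ValueError("expected a single {{...}} template call")
    parts = inner[2:-2].split("|")
    name = parts[0].strip()
    labels = {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        key = key.strip()
        # Skip positional parameters and keys that are not language-like.
        if sep and LANG_KEY.match(key):
            labels[key] = value.strip()
    return name, labels


name, labels = split_language_params("{{VN|en=in English|fr=in French}}")
for lang, text in labels.items():
    # Print each value in an RDF-style language-tagged literal form.
    print(f'{name}: "{text}"@{lang}')
```

A mapping that binds one template property to one value has no place for this per-language fan-out, which is why templates like VN would need special handling.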

jimkont commented 10 years ago

The commons stats are now deployed online ;) http://mappings.dbpedia.org/server/statistics/commons/

gaurav commented 10 years ago

Awesome -- and look what I found in the top-100 list! http://mappings.dbpedia.org/server/templatestatistics/commons/?template=Coleoptera

jimkont commented 10 years ago

This is fixed with dbpedia/extraction-framework#242. It won't be deployed soon, but keep working on the mappings.

If we go this way, maybe we should create a category for all the licence mappings.

gaurav commented 10 years ago

Not enough time left, closing.