dbpedia / mappings-tracker

This project is used for tracking mapping issues in mappings.dbpedia.org

Clean-up property mappings with 0 occurrences #58

Open jimkont opened 9 years ago

jimkont commented 9 years ago

There are old property mappings with 0 occurrences. For example, http://mappings.dbpedia.org/server/templatestatistics/en/?template=Infobox_settlement lists the following:

na  LandArea_sq_mi
na  MetroArea_sq_mi
na  TotalArea_sq_mi
na  UrbanArea_sq_mi
na  WaterArea_sq_mi
na  area_land
na  area_metro
na  area_total
na  area_urban
na  area_water
na  pop_est
na  population_density
na  population_density_metro_mi2
na  population_density_mi2
na  population_density_urban_mi2
na  utc_offset1_DST
na  utc_offset2_DST

All of these should be removed. Keeping them does no harm, but they add nothing and make the mapping pages too long. If we get a list, we can do this automatically with a bot.
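As a rough illustration of what the bot's text-editing step could look like, here is a minimal sketch. It assumes the mapping page uses flat `{{PropertyMapping | templateProperty = ... | ontologyProperty = ... }}` entries with no nested templates inside them; the function name `drop_mappings` and the sample ontology properties are made up for the example, and fetching or saving the wiki page is left out entirely.

```python
import re

# One flat {{PropertyMapping | ... }} entry, including its own line's whitespace.
# Assumption: no nested templates inside the entry (nested ones are left untouched).
PROPERTY_MAPPING = re.compile(
    r"[ \t]*\{\{\s*PropertyMapping\s*\|([^{}]*)\}\}[ \t]*\n?"
)
# The value of templateProperty, up to the next "|" or the end of the entry.
TEMPLATE_PROPERTY = re.compile(r"templateProperty\s*=\s*([^|}]+?)\s*(?:\||$)")


def drop_mappings(wikitext, unused):
    """Remove PropertyMapping entries whose templateProperty is in `unused`."""
    unused = set(unused)

    def replace(match):
        prop = TEMPLATE_PROPERTY.search(match.group(1))
        if prop and prop.group(1) in unused:
            return ""  # drop the whole entry
        return match.group(0)  # keep everything else unchanged

    return PROPERTY_MAPPING.sub(replace, wikitext)


# Illustrative mapping text only, not copied from the real page.
page = (
    "{{TemplateMapping\n"
    "| mapToClass = Settlement\n"
    "| mappings =\n"
    "  {{PropertyMapping | templateProperty = area_total | ontologyProperty = areaTotal }}\n"
    "  {{PropertyMapping | templateProperty = name | ontologyProperty = foaf:name }}\n"
    "}}\n"
)
cleaned = drop_mappings(page, ["area_total", "pop_est"])
print(cleaned)
```

Entries that contain nested templates are deliberately skipped rather than guessed at, so those would still need a manual pass.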

Nono314 commented 9 years ago

Some mappings in need of a serious cleanup:

Those have obviously been initialized with a boilerplate mapping encompassing all properties of PopulatedPlace (as were many other mappings), but they have never been fixed since then, although they have had a number of minor changes.

VladimirAlexiev commented 9 years ago

To make a list, does one need to write a program that calls stats on every template and collects the "na"?
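The collecting step could be sketched roughly as follows. This assumes the statistics for one template have already been obtained as plain text in the two-column form quoted above (a count or "na", then the property name); how to get that text from the server is not covered, and `collect_unused` and the sample rows are hypothetical.

```python
def collect_unused(stats_text):
    """Return the property names whose first column reads 'na'."""
    unused = []
    for line in stats_text.splitlines():
        parts = line.split()
        # Expect exactly two columns: occurrence marker, then property name.
        if len(parts) == 2 and parts[0] == "na":
            unused.append(parts[1])
    return unused


# Made-up sample rows in the same layout as the listing in this issue.
sample = """\
na  area_total
12  population_total
na  utc_offset1_DST
"""
print(collect_unused(sample))
```

Running this per template and concatenating the results would give the list mentioned above.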

jimkont commented 9 years ago

Something like that. It should be easy to do this from the server module, but I haven't had time to look into it so far.

Nono314 commented 9 years ago

https://github.com/dbpedia/extraction-framework/pull/367 should help by identifying affected templates directly in the template statistics list, without having to go through each one to find out.

@jimkont Yes, the mapping server already holds everything needed to list them all together.

Nono314 commented 9 years ago

Note that properties should be checked before their mappings are removed. It seems some properties are flagged as not found when they should not be.

@jimkont There was this report where you answered that the template had changed, but it seems the OP was right: the properties exist in both the template definition and its uses (for example Abraham Lincoln, the first entry in "What links here").