dbpedia / mappings-tracker

This project is used for tracking mapping issues in mappings.dbpedia.org

don't nest IntermediateNodes #53

Open VladimirAlexiev opened 9 years ago

VladimirAlexiev commented 9 years ago

@jplu: I don't think the extractor supports nesting of intermediate nodes (and I haven't yet seen a need for such).

You use such nesting, e.g. at http://mappings.dbpedia.org/index.php/Mapping_fr:Infobox_Territoire (are there any other mappings that do this?)


 | pib                    = 23,6 Mrd USD (2011 est.)
 | pib_hab                = 4,452 USD (2011 est.)

But in the extracted data http://mappings.dbpedia.org/server/extraction/fr/extract?title=Gouvernement%20régional%20du%20Kurdistan&revid=&format=turtle-triples&extractors=custom

   <http://fr.dbpedia.org/resource/Gouvernement_régional_du_Kurdistan__3> .
  a <http://dbpedia.org/ontology/GrossDomesticProduct> ;
   "24"^^<http://www.w3.org/2001/XMLSchema#nonNegativeInteger> .

there's only PIB and not PIB_hab.

Also, as you can see, the PIB data is very poor. If you'd like, open a separate issue for that and I'll try to help.

Nono314 commented 9 years ago

Both grossDomesticProduct and grossDomesticProductNominalPerCapita being properties of PopulatedPlace, I'm not even totally sure the nesting was on purpose...

But there are a lot of problems here... Namely:

jplu commented 9 years ago

Thanks for having fixed this!

Nono314 commented 9 years ago

I fixed the 2 properties in the ontology so that mappings in other languages work again. I also fixed Territoire and Pays (the latter also had intermediate nodes for names in other languages). Ancienne entité territoriale and Ville de Chine have not been addressed yet.

I also made some additions in https://github.com/dbpedia/extraction-framework/pull/363 that should help us get the right order of magnitude for the GDP values.
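
To illustrate the order-of-magnitude problem with a value like `23,6 Mrd USD` being extracted as `24`, here is a minimal Python sketch. It is an illustration only, not the code from that pull request: the function name `parse_fr_amount` and the `MAGNITUDES` table are hypothetical, and it assumes the comma is a decimal separator and that an optional magnitude word follows the number.

```python
from decimal import Decimal

# Hypothetical multiplier table for French magnitude words (assumption, not exhaustive).
MAGNITUDES = {
    "mrd": 10**9,
    "milliard": 10**9,
    "milliards": 10**9,
    "million": 10**6,
    "millions": 10**6,
}


def parse_fr_amount(text):
    """Parse strings like '23,6 Mrd USD (2011 est.)' into (value, unit).

    Assumes the first token is a number with a decimal comma, optionally
    followed by a magnitude word, then a unit. Anything after the unit
    (such as a year in parentheses) is ignored.
    """
    tokens = text.split()
    if not tokens:
        raise ValueError("empty amount")
    # Treat the comma as a decimal separator (French convention).
    value = Decimal(tokens[0].replace(",", "."))
    rest = tokens[1:]
    if rest:
        key = rest[0].lower().rstrip(".")
        if key in MAGNITUDES:
            # Scale by the magnitude word instead of dropping it.
            value *= MAGNITUDES[key]
            rest = rest[1:]
    unit = rest[0] if rest else None
    return value, unit


value, unit = parse_fr_amount("23,6 Mrd USD (2011 est.)")
print(value, unit)
```

Note that a value such as `4,452 USD` is ambiguous under this rule, since the comma could be either a decimal or a thousands separator; the sketch does not try to resolve that.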

VladimirAlexiev commented 9 years ago

Thanks for catching this!

@Nono314, @jimkont : We need a best practice on mapping Measurements (also https://github.com/dbpedia/mappings-tracker/issues/42). Proposal:

Do you agree?

Nono314 commented 9 years ago

I was also considering a more generic class instead of all the specialized ones created by Julien but:

I think almost all cases also exist in other languages and are mapped using simple properties. So I would stick with your first proposal: stay consistent with what already exists. In the longer term, it would indeed be nice to be able to attach a year or a qualifier to the value.

As a side note, the current bug https://github.com/dbpedia/extraction-framework/issues/364 is also a good reason not to use intermediate nodes so frequently when they have many properties that may not be filled.

VladimirAlexiev commented 9 years ago

@Nono314 I can't grasp your two bullets above, please elaborate

Nono314 commented 9 years ago

Maybe I was sticking too closely to the current implementation...

My assumption was that the value property would be xsd:float with the actual unit stored in the unit property, basically as they appear in the wiki content.


For all this you would have to rely on each mapping explicitly setting at least a dimension, in which case the conversion could actually still happen.

I totally agree with your second point.