dbpedia / mappings-tracker

This project is used for tracking mapping issues in mappings.dbpedia.org

don't nest IntermediateNodes #53

Open VladimirAlexiev opened 9 years ago

VladimirAlexiev commented 9 years ago

@jplu: I don't think the extractor supports nesting of intermediate nodes (and I haven't yet seen a need for such).

You use such nesting, e.g., at http://mappings.dbpedia.org/index.php/Mapping_fr:Infobox_Territoire (are there any other mappings?)

https://fr.wikipedia.org/w/index.php?title=Gouvernement_régional_du_Kurdistan&action=edit

 | pib                    = 23,6 Mrd USD (2011 est.)
 | pib_hab                = 4,452 USD (2011 est.)

But in the extracted data http://mappings.dbpedia.org/server/extraction/fr/extract?title=Gouvernement%20régional%20du%20Kurdistan&revid=&format=turtle-triples&extractors=custom

<http://fr.dbpedia.org/resource/Gouvernement_régional_du_Kurdistan>
 <http://dbpedia.org/ontology/grossDomesticProduct>
   <http://fr.dbpedia.org/resource/Gouvernement_régional_du_Kurdistan__3> .
<http://fr.dbpedia.org/resource/Gouvernement_régional_du_Kurdistan__3>
  a <http://dbpedia.org/ontology/GrossDomesticProduct> ;
  <http://dbpedia.org/ontology/value>
   "24"^^<http://www.w3.org/2001/XMLSchema#nonNegativeInteger> .

there's only PIB and not PIB_hab.

Also, as you can see, the PIB data is very poor: the infobox says 23,6 Mrd USD but the extracted value is "24". If you'd like, make a separate issue for that and I'll try to help.

Nono314 commented 9 years ago

Both grossDomesticProduct and grossDomesticProductNominalPerCapita being properties of PopulatedPlace, I'm not even totally sure the nesting was on purpose...

But there are a lot of problems here... Namely:

jplu commented 9 years ago

Thanks for fixing this!

Nono314 commented 9 years ago

I fixed the 2 properties in the ontology so that mappings in other languages work again. I also fixed Territoire and Pays (the latter also had intermediate nodes for names in other languages). There are still Ancienne entité territoriale and Ville de Chine not yet addressed.

I also did some additions in https://github.com/dbpedia/extraction-framework/pull/363 that should help us get the right order of magnitude for the GDP values.
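To illustrate the order-of-magnitude problem (the infobox says `23,6 Mrd USD`, the extracted value is `"24"`), here is a minimal Python sketch of scaling a French-formatted amount by its magnitude word. It is not the code from the pull request above. The function name `parse_french_amount` and the multiplier table are assumptions made purely for illustration, as is the rule that a comma is always a decimal separator.

```python
import re
from decimal import Decimal

# Hypothetical multiplier table; the abbreviations actually handled by the
# extraction framework may differ.
MULTIPLIERS = {
    "mrd": Decimal(10) ** 9,        # milliard(s)
    "milliard": Decimal(10) ** 9,
    "milliards": Decimal(10) ** 9,
    "mio": Decimal(10) ** 6,        # million(s)
    "million": Decimal(10) ** 6,
    "millions": Decimal(10) ** 6,
}

# A number with an optional decimal part, optionally followed by one word.
_AMOUNT = re.compile(r"(\d+(?:[.,]\d+)?)\s*([A-Za-z]+)?")


def parse_french_amount(text):
    """Return the first number in `text`, scaled by a magnitude word if one follows.

    Assumes the comma is a decimal separator, as is usual in French infoboxes.
    Returns None when no number is found.
    """
    match = _AMOUNT.search(text)
    if match is None:
        return None
    number = Decimal(match.group(1).replace(",", "."))
    word = (match.group(2) or "").lower()
    # Unknown words (for example a currency code) leave the number unscaled.
    return number * MULTIPLIERS.get(word, Decimal(1))
```

With this sketch, `parse_french_amount("23,6 Mrd USD (2011 est.)")` yields 23.6 billion. Note that `4,452 USD` is ambiguous under the decimal-comma assumption: the sketch reads it as 4.452, whereas the infobox author may well have meant 4452.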

VladimirAlexiev commented 9 years ago

Thanks for catching this!

@Nono314, @jimkont : We need a best practice on mapping Measurements (also https://github.com/dbpedia/mappings-tracker/issues/42). Proposal:

Do you agree?

Nono314 commented 9 years ago

I was also considering a more generic class instead of all the specialized ones created by Julien but:

I think almost all cases also exist in other languages and are mapped using simple properties. So I would stick with your 1st proposal: stay consistent with what already exists. In the longer term, it would indeed be nice to be able to attach a year or a qualifier to the value.

As a side note, this current bug https://github.com/dbpedia/extraction-framework/issues/364 is also a good reason not to make such frequent use of intermediate nodes with many properties that may not be filled.

VladimirAlexiev commented 9 years ago

@Nono314 I can't grasp your two bullets above, please elaborate

Nono314 commented 9 years ago

Maybe I was sticking too closely to the current implementation...

My assumption was that the value property would be xsd:float with the actual unit stored in the unit property, basically as they appear in the wiki content.
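A rough sketch of the shape this assumption implies, written in Python that emits N-Triples lines. The `__N` node naming follows the extracted data quoted earlier in this thread and `http://dbpedia.org/ontology/value` appears there too, but the `unit` property IRI and the helper name `measurement_triples` are hypothetical and only serve the illustration.

```python
DBO = "http://dbpedia.org/ontology/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def measurement_triples(subject, prop, node_suffix, value, unit):
    """Build N-Triples lines for one intermediate measurement node.

    The value is kept as written in the wiki content (no unit conversion),
    and the unit is stored next to it as a plain literal.
    """
    node = f"{subject}__{node_suffix}"
    return [
        f"<{subject}> <{DBO}{prop}> <{node}> .",
        f'<{node}> <{DBO}value> "{value}"^^<{XSD}float> .',
        # Hypothetical property IRI: the thread only speaks of "the unit property".
        f'<{node}> <{DBO}unit> "{unit}" .',
    ]
```

For example, `measurement_triples("http://fr.dbpedia.org/resource/X", "grossDomesticProduct", 3, 23.6, "Mrd USD")` returns three lines: the link from the resource to the node, the float value, and the unit literal.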

Currently:

For all of this you would have to rely on each mapping explicitly setting at least a dimension, in which case the conversion could actually still happen.

I totally agree with your second point.