openbudgets / pipeline-fragments

Reusable fragments of LinkedPipes ETL pipelines

Question regarding partial property #19

Closed fathoni closed 7 years ago

fathoni commented 7 years ago

Hi @marek-dudas, I was transforming the Madrid 2017 dataset from FDP to RDF. You can view my mapping on OS packager here. I found the following snippet in the output of the FDP2RDF transformation:

...
<http://data.openbudgets.eu/resource/dataset/test_budgetary_central_government_expense/observation/6> <http://data.openbudgets.eu/ontology/dsd/test_budgetary_central_government_expense/dimension/administrative-classification> <http://data.openbudgets.eu/resource/dataset/test_budgetary_central_government_expense/administrative-classification/1> .
<http://data.openbudgets.eu/resource/dataset/test_budgetary_central_government_expense/administrative-classification/1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2004/02/skos/core#Concept> .
<http://data.openbudgets.eu/resource/dataset/test_budgetary_central_government_expense/administrative-classification/1> <http://www.w3.org/2004/02/skos/core#prefLabel> "1"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://data.openbudgets.eu/resource/dataset/test_budgetary_central_government_expense/administrative-classification/1> <http://www.w3.org/2004/02/skos/core#inScheme> <http://data.openbudgets.eu/resource/test_budgetary_central_government_expense/codelist/administrative-classification> .
<http://data.openbudgets.eu/resource/test_budgetary_central_government_expense/codelist/administrative-classification> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2004/02/skos/core#ConceptScheme> .
<http://data.openbudgets.eu/resource/test_budgetary_central_government_expense/codelist/administrative-classification> <http://www.w3.org/2004/02/skos/core#hasTopConcept> <http://data.openbudgets.eu/resource/dataset/test_budgetary_central_government_expense/administrative-classification/1> .
<http://data.openbudgets.eu/resource/dataset/test_budgetary_central_government_expense/economic-classification/1-11600> <http://data.openbudgets.eu/ontology/dsd/test_budgetary_central_government_expense/partialproperty/Capitulo> "1"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://data.openbudgets.eu/resource/dataset/test_budgetary_central_government_expense/economic-classification/1-11600> <http://data.openbudgets.eu/ontology/dsd/test_budgetary_central_government_expense/partialproperty/Descripcion_Capitulo> "IMPUESTOS DIRECTOS"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://data.openbudgets.eu/resource/dataset/test_budgetary_central_government_expense/economic-classification/1-11600> <http://data.openbudgets.eu/ontology/dsd/test_budgetary_central_government_expense/partialproperty/Economico> "11600"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://data.openbudgets.eu/resource/dataset/test_budgetary_central_government_expense/economic-classification/1-11600> <http://data.openbudgets.eu/ontology/dsd/test_budgetary_central_government_expense/partialproperty/Descripcion_Economico> "IMPUESTO S/ EL INCREMENTO DE VALOR DE LOS TERRENOS"^^<http://www.w3.org/2001/XMLSchema#string> .
<http://data.openbudgets.eu/resource/dataset/test_budgetary_central_government_expense/observation/6> <http://data.openbudgets.eu/ontology/dsd/test_budgetary_central_government_expense/dimension/economic-classification> <http://data.openbudgets.eu/resource/dataset/test_budgetary_central_government_expense/economic-classification/1-11600> .
... 

Regarding this:

  1. What is the meaning of partialproperty?
  2. As for the mapping of the classification description, wouldn't it be better to map it using skos:prefLabel instead of partialproperty/Descripcion_Capitulo ?
marek-dudas commented 7 years ago

skos:prefLabel is used to map FDP attributes that carry the labelfor property. If that property is missing, the pipeline cannot determine a more specific meaning for the FDP attribute, and that is when a "partialproperty" is created. It is named this way to distinguish it from dimension properties. These "partial" properties link the dimension value objects to literal values.
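As a rough illustration of the rule described above, the choice of predicate could be sketched as follows. This is not the pipeline's actual code (the pipeline is built from LinkedPipes ETL fragments); the function name `predicate_for_attribute` and the dict-shaped attribute are assumptions made for the example. Only the URI pattern is taken from the snippet in the question.

```python
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
DSD_BASE = "http://data.openbudgets.eu/ontology/dsd/"


def predicate_for_attribute(dataset: str, source_column: str, attribute: dict) -> str:
    """Pick the RDF predicate for one FDP attribute (hypothetical helper).

    `attribute` is assumed to be the attribute's definition from the FDP
    descriptor, e.g. {"source": "...", "labelfor": "..."}.
    """
    if attribute.get("labelfor"):
        # The attribute is declared as a label for another attribute.
        return SKOS_PREF_LABEL
    # No more specific meaning is known: fall back to a dataset-specific
    # "partial" property named after the source column.
    return f"{DSD_BASE}{dataset}/partialproperty/{source_column}"


dataset = "test_budgetary_central_government_expense"
without_labelfor = predicate_for_attribute(
    dataset, "Descripcion_Capitulo", {"source": "Descripcion_Capitulo"}
)
with_labelfor = predicate_for_attribute(
    dataset,
    "Descripcion_Capitulo",
    {"source": "Descripcion_Capitulo", "labelfor": "code"},
)
```

Without `labelfor`, the result is the same `partialproperty/Descripcion_Capitulo` URI that appears in the snippet in the question; with it, the result is `skos:prefLabel`.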

fathoni commented 7 years ago

Could you specify which particular mapping would be mapped to skos:prefLabel based on the OS packager hierarchy? I have been using Description (administrative-classification ❯ generic ❯ description), Description Name (economic-classification ❯ generic ❯ level1 ❯ description), and Description (economic-classification ❯ generic ❯ description). Please let me know if these are not the proper mappings.

pwalsh commented 7 years ago

@fathoni those UI concepts you refer to are "OS Data Types", which are a "semantic" abstraction on top of the core, generic FDP. As discussed recently here https://github.com/openbudgets/pipeline-fragments/issues/18#issuecomment-282656434 @HimmelStein was to work on a mapping of these types to the OBEU types after our summit in Thessaloniki last year. I do not know the status of that work, but without it, it is unlikely that the mapping works as you are expecting.

fathoni commented 7 years ago

I see. @HimmelStein made the draft here; it seems that it needs to be continued for the FDP2RDF transformation task.

HimmelStein commented 7 years ago

Hi, I made that mapping doc before the Thessaloniki meeting last year. During the meeting, the automatic transformation task was split into two sub-tasks: (1) xml->fdp (UBonn) and (2) fdp->rdf (UEP). As far as I know, fdp->rdf was developed by @marek-dudas alone. I remember suggesting during the meeting that the mapping between OS data types and OBEU data types be separated from the processing, so that updating the mapping does not require updating the processing part. @marek-dudas, we need to know which mapping is encoded in the current fdp->rdf pipeline. Should we improve this work together?

marek-dudas commented 7 years ago

So the trick is specifying the label column as "Display name" instead of "Description" in the OS-Packager. Then the labelfor property I mentioned earlier gets created in the dimension definition in the FDP descriptor, and the pipeline should map it to skos:prefLabel. However, I now see that this is implemented for hierarchical classifications (i.e. several columns, each identifying a different level of the classification), but not for simple single-level classifications. I will fix that.
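For reference, a dimension definition carrying labelfor might look roughly like the sketch below, written as a Python dict. The attribute names `code` and `label` and the exact layout are assumptions based on the Fiscal Data Package dimension model, not copied from an actual OS-Packager export; the column names come from the snippet in the question. The helper `label_attributes` is hypothetical and only shows how such attributes could be detected.

```python
# Assumed shape of one dimension in the FDP descriptor's "model.dimensions".
economic_classification = {
    "dimensionType": "classification",
    "primaryKey": ["code"],
    "attributes": {
        "code": {"source": "Economico"},
        # Created when the column is tagged as "Display name":
        "label": {"source": "Descripcion_Economico", "labelfor": "code"},
    },
}


def label_attributes(dimension: dict) -> dict:
    """Map each label attribute to the attribute it labels (hypothetical helper)."""
    return {
        name: attr["labelfor"]
        for name, attr in dimension.get("attributes", {}).items()
        if "labelfor" in attr
    }


labels = label_attributes(economic_classification)
```

Here `labels` maps `label` to `code`, which is the link the pipeline would need in order to emit skos:prefLabel instead of a partial property.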

marek-dudas commented 7 years ago

The issue described above should now be fixed in the latest version of the pipeline. Columns specified as "Display name" for classification dimensions should be transformed into skos:prefLabel values even for non-hierarchical dimensions.