openbudgets / Code-lists

Code list in fiscal data sets

Inconsistent normalization of concept notations #18

Closed jindrichmynarz closed 8 years ago

jindrichmynarz commented 8 years ago

Both CPA and CPC originally use concept notations that contain dots as separators (e.g., 99.00.00 in CPC and 99.00.10 in CPA). However, the SKOS versions use different normalizations of notations for these code lists. While in CPA the dots in notations are preserved, CPC removes them.

I think notations should be normalized only if they are used as part of IRIs. They should be preserved in their original form as objects of skos:notation. For the sake of ease of linking from non-RDF data that contains these codes, it would be better if the normalization was consistent. Ideally, there should be a simple normalization algorithm that is applied consistently to all code lists, so that linking code lists automatically based on notations can also use this algorithm.
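To make the idea concrete, here is a minimal sketch of such a normalization in Python. It assumes the rule is "drop every character that is not an ASCII letter or digit"; that rule, the function names, and the `http://example.org/` base IRI are illustrative assumptions, not what the code lists currently do. The original notation is kept untouched for skos:notation and only the IRI uses the normalized form.

```python
import re

# Hypothetical base IRI, used only for illustration.
BASE_IRI = "http://example.org/codelist/"
SKOS_NOTATION = "http://www.w3.org/2004/02/skos/core#notation"


def normalize_notation(notation: str) -> str:
    """Remove separators (anything that is not an ASCII letter or digit)."""
    return re.sub(r"[^0-9A-Za-z]", "", notation)


def concept_iri(scheme: str, notation: str) -> str:
    """Build a concept IRI from the normalized notation."""
    return f"{BASE_IRI}{scheme}/{normalize_notation(notation)}"


def notation_triple(scheme: str, notation: str) -> str:
    """Emit an N-Triples line that keeps the notation in its original form."""
    return f'<{concept_iri(scheme, notation)}> <{SKOS_NOTATION}> "{notation}" .'


print(notation_triple("cpa", "99.00.10"))
```

Non-RDF data holding either the dotted or the undotted code could then be linked by running the same `normalize_notation` on its values before building the IRI.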

skarampatakis commented 8 years ago

The notations are in the original form "for the sake of ease of linking from non-RDF data that contains these codes", as you can see in

http://ec.europa.eu/eurostat/ramon/nomenclatures/index.cfm?TargetUrl=LST_CLS_DLD&StrNom=CPC_2&StrLanguageCode=EN&StrLayoutCode=HIERARCHIC#

and,

http://ec.europa.eu/eurostat/ramon/nomenclatures/index.cfm?TargetUrl=LST_CLS_DLD&StrNom=CPA_2_1&StrLanguageCode=EN&StrLayoutCode=HIERARCHIC#

Please let me know if there is something else you had in mind.

jindrichmynarz commented 8 years ago

I see. It seems that a colleague at UEP who prepared the mappings between CPA and CPC did not use the original form of the notations. I will fix this when converting the mappings to RDF.