linked-statistics / COOS

Core ontology for official statistics
Creative Commons Attribution 4.0 International

Finalization of the specification #42

Closed FranckCo closed 2 years ago

FranckCo commented 3 years ago

Help is needed on the following

JALinnerud commented 3 years ago

Franck, I am happy to give input on any existing text in the specification, but I would have trouble starting alone on anything that is empty.

ChLaaboudi commented 3 years ago

Dear Franck, I can contribute to the global review. Could you send me the link to the spec? Christine

ChLaaboudi commented 3 years ago

Hi @FranckCo, in my opinion it would be clearer to distinguish between the overview (provided in Section 5) and the specifications. I would propose to organise the specifications section as follows:

Is there a way to integrate comments directly into the documents?

Best regards,

Christine

FranckCo commented 2 years ago

@flo7894: Introduction, using HLG-MOS product description
@FranckCo: Overview & improve text
@FlavioRizzolo: Conclusion?

FranckCo commented 2 years ago

Schedule:

FranckCo commented 2 years ago

Remaining at start of expert review

ChLaaboudi commented 2 years ago

Products

If we assume that a coos:StatisticalDataset represents the metadata (as a sub-class of dcat:Dataset), and that the types of StatisticalDataset (DimensionalDataset, KeyValueDataset, TransposedDataset, etc.) contain the statistical data, then, since several of these types can be used to structure the same data, a coos:StatisticalDataset might distribute (dcat:distribution) different types (1, *) of statistical data (DimensionalDataset, KeyValueDataset, TransposedDataset, etc.) and their presentations (infographic, visualisation, publication).

At the same time, we should clarify the relation between coos:metadataFor (domain coos:StatisticalDataset) and dcat:distribution (domain dcat:Dataset).
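One possible reading of this proposal can be sketched as plain subject-predicate-object triples. This is only an illustration under stated assumptions: the `COOS` and `EX` namespaces are placeholders (not the published COOS IRIs), the `population-2020` dataset and its two distributions are invented, and coos:metadataFor is deliberately left out because its relation to dcat:distribution is the open question.

```python
# Minimal sketch using only the standard library; triples are plain tuples.
# ASSUMPTION: COOS and EX are placeholder namespaces, not published IRIs.
COOS = "http://example.org/coos#"
EX = "http://example.org/data/"
DCAT = "http://www.w3.org/ns/dcat#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def build_graph():
    """Return a set of triples for one hypothetical statistical dataset."""
    triples = set()
    # Schema level: the metadata record type is a kind of dcat:Dataset.
    triples.add((COOS + "StatisticalDataset", RDFS + "subClassOf", DCAT + "Dataset"))
    # Instance level: one metadata record (hypothetical dataset)...
    dataset = EX + "population-2020"
    triples.add((dataset, RDF + "type", COOS + "StatisticalDataset"))
    # ...with several structures of the same data attached as distributions.
    for name in ("dimensional", "key-value"):
        distribution = f"{dataset}/{name}"
        triples.add((dataset, DCAT + "distribution", distribution))
        triples.add((distribution, RDF + "type", DCAT + "Distribution"))
    return triples


def distributions_of(triples, dataset):
    """List the distributions attached to a dataset, sorted by IRI."""
    return sorted(o for s, p, o in triples if s == dataset and p == DCAT + "distribution")


def to_ntriples(triples):
    """Serialise the triples in N-Triples form (all terms here are IRIs)."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(triples))


if __name__ == "__main__":
    print(to_ntriples(build_graph()))
```

In this reading the (1, *) cardinality shows up as one dataset node with one or more dcat:distribution links. Whether the structured data types should be modelled as distributions or as sub-classes of StatisticalDataset is exactly what would need to be settled.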

Organisations

Regarding "shared global identifier for each of these organism", I would propose to follow the rules set up by the SDMX Guidelines for the Creation and Management of Code List CL_Organisation: https://sdmx.org/?page_id=3215

Activities

I would propose to add a subsection (below Adding GSIM and CSDA), <Relations between GSBPM and GAMSO>, that covers the coos:supports and coos:uses properties.

FlavioRizzolo commented 2 years ago

I restructured the Intro with pieces from the conclusion and some narrative glue. I also separated Motivation from Background. The result is the following:

Motivation

This paper introduces COOS, the Core Ontology for Official Statistics. The main purpose of COOS is to serve as an integration model for the core set of ModernStats standards, backed by elements of well-known standard vocabularies. These ModernStats standards, mostly developed under the auspices of the UNECE High-Level Group for the Modernisation of Official Statistics, include the Generic Statistical Business Process Model (GSBPM), the Generic Activity Model for Statistical Organisations (GAMSO), the Generic Statistical Information Model (GSIM), and the Common Statistical Data Architecture (CSDA).

As more statistical offices turn to semantic standards to formalize their data and metadata, it has become necessary to establish common foundations on which the different standards can develop coherently, using a formal framework that allows interoperability, machine-actionability and globally unique identification.

ModernStats standards have been developed independently over the course of more than a decade by a diverse group of specialists with different viewpoints, stakeholders and ideas. This created misalignments and impedance mismatches between the underlying models, which should otherwise work well together and complement each other: information objects (GSIM) describe the data and metadata necessary to produce statistics, while capabilities (CSDA) are the essential building blocks enabling activities (GAMSO) to be implemented via business processes (GSBPM).

COOS defines a conceptual integration framework to provide semantic coherence across these models, based on a common vocabulary of terms and definitions and a well-defined set of inter- and intra-model relationships formalized in RDF/OWL using standard vocabularies, e.g. SKOS, PROV, DCAT, DC and ORG. COOS provides a powerful mechanism to describe complex aspects of statistical production in support of business discussions and technical solution implementations.

Model management is a big part of the standards integration story: the underlying models evolve, and to maintain alignment COOS needs to evolve with them. COOS includes an initial governance framework complementing each model’s own governance processes. This governance framework includes a core set of principles and a process for managing change.

Background

GSBPM provides a framework to describe the building blocks of statistical production in terms of sub-processes. Its main goal is to help statistical organizations standardize their statistical production processes. It was the first ModernStats model to be published, back in 2008, and has been widely used by national and international statistical agencies since then.

GAMSO provides a framework to describe the building blocks of statistical production in terms of activities. It complements the GSBPM in two ways: (i) by covering areas beyond the scope of GSBPM, and (ii) by providing a business capability view of statistical production itself.

GSIM complements both GSBPM and GAMSO by providing a catalogue of information objects to describe statistical data and metadata. It functions as a reference framework consisting of a set of standardised information objects to be used in statistical production.

CSDA provides a capability framework cataloguing the major abilities a statistical organization has for using, producing, sharing and managing data and metadata. CSDA integrates with the GSBPM and GAMSO by enabling processes and activities related to the lifecycle management of GSIM information objects.

FlavioRizzolo commented 2 years ago

The conclusion is now short; we may want to add a few more details (or not).

Conclusion

This paper introduced COOS, an ontology that serves as an integration model for the core set of ModernStats standards. During the development of COOS, it became evident that the natural misalignments and inconsistencies found among models developed independently are often diminished, and occasionally eliminated altogether, by the presence of a solid integration framework. Moving forward, we will investigate the feasibility and benefits of integrating other aspects of the standards already included, e.g. objects of the GSIM Concept Group, and look into new standards and architectures, e.g. the Common Statistical Production Architecture (CSPA), the ESS Enterprise Architecture Reference Framework, the European Interoperability Framework (EIF) and Reference Architecture (EIRA), and the Single Integrated Metadata Structure (SIMS), among others.

InKyungChoi commented 2 years ago

Regarding "task" described under sub-section 3.4 More detailed activities: how about adding a sentence or a footnote about the ongoing GSBPM "task" task team? Something like: The HLG-MOS Supporting Standards Group created an expert team to draw up the list of these “task” level activities; however, at the time of the development of COOS, this work was not yet completed and hence could not be included in COOS.

JALinnerud commented 2 years ago

The Motivation currently has "As more statistical offices". Ought we to wait for resolution of https://github.com/linked-statistics/COOS/issues/65 ? Or could we just use the same term as GSBPM for now, since it is the most adopted/familiar model so far? In that case we would use 'statistical organisation', not office/organization/institute/institution. 'Statistical organisation' occurs 17 times in GSBPM 5.1 and 40 times in GeoGSBPM, and it is an essential part of GAMSO (Generic Activity Model for Statistical Organisations).

JALinnerud commented 2 years ago

Every time we use/refer to GSIM we are going to have a challenge with "information objects". For example, see the Background above. Could we, in an overlapping/transition period designed to keep traceability with previous versions, communications and implementations of GSIM, use something like "classes (information objects)" or "information objects (classes)"?

FranckCo commented 2 years ago

Decided during March 21 meeting: add a paragraph in conclusion mentioning all activities in HLG-MOS linked to COOS.

FlavioRizzolo commented 2 years ago

Updated Intro and Conclusion in the HTML doc (except for the inclusion of all activities in HLG).

FranckCo commented 2 years ago

Propose to close

FranckCo commented 2 years ago

4th July meeting agreed to close.