linked-statistics / COOS

Core ontology for official statistics
Creative Commons Attribution 4.0 International

Feedback from Jay Greenfield on the COOS main document #69

Open nightcleaner opened 3 years ago

nightcleaner commented 3 years ago

Feedback from Jay Greenfield on the COOS main document.docx

FranckCo commented 3 years ago

For simplicity, I copy Jay's feedback below:

The level of specificity across COOS is very low. This is definitely OK for starters.

However, is there a next level of specificity that is planned for UNECE COOS?

Take, for example, a transposed information object. Following PROV, its name suggests that the transposed information object prov:wasDerivedFrom another InformationObject.

This derivation is the product of an activity that uses another InformationObject.

Is that kind of activity a csda-data-transformation?

Alternatively, can there be a transposed information object that exists that is not the product of a csda-data-transformation? In other words, does the rectangular information object necessarily have primacy?

One way to give teeth to prov:wasDerivedFrom is to associate a COOS data transformation activity with the matrix algebra. Here the matrix algebra might be a specialization of csda-data-transformation activity.
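To make the question concrete, here is a minimal sketch of a transposition recorded as a transformation activity with PROV-style statements. The `InformationObject` class, the `transpose` function, and the plain-tuple triple representation are illustrative assumptions, not COOS or CSDA definitions; only the `prov:` property names and the `csda-data-transformation` label come from PROV and the discussion above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InformationObject:
    # Hypothetical stand-in for a COOS information object holding rectangular data.
    name: str
    rows: tuple  # tuple of equal-length tuples

def transpose(source):
    """Hypothetical matrix-algebra activity.

    Returns the derived object together with PROV-style triples
    describing how it was produced.
    """
    activity = f"transpose({source.name})"
    derived = InformationObject(
        name=f"{source.name}_T",
        rows=tuple(zip(*source.rows)),  # swap rows and columns
    )
    triples = [
        # Assumption: the matrix operation is typed as a CSDA data transformation.
        (activity, "rdf:type", "csda-data-transformation"),
        (activity, "prov:used", source.name),
        (derived.name, "prov:wasGeneratedBy", activity),
        (derived.name, "prov:wasDerivedFrom", source.name),
    ]
    return derived, triples

rect = InformationObject("rect", ((1, 2, 3), (4, 5, 6)))
transposed, provenance = transpose(rect)
```

In this sketch the derivation statement is never asserted on its own: it is always accompanied by the activity that used the source object, which is one possible reading of "giving teeth" to prov:wasDerivedFrom.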

On the surface of COOS, everything seems quiet. Beneath the surface there may be riddles yet to be encountered. Do you have a list of riddles (i.e., use cases) whose resolutions will give COOS more teeth (more specificity)?

By way of motivating the structural dataset types (Figure 7), you argue that it isn’t really possible to specify a dataset domain (because of transformations). At the dataset level are we now off the hook when it comes to semantic annotation?

Is this true in the case of event histories? More generally, can individual products of the matrix algebra have a tractable semantic annotation? Would this be the field of symbolic dynamics?

Maybe rock bottom is a row in an event history where a row necessarily has a “what” together with (optionally) a “who”, “when”, “where” and/or a “how”. Rows might transform under a “shift operator” and the result would be a sequence of symbols. From this perspective a datum has a what qualified by a who, how, when and/or where components and corresponds to a discrete interval (a single symbol with qualifications) in an infinite sequence of qualified symbols.
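As a rough illustration of this idea, the sketch below models a datum as a required "what" with optional qualifiers, and a shift operator that drops the first element of a history. The class and function names are hypothetical, and a finite tuple stands in for the infinite sequence mentioned above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

@dataclass(frozen=True)
class Datum:
    # One row of an event history: a required "what" plus optional qualifiers.
    what: str
    who: Optional[str] = None
    when: Optional[str] = None
    where: Optional[str] = None
    how: Optional[str] = None

def shift(history: Sequence[Datum]) -> Tuple[Datum, ...]:
    """Shift operator on a finite history: drop the first element.

    Assumption: a finite tuple approximates the infinite sequence.
    """
    return tuple(history[1:])

def symbols(history: Sequence[Datum]) -> list:
    """Project each qualified datum onto its bare symbol (the "what")."""
    return [d.what for d in history]

history = (
    Datum(what="employed", who="p1", when="2020"),
    Datum(what="unemployed", who="p1", when="2021"),
    Datum(what="employed", who="p1", when="2022"),
)
shifted = shift(history)
```

Here `symbols(shifted)` gives the symbol sequence after one shift, while each `Datum` keeps its qualifiers, so the meaning travels with the row rather than with the storage layout.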

From this perspective, rectangular datasets and SQL databases are just storage mechanisms, indifferent to meaning, yes?

In general, you may want a few more examples of how products are used and produced by activities and tasks. Could one of these examples present a StatisticalProgram? It would consist of a series of activities and/or tasks that produce information objects subsequently used by other activities/tasks. If so, could we use a GAMSO individual to allocate an IT platform that supports the StatisticalProgram? If so, could the example include a more or less specific platform description of a Container as a Service (CaaS), using an ontology like the Container Description Ontology? For the sake of FAIR, I think we ought to demonstrate that COOS can represent a workflow supported by a platform we can describe, more or less.
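A minimal sketch of what such an example might look like: a program expressed as activities that use and produce information objects, with the execution order derived from those dependencies. All activity and object names are hypothetical, and the platform description (GAMSO, CaaS) is left out of this sketch.

```python
from graphlib import TopologicalSorter

# Hypothetical StatisticalProgram:
# activity -> (information objects used, information objects produced)
PROGRAM = {
    "collect": ((), ("raw_responses",)),
    "edit": (("raw_responses",), ("clean_microdata",)),
    "tabulate": (("clean_microdata",), ("aggregate_table",)),
}

def execution_order(program):
    """Order activities so each runs after the producers of its inputs."""
    # Map each information object to the activity that produces it.
    producer = {
        obj: activity
        for activity, (_, outputs) in program.items()
        for obj in outputs
    }
    # An activity depends on whichever activities produce the objects it uses.
    depends_on = {
        activity: {producer[obj] for obj in inputs if obj in producer}
        for activity, (inputs, _) in program.items()
    }
    return list(TopologicalSorter(depends_on).static_order())

order = execution_order(PROGRAM)
```

The same used/produced pairs could be emitted as prov:used and prov:wasGeneratedBy statements, so the workflow and its lineage would come from one description.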

FlavioRizzolo commented 2 years ago

I think the issue of specificity is critical to discuss. How deep do we want to go? In which areas of the model? Which use cases will drive this exercise? I think Jay indicates a couple of potential directions:

FlavioRizzolo commented 2 years ago

In terms of use cases to apply and refine COOS, a few months ago we mentioned record linkage, which I think would give us enough specificity for the data integration and provenance/lineage type of problems (and links to CSDA transformations).

Record linkage also comes with a GSBPM extension proposal, the Record Linkage Project Process Model, which raises the question of whether we want to map such a process model to COOS as well...

FranckCo commented 2 years ago

Following 9/7 meeting: set up dedicated meeting with people interested, tag for version 2.