Closed FlavioRizzolo closed 1 year ago
I propose the creation of this channel specification pattern, which can be used by Exchange Channel sub-classes if necessary. The diagram shows the pattern applied to Questionnaire. It can easily be applied to Product in the same way (in fact, the pattern is inspired by Product)
Example for Product
We should probably rename Presentation to ProductPresentation and OutputSpecification to ProductSpecification
To address one of the issues raised last year, that a presentation shouldn't require an InformationSet, I think that creating a super-class InformationStructure would help. The new InformationSet would parallel what we have for ReferentialMetadataSet and DataSet so that we can talk about the structure of either with one class.
Proposed changes are outlined in red.
Extended pattern:
Proposed changes:
We could think of this pattern as related to the classical Model-View-Controller (MVC) pattern. (see MVC reference):
In our case, the InformationSet/Structure would be the model, the ChannelPresentation the view, and the Channel itself the controller. This is not a formal mapping, just another way of looking at these classes so that they make more sense.
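The informal MVC reading above can be sketched in a few lines of Python. This is only an illustration of the analogy, not a GSIM definition: the `records` attribute and the `render` and `exchange` methods are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class InformationSet:
    """Plays the 'model' role: the content being exchanged."""
    records: list[dict]


class ChannelPresentation:
    """Plays the 'view' role: decides how the content is shown."""

    def render(self, info: InformationSet) -> str:
        # One line per record, "key=value" pairs separated by commas.
        return "\n".join(
            ", ".join(f"{key}={value}" for key, value in record.items())
            for record in info.records
        )


class Channel:
    """Plays the 'controller' role: connects content and presentation."""

    def __init__(self, info: InformationSet, presentation: ChannelPresentation):
        self.info = info
        self.presentation = presentation

    def exchange(self) -> str:
        # The channel holds no content logic; it delegates to the view.
        return self.presentation.render(self.info)


channel = Channel(InformationSet([{"unit": "A", "value": 1}]), ChannelPresentation())
print(channel.exchange())
```

The point of the sketch is that the presentation can be swapped without touching the information set, which is the separation the pattern is after.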
According to the definition of ExchangeChannel, "The Exchange Channel is used for external and internal purposes", which means the collection and dissemination are just examples. However, that is not clear at all from other definitions and explanatory texts, or even the chosen extensions, e.g. Questionnaire, AdministrativeRegister, etc. We need to review the definitions and examples to make sure they clearly show the internal use case. We could add examples of data repositories for governance and harmonization, e.g. Data Hubs, Data Marts, which can be considered ExchangeChannels (much in the same way as AdministrativeRegisters).
An example of using the classes mentioned above in a process model: GSIM + process 20220309.pdf
Data Structure is required and should not be deleted, because it has components that are specific to Data Structure and not relevant to Information Structure. This would probably also be the case for Referential Metadata Structure:
For Data Resource and Referential Metadata Resource, there is no harm in leaving them as is. But if they are to be removed, the explanatory text should state that the Information Resource can be specialized into Data Resource and Referential Metadata Resource.
The proposed changes are shown below as a high-level domain model:
The model complies with most existing definitions: protocol, protocol specification, product/product container, and producer to consumer. We need to add a new destination type, to store or present. The main difference is that the Exchange Channel is defined as a transport entity (even in the hub-spoke context). It does not produce or consume, but is used by a producer or consumer to transport/exchange the Product (information set).
Of course. The proposal was to delete only the associations that are now inherited from the super-classes, not the classes themselves. At the class level there is no deletion, only the proposed addition of InformationStructure.
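A minimal sketch of that idea, assuming (since the diagrams are not reproduced here) that the association in question is the one from Presentation to the structure. The attribute names `components`, `attributes` and `structure` are hypothetical; only the class names come from the discussion.

```python
from dataclasses import dataclass, field


@dataclass
class InformationStructure:
    """Proposed super-class: the structure of any kind of information set."""
    name: str


@dataclass
class DataStructure(InformationStructure):
    # Components that only make sense for data stay on the sub-class.
    components: list[str] = field(default_factory=list)


@dataclass
class ReferentialMetadataStructure(InformationStructure):
    # Placeholder for anything specific to referential metadata.
    attributes: list[str] = field(default_factory=list)


@dataclass
class Presentation:
    # Declared once against the super-class, so the sub-classes do not
    # need an association of their own.
    structure: InformationStructure


layout = Presentation(DataStructure("Population by region", ["region", "count"]))
notes = Presentation(ReferentialMetadataStructure("Quality report"))
```

Both sub-classes still exist and keep their own specifics; only the duplicated association disappears.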
Exchange channel:
I'm trying to see how this can work with the examples we have in a way that is not too system-oriented.
For instance, let's take the registers. A register doesn't seem to be a product that is transmitted or exchanged; it's the actual means. What's transmitted is an information set extracted from the register. If the register is maintained in a relational DB and we connect to it via ODBC to run SQL queries, isn't ODBC the mechanism, hence the protocol? What's the channel then between the information and the consumer if not the register itself?
I've put together a tentative proposal integrating most of Khrishnan's and Andreas' ideas above, as best as I understand them.
The main change is the view of channel in a more traditional way as a transport/interface mechanism, which includes a Protocol, e.g. web service, FTP, face-to-face interview. A new class, tentatively named ExchangeHub, captures the container for the content exchange, e.g. Questionnaire, Register, Product, etc. Presentations for all channels/hubs are captured by a new ExchangePresentation class, where questionnaire modes, pdfs, webpages, etc. are represented.
Cardinalities need to be discussed in more detail once the overall picture is more or less in place.
Regarding the proposal above:
- How about just "Presentation" as the name of the class currently named "Exchange Presentation"? I think "Exchange Presentation" sounds rather puzzling. If we are going to rename "Presentation" as "ProductPresentation", we can use the name "Presentation" for the superclass.
I agree.
- I wonder, with the new class "Exchange Specification" ("specifies all the component that might be necessary for an exchange to work"), if "Provision Agreement" is still needed?
Protocol and Provision Agreement are part of the specification, I think.
- I think the name "Exchange Hub" is quite confusing, is it a synonym of "Manager, Container, Organizer"?
It's a bad name, I just couldn't find a better one.
Manager, container and organizer are other options that came to my mind. It's where the content to be exchanged is maintained and organized, the capture or sharing tool. Essentially, it's the former ExchangeChannel minus the transport piece (send/receive). For instance, an electronic questionnaire would be the new ExchangeHub, the web page would be the new ExchangeChannel, HTTP+HTML would be the Protocol, the ProvisionAgreement would be the usual, and the ExchangeSpecification would be the design that puts all that together.
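The electronic questionnaire example can be written out as a small object graph. This is a sketch under stated assumptions: the class names follow the tentative proposal, while the attribute names and the sample `terms` text are hypothetical, and whether the ProvisionAgreement is part of the specification or merely informs it is left open in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    name: str


@dataclass(frozen=True)
class ProvisionAgreement:
    terms: str


@dataclass(frozen=True)
class ExchangeHub:
    """Where the content to be exchanged is maintained and organized."""
    name: str


@dataclass(frozen=True)
class ExchangeChannel:
    """The transport/interface mechanism, which includes a Protocol."""
    name: str
    protocol: Protocol


@dataclass(frozen=True)
class ExchangeSpecification:
    """The design that puts the other pieces together."""
    hub: ExchangeHub
    channel: ExchangeChannel
    agreement: ProvisionAgreement


spec = ExchangeSpecification(
    hub=ExchangeHub("Electronic questionnaire"),
    channel=ExchangeChannel("Web page", Protocol("HTTP+HTML")),
    agreement=ProvisionAgreement("Hypothetical terms agreed with respondents"),
)
```

Here the hub holds the content and the channel only carries it, which is the former ExchangeChannel split into its two roles.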
Provision Agreement informs the specification, it's not quite part of it. I think it is still needed as its own entity to store the negotiated/agreed basis for exchange: retention, sharing agreement, etc.
For ExchangeHub, what about ExchangeInformationContainer? It's long, I know. They are all containers for Information being exchanged, whereas Manager works more for Registers but not for Product or Questionnaire, in my opinion.
Also, DataHarvest now becomes a real ExchangeChannel. DataHarvest: A concrete and usable tool to pass information between two sources, usually by a machine to machine mechanism. It is not an "ExchangeHub", a container which acts as a source or target to hold information.
Yes, we definitely need the class, same as Protocol. I just meant that the specification as a design document might have the agreement as a part, but it might just be a supporting document informing it.
Perhaps InformationContainer?
I like the idea of DataHarvest being an ExchangeChannel with the new definition.
Ok, here is my last model:
We really don't need the InformationHub, but I think it's a nice way of representing where Registers and Data Hubs in general fit in the information exchange story.
I think this covers everything we discussed. Granted, the notion of Product is still different from what Andrea is using in her process model, but that can be addressed by just creating a wrapper class in the implementation model for InformationSet, InformationStructure and Presentation. Some impedance mismatch between GSIM and implementation models is expected; we just need to minimize it and ensure a mapping, e.g. via a wrapper class, is straightforward. In the end, we do need to define Product as an InformationExchange, given the nature of Product, which includes dynamic content and online query tools.
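The wrapper idea could look roughly like this in an implementation model. It is a sketch only: `ImplementationProduct`, its attribute names and `gsim_objects` are hypothetical, and the three wrapped classes are reduced to stubs.

```python
from dataclasses import dataclass


@dataclass
class InformationStructure:
    fields: list[str]


@dataclass
class InformationSet:
    rows: list[dict]


@dataclass
class Presentation:
    format: str


@dataclass
class ImplementationProduct:
    """Implementation-side 'product': bundles three GSIM objects."""
    information_set: InformationSet
    structure: InformationStructure
    presentation: Presentation

    def gsim_objects(self) -> tuple:
        # The mapping back to GSIM is just unpacking the wrapper.
        return (self.information_set, self.structure, self.presentation)


product = ImplementationProduct(
    InformationSet([{"region": "North", "count": 10}]),
    InformationStructure(["region", "count"]),
    Presentation("PDF"),
)
```

Because the wrapper only holds references, the mapping between the implementation model and GSIM stays straightforward in both directions.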
Barring some minor changes, e.g. renaming, cardinalities, etc. this model should be it.
This looks OK to me. But I am still not sure the Administrative aspect of Information Exchange is being captured.
Yes, that connection is weaker now. There is a composition between InformationExchange and InformationHub though, which is as strong as you can get between two classes.
In the end, I think that trying to capture registers (a type of repository) as a channel/exchange entity is what got us into this mess. The idea has merit, but we need to stretch these notions too far to make them fit, so keeping information repositories (hubs) connected but separated seems more precise and clearer for most people.
We didn't discuss micro-data dissemination hubs, like PUMF repositories, which are similar to registers from the exchange point of view, and many others, like dissemination databases, e.g. CANSIM, that function as a backend that can be accessed from a multitude of products. This model covers that case too.
Yes, I was actually thinking that we should have another subtype of InformationExchange for the administrative nature/type of exchange in addition to having the registers as InformationHub(s). We will have Questionnaire, DataHarvest, Administrative??? and Product.
If I understand correctly, you are proposing to change the composition between InformationExchange and InformationHub to an isA relationship.
The problem with that, I think, is that it would still make hubs a type of InformationExchange, which is what created the original problem. We'll be putting together transport and content management, won't we?
Not at all... in addition to AdministrativeRegister for managing the content, there is an "Administrative Data Collection Tool" for bringing the information inside the organization.
Hi, here is another proposal:
1. Separate (Statistical and Administrative) Register and link it to Statistical Support: as was mentioned during our discussions on Statistical Program vs. Statistical Support (Program), maintenance of registers is a part of Statistical Support. I think linking Register with Statistical Support can indicate the level of management needed for Register. Note also that Register is linked with Information Set, which is also linked to Product (see 3 below). I guess there are a lot of things that could be added around Register but for now, I left it simple...
2. Separate Product: I think for similar reasons as Register, Product needs to be separated.
3. Remove link between Product and Presentation, and add link with Information Set: as discussed before, Presentation becomes independent from content. Also, to make it re-usable and independent from any specific product, the description of the association between “Output Specification” and “Presentation” changes from “defines” to “uses”.
4. Change the name “Exchange Channel” to “Exchange Tool” (or Instrument or Mechanism?): now that we are left with only “Questionnaire” and “Data Harvest” in “Exchange Channel”, we might need to update its definition and names accordingly. The current definitions of Questionnaire (“A concrete and usable tool to elicit information from observation Units”) and Data Harvest (“A concrete and usable tool to pass information between two sources, usually by a machine to machine mechanism”) indicate that they are more of a concrete tool than an abstract notion of exchanging information. The Linking GSBPM-GSIM task team also raised an issue that there is no GSIM class that could represent a concrete application that we build in GSBPM Phase 3 (Build) and use in GSBPM Phase 4 (Collect) (Issue #4). Maybe we should push for "concrete tool" rather than "abstract notion".
5. Include “Exchange Specification” and remove “Protocol”: we need a specification for Exchange Tool, which can be output from GSBPM Phase 2 (Design) and be used to build Exchange Tool in Phase 3 (original purpose of this github issue :D). I removed “Protocol” because it now seems to overlap with “Exchange Specification” and “Exchange Tool” itself (but not sure..)
6. (Question) Something for a dissemination tool?? Questionnaire and Data Harvest are collection tools, and we do not have any concrete dissemination tool. Perhaps mini web-sites..? But this sounds too concrete..
I agree with renaming “Exchange Channel” and pushing for a "concrete tool", but I think we are still missing a third option for administrative data: not a register, but a type of information brought into the organization from an external organization, apart from Data Harvest channels, which include web scraper, API, scanner, sensor, satellite, etc.
Updated version:
Suggestion for definitions of the added/changed classes:
Exchange Instrument
-> Now I am thinking whether it is easier if we merge Protocol and Exchange Instrument (if the latter is really for "concrete/usable tool, not an abstract notion")
Data Harvest
Questionnaire
Exchange Specification
Information Structure
Dissemination Component / Instrument ?
Updated version of the figure:
The text on the association from Output Specification to Presentation was "defines" in GSIM v1.2. Now it is "uses". Is that incorrect?
@JALinnerud it was an intentional change, I thought if we are going to make Presentation independent from Product and be able to exist on its own without Product, Output Specification no longer "defines" Presentation, but rather "uses" (existing) Presentations
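The difference between "defines" and "uses" can be illustrated with a short sketch in which a Presentation exists on its own and is referenced by more than one Output Specification. The attribute names and the sample release names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Presentation:
    """Exists independently of any Product or Output Specification."""
    name: str


@dataclass
class OutputSpecification:
    name: str
    # "uses": references to Presentations that already exist elsewhere,
    # rather than Presentations created and owned by this specification.
    uses: list[Presentation] = field(default_factory=list)


pdf_report = Presentation("PDF report")
quarterly = OutputSpecification("Quarterly release", [pdf_report])
annual = OutputSpecification("Annual release", [pdf_report])
```

Both specifications point at the same Presentation object, so removing either specification leaves the Presentation intact.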
The explanatory text for Data Harvest in GSIM v1.2 was "Examples of Data Harvest channels include web scraper, API, scanner, sensor, satellite, etc." I think these were useful examples and hope we can keep them.
@JALinnerud I will add "Examples of Data Harvest channels include web scraper, API, scanner, sensor" to the explanatory text! I would exclude "satellite" though, as a satellite is more than a "data harvest" tool; it is essentially the sensors on the satellite that capture the signal data.
Here is the updated version
I think Register is kind of lonely up there. Even though they are not channels, they still are means of sharing information and need to be linked to Provision Agreement and some sort of "interface" specification, similarly to Product. Perhaps Exchange specification doesn't need to be linked only to Exchange Channel... it seems to me we are missing some generalization here.
Other than that, I think it works.
Updated version based on meeting notes #30 (relationship between Register and Statistical Support removed)
Ready to be modelled in EA
There is no GSIM information object to describe the specification of an Exchange Channel in general (cf. Questionnaire Specification and Output Specification for Questionnaire and Product, which are sub-types of Exchange Channel).