Trigger updates to downstream domains for related data

mseaton commented 1 year ago

As Initializer has evolved, various domains have emerged that have overlapping responsibilities. For example, the metadatasharing, ocl and concepts domains are all commonly used to load Concepts into a system, and some implementations choose to make use of more than one of these at a time. Similarly, the location and locationtagmaps domains both take responsibility for assigning location tags to locations.

In these scenarios, unexpected results can arise when an upstream domain has new changes but a downstream domain does not. As an example, a distribution may bring in most of its Concepts via ocl or metadatasharing, but also allow for some concepts to be managed via the concepts domain (e.g. diagnosis lists, drug lists, etc). In the typical way this is set up, the ocl or metadatasharing domain is expected to be installed first, and then the concepts domain is expected to be installed second, and this knowledge may be leveraged in order for the concepts domain to make some country-specific tweaks to a shared, global concept package (e.g. some custom mappings, or custom name translations).

However, in this scenario, if an update is later made to the ocl or metadatasharing domain but not in the concepts domain, then the upstream concept definitions will be applied to the server, and the downstream customizations will not re-run, leaving metadata in an unexpected and likely problematic state.

It would be helpful if Initializer could provide a solution to this problem. This could potentially be something in the runtime configuration, where an implementation could define a list of domains that should reload if a given domain changes. For example:

initializer.domain.metadatasharing.dependencies=concepts,conceptsets
initializer.domain.ocl.dependencies=concepts,conceptsets

The result is that any time the metadatasharing or ocl domains change, Initializer will also force an update (e.g. by deleting existing checksums) of the domains specified in the list, even if there are no changes to the files in those domains.
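
To make the proposal concrete, here is a minimal sketch of how such properties could be resolved. It is not Initializer code: the function names are hypothetical, the property keys are the ones proposed above, and the sketch only computes which unchanged domains would need their checksums cleared.

```python
# Hypothetical sketch of the proposed behaviour; not Initializer's implementation.
PREFIX = "initializer.domain."
SUFFIX = ".dependencies"


def parse_dependencies(properties_text):
    """Map each upstream domain to the domains that should reload after it."""
    deps = {}
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key.startswith(PREFIX) and key.endswith(SUFFIX):
            domain = key[len(PREFIX):-len(SUFFIX)]
            deps[domain] = [d.strip() for d in value.split(",") if d.strip()]
    return deps


def domains_to_force(changed, deps):
    """Return the unchanged domains whose checksums would need to be deleted."""
    forced = set()
    for domain in changed:
        forced.update(deps.get(domain, []))
    # Domains that already changed will reload anyway, so exclude them.
    return forced - set(changed)


config = """
initializer.domain.metadatasharing.dependencies=concepts,conceptsets
initializer.domain.ocl.dependencies=concepts,conceptsets
"""
deps = parse_dependencies(config)
print(sorted(domains_to_force({"ocl"}, deps)))  # ['concepts', 'conceptsets']
```

The lookup is deliberately one level deep. Whether a forced reload should in turn trigger its own listed domains is left open here.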

@mogoodrich / @mks-d / @ibacher / @Ruhanga interested in feedback.

mks-d commented 1 year ago

Thanks @mseaton for outlining your use cases so clearly.

So you mean to resolve the issue of implicit domain dependencies by making them explicit through a declaration in a config file?

How configurable does this really need to be? Isn't it that those implicit dependencies can be identified once and for all through some business analysis and should be taken care of anyway? As in, for example, that Iniz will always reload MDS packages when changes are made to the concepts or the concept sets?

mseaton commented 1 year ago

Thanks @mks-d. It's a good point. Certainly if we can all agree on a set of behaviors then it doesn't need to be configurable. I do think the use cases should be pretty well fixed to those known by iniz upfront. MDS is potentially one that is less so, because any type of metadata can theoretically be packaged in an MDS file, not just concepts. But if we can agree that MDS is, for all practical purposes, just there for legacy support of Concepts (and linked data like Concept Sets, Answers, Sources, Mappings, Concept Classes and Datatypes), then I'd be perfectly happy just covering the use cases laid out: reloading the Concept domains if the MDS domain has any changes, doing the same for the OCL domain, and reloading the locationtagmaps domain any time the location domain changes.
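
As a rough illustration of the non-configurable variant, the three rules above could live in a fixed table that is consulted when building the load plan. This is a sketch under assumptions: the table, the function name and the exact domain spellings are hypothetical, and "Concept domains" is reduced to a single `concepts` entry for brevity.

```python
# Hypothetical fixed rules: upstream domain -> downstream domains to reload.
# Domain names are illustrative spellings, not confirmed Initializer identifiers.
FIXED_RELOADS = {
    "metadatasharing": ("concepts",),
    "ocl": ("concepts",),
    "locations": ("locationtagmaps",),
}


def load_plan(load_order, changed):
    """Return the domains to load, in load order, including forced reloads."""
    to_load = set(changed)
    for upstream in changed:
        to_load.update(FIXED_RELOADS.get(upstream, ()))
    # Filtering the existing order keeps upstream domains ahead of downstream ones.
    return [d for d in load_order if d in to_load]


order = ["locations", "locationtagmaps", "metadatasharing", "ocl", "concepts"]
print(load_plan(order, {"ocl"}))        # ['ocl', 'concepts']
print(load_plan(order, {"locations"}))  # ['locations', 'locationtagmaps']
```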

mks-d commented 1 year ago

If that makes development for you easier, then that sounds like a reasonable approach. Also it will require less documentation ;-)

ibacher commented 1 year ago

> Certainly if we can all agree on a set of behaviors then it doesn't need to be configurable.

I think there are fewer headaches if it's not configurable and the dependencies between domains are generally known at the time the domain is added, i.e., when the OCL domain was added, we set it up to run before the concept domain so the concept domain could tweak the OCL data if required.

Making the resets configurable adds some headaches like: initializer.domain.htmlformentry.dependencies=concepts, which I point out just as a strong case (IMHO) for why configuration of this is undesirable.

The one thing it might be worth thinking through is how this feature interacts with implementations' ability to, in essence, customise the order in which dependencies are run.

mseaton commented 1 year ago

An alternative and/or complement to this feature would be if Initializer were able to expose an API that could be called before loading, to determine which, if any, domains have changes in them. This would enable distributions that invoke Iniz programmatically to check which files (in which domains) will be loaded due to change of checksums, take steps to then clear out checksums as needed in other domains based on this, and then invoke iniz.
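
A minimal sketch of what such a check could look like, assuming one directory per domain and a stored map of per-file checksums from the previous load. The function names, the `stored` structure and the use of SHA-256 are all assumptions for illustration, not Initializer's actual API or checksum format.

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path):
    """Hash a file's bytes (the hash algorithm is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_domains(config_dir, stored):
    """Return the names of domains with at least one new or modified file.

    `stored` maps 'domain/relative/path' to the checksum from the last load.
    """
    root = Path(config_dir)
    changed = set()
    for domain_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for f in sorted(p for p in domain_dir.rglob("*") if p.is_file()):
            key = f.relative_to(root).as_posix()
            if stored.get(key) != file_checksum(f):
                changed.add(domain_dir.name)
                break  # one changed file is enough to flag the domain
    return changed


# Demo: the ocl package changed since the last load, the concepts file did not.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "ocl").mkdir()
    (root / "concepts").mkdir()
    (root / "ocl" / "package.zip").write_bytes(b"v2")
    (root / "concepts" / "diagnoses.csv").write_text("uuid,name\n")
    stored = {
        "ocl/package.zip": hashlib.sha256(b"v1").hexdigest(),
        "concepts/diagnoses.csv": file_checksum(root / "concepts" / "diagnoses.csv"),
    }
    result = changed_domains(root, stored)

print(result)  # {'ocl'}
```

A caller could then drop the stored checksums for any downstream domains before invoking iniz. Note that this sketch does not detect deleted files, only new or modified ones.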

mekomsolutions / openmrs-module-initializer

Trigger updates to downstream domains for related data #233