terraframe / geoprism-registry

GeoPrism Registry is a system for curating interlinked data through time. It's the first framework implementing the Common Geo-Registry specification.
https://geoprismregistry.com/
GNU Lesser General Public License v3.0
18 stars 5 forks source link

Cannot synchronize closed Geo-Objects due to display label not available after closing date #886

Closed rfromthecastle closed 1 year ago

rfromthecastle commented 1 year ago

List the CGR information

Describe the bug (clear and concise) I am unable to synchronize closed health facilities as they lack a display label to map against the name attribute of the corresponding organization unit at the current date.

To Reproduce

  1. As a Registry Maintainer, run a synchronization for a Geo-Object Type containing closed health facilities.
  2. The following error is thrown: The DHIS2 attribute(s) [name] require a value, but one was not mapped. Make sure that the Geo-Object has a label in either your preferred locale, or your default locale at the date [Present].

Expected behavior (clear and concise) Closed Geo-Objects successfully synchronize with the corresponding organization units, with the closing date of the organization unit updating with the Geo-Object's existence end date.

Screenshots image

Desktop (please complete the following information):

Additional context (if any) None

rrowlands commented 1 year ago

Hi @rfromthecastle

I'm having a few issues while investigating this ticket.

  1. If a Geo-Object is synced with an exists period of (for example) 2020-04-04 to 2021-04-04 and does not exist on the current day of export, the data with the most recent value is chosen. This means that your Geo-Object probably does not have any value defined for the locale in question (regardless of the time dimension).
  2. I'm unable to find the error in question on the Laos server so I can't find additional information on my own. (such as: what is the preferred locale set to in the sync config? What do the display labels look like for the Geo-Object in question? What do the exist periods look like?)

That being said, one could make the argument here that the display label shouldn't be required for export when updating an existing object in DHIS2 (only when creating new objects since it is required to create a new organisation unit in DHIS2). I'm certainly open to this idea and curious what your thoughts here are.

rfromthecastle commented 1 year ago

Hi @rrowlands, your explanation at point 1 makes sense. For point 2, an example you could look at is Banhan Health Center (H0629), which closed in August 15, 2021. We have some historical data, such as updated coordinates, that we'd like to synchronize with the corresponding (closed) organization unit in DHIS2, as there is programmatic data from the past still attached to it. As for the settings in the 'Dev HMIS - Health Centers' synchronization configuration, when the error occurred I checked the 'Sync Non-Existent Geo-Objects' and set the preferred locale to English.

On your suggestion at the bottom of your comment, I think this could resolve the issue of missing display labels for closed organization units, however, what then happens if for some reason the display label is empty for an existing Geo-Object? I don't think we'd want to synchronize the display label then?

rrowlands commented 1 year ago

@rfromthecastle

So when you synced and got the error, I am assuming that you didn't currently have an English display label for that Geo-Object? I can see now that it currently does and as a result it should be able to sync even if it doesn't exist on the current date.

OK so I think for this ticket I can make the system not sync any display label information if one does not exist for the preferred locale and if the object already exists in DHIS2. The reason this works is because "name" is a required field in DHIS2, and even if the label is null in the CGR, we're not able to remove the name from DHIS2 anyway. I do wonder however if it actually makes sense in this scenario to present a warning to the user? The rationale being that if the object exists in DHIS2 it must have a name, however that name may not correspond to the preferred locale, however if the current name doesn't match the current locale then the data in DHIS2 has drifted a bit from the data in the CGR, hence the warning.

rfromthecastle commented 1 year ago

Hi @rrowlands, I just thought of this scenario: what if there is a health center that has already been synchronized, but that closed a week ago. We then update the existence end date of the Geo-Object and update all attribute validity periods accordingly (including for the display label). If I run the synchronization today, will GeoPrism Registry then be able to update the end date of the corresponding organization unit in DHIS2, even though a display label no longer exists for today in GeoPrism Registry?

SteeveEbener commented 1 year ago

@rfromthecastle : what you are suggesting in your last comments goes against one of the primary functionalities of the GPR which is the proper management of the time dimension.

It seems to me that the issue here is linked to DHIS2 not being able to manage the time dimension for org units.

If the hierarchy in DHIS2 is meant to represent the situation:

If the above is correct, we should not try to alter the correct approach implemented in the GPR to accommodate for a flaw in DHIS2 but highlight this flaw

rrowlands commented 1 year ago

Hi @rfromthecastle and @SteeveEbener, I apologize for the confusion, it appears we are still not on the same page as to the current synchronization behaviour of the CGR.

  1. When an attribute (such as display label) is exported to DHIS2, the value which gets used is the most recent value available, not the value for the day of export. We can have a discussion about whether or not this is the desired behaviour.
  2. We don't currently have any behaviour which writes to DHIS2's "closedDate" attribute. We can consider populating this based on the Geo-Object's exists attribute as part of this ticket.
  3. With regards to @SteeveEbener's comment, again there is an assumption here that the date of synchronization is "today". This is at odds with how the system currently works: the date of synchronization is "null" (localized in the UI as "present"), which the system treats as "most recent date at which data is available".

The decision to pull "most recent value" instead of "value on current day of synchronization" was made exactly for the original use-case described in this ticket, which is further supported by our "Sync Non-Existent Geo-Objects" option in the sync config.

SteeveEbener commented 1 year ago

@rrowlands Thanks for your comment and the clarification regarding the handling of the time components on the GPR side.

Few question:

Thanks

rrowlands commented 1 year ago

What is the "most recent date at which data is available" in the case described by Rica (health facility closed on August 15, 2021)? Is it August 15th (meaning a value is returned) or today's date (meaning "No Data")?

August 15th, 2021

Does DHIS2 capture the temporal validity of the information stored in the hierarchy?

I'm not certain on this question. We do know that it tracks the "closed date" of a Geo-Object. Beyond that I would think no.

I would like to propose a desired behaviour of the system, curious on your thoughts @SteeveEbener and @rfromthecastle, so please pardon a little psuedocode here:

if (geoObject.exists(today)) {
  dhis2Value = geoObject.getValue(today);
} else {
  dhis2Value = geoObject.getValue(geoObject.getMostRecentExistDate())
}

The geoObject.getMostRecentExistDate() function would return, in Rica's example, August 15th, 2021, and it represents the last date that the GeoObject existed.

The advantage of this new behaviour would be that it would allow people to sync explicit null values over to DHIS2.

SteeveEbener commented 1 year ago

Thanks @rrowlands .

I think we need to understand what is the temporal validity of the information stored in DHIS2 (in general and in Laos) before being able to decide if what you are proposing would be applicable. @rfromthecastle ?

rfromthecastle commented 1 year ago

Hi @SteeveEbener and @rrowlands, how the temporal validity of the organization units in DHIS2 works is that

  1. Each organization unit has an opening date (equivalent of the existence start date in GeoPrism Registry)
  2. Even though a closing date can be added to an organization unit (equivalent of the existence end date in GeoPrism Registry), that organization unit is still displayed in the hierarchy, so that it can be selected by users when making pivot tables or charts with historical programmatic information associated with that organization unit. In the case of the Lao HMIS, closed organization units are prepended with an X symbol
  3. All attributes on the organization unit are the latest available value - for example, if a still existent organization unit has its name changed, the name will need to be replaced with the new name

I like @rrowlands ' proposal, though I think for closed Geo-Objects it should only be the end date/closing date that is synchronized and nothing else - due to closed organization units still displaying in the DHIS2 hierarchy, it wouldn't make sense to have them with NULL or empty labels in DHIS2.

SteeveEbener commented 1 year ago

Thanks @rfromthecastle . Is what you are describing rules that apply to any DHIS2 instances or only the Lao one?

I am not sure to understand point 2 as my understanding is that only tracked entities are carrying historical programmatic information, not org units => does DHIS2 indeed allow to select org units that do not exist anymore for creating graphs and tables?

In any case, I agree with your last paragraph ( for closed Geo-Objects it should only be the end date/closing date that is synchronized and nothing else - due to closed organization units still displaying in the DHIS2 hierarchy, it wouldn't make sense to have them with NULL or empty labels in DHIS2)

rfromthecastle commented 1 year ago

Hi @SteeveEbener, these are applicable to all DHIS2 instances except for the note at the end of point 2.

DHIS2 indeed allows you to select closed organization units for creating pivot tables and charts. See this example, where the two selected health centers closed at the end of 2019:

image

SteeveEbener commented 1 year ago

Thanks @rfromthecastle. If the end date/closing date is indeed used by DHIS 2 in this context then we should go for the option you mentioned and which I highlighted in my last comment: " for closed Geo-Objects it should only be the end date/closing date that is synchronized and nothing else - due to closed organization units still displaying in the DHIS2 hierarchy, it wouldn't make sense to have them with NULL or empty labels in DHIS2"

rrowlands commented 1 year ago

Hi @SteeveEbener and @rfromthecastle. I would like to propose a plan for closing out this ticket. If the following pseudocode is implemented in the GPR for sync:

if (geoObject.exists(today)) {
  dhis2Value = geoObject.getValue(today);
} else {
  dhis2Value = geoObject.getValue(geoObject.getMostRecentExistDate())
}

I believe it will handle all the concerns that have been brought up in this ticket. As you can see in this pseudocode, if the Geo-Object does not exist on the date of export, the system uses the data for the last day that the Geo-Object did exist. Given that as @SteeveEbener mentioned that DHIS2 doesn't support "data over time", I think that this behavior is the most in line with the most common way OrgUnits are used in DHIS2: sync the values for whatever the most recent date the Geo-Object existed.

This logic will only result in null labels if the labels are actually null in the GPR (which could be considered a defect in the data).

Given that our system already does a more primitive version of this logic, I want to make clear to @rfromthecastle: I think that when you synced your Geo-Object did not have any display label (for the English locale), which is a very different situation than simply not having a display label on the day of export.

SteeveEbener commented 1 year ago

@rrowlands : Unless I am misunderstanding something, what you are proposing is not giving the expected result mentioned in my comment dated March 29.

rfromthecastle commented 1 year ago

Hi @rrowlands, does the pseudocode just cover syncing the opening/closing dates for Geo-Objects that no longer exist on the current date? In that case I'm good with it.

think that when you synced your Geo-Object did not have any display label (for the English locale), which is a very different situation than simply not having a display label on the day of export.

Not sure what you mean here? The original example (Banhan Health Center) did have a display label, but only from 1975-12-02 to 2021-08-15.

rrowlands commented 1 year ago

Not sure what you mean here? The original example (Banhan Health Center) did have a display label, but only from 1975-12-02 to 2021-08-15.

This is the root of my confusion on this issue. Our system should already handle this scenario just fine. It should (and does when I test it) use the display label for the '2021-08-15' value. The only way I can see you getting this error is if it didn't have any value for any time period (on the english locale).

Is this issue consistently still happening or did it only happen once?

rfromthecastle commented 1 year ago

I think I know what happened - Banhan Health Center was never in the original GPR health facility baseline import as it was already closed when we imported the list. We then added it in, in an effor to reconstruct historical changes of the HFML for Laos, and I tried to also sync this health center with DHIS2 even when no longer existent, so that in case it opens again in the future, it will be easy to push the change. What I'll try to do is mark it as existent for today's date, sync it, then close it and sync again and see what happens.

rfromthecastle commented 1 year ago

I've tried what I described in the previous comment, but, even though the error message no longer shows up, nothing from the closed Geo-Object synchronizes. The corresponding DHIS2 organization remains "open", i.e., the closing date remains empty. Therefore, I still think what @SteeveEbener suggested is needed:

Thanks @rfromthecastle. If the end date/closing date is indeed used by DHIS 2 in this context then we should go for the option you mentioned and which I highlighted in my last comment: " for closed Geo-Objects it should only be the end date/closing date that is synchronized and nothing else - due to closed organization units still displaying in the DHIS2 hierarchy, it wouldn't make sense to have them with NULL or empty labels in DHIS2"

rrowlands commented 1 year ago

Hi @rfromthecastle, I just checked on the laos server and the "Dev HMIS - Health Centers" sync configuration currently has the "Sync Non-Existent Geo-Objects" option disabled. Was this disabled when you synced? Can you try re-enabling it and running the sync process again and see if that fixes it?

rfromthecastle commented 1 year ago

Silly me! Forgot to check that box. Tried running it again with the box checked, but now running into #909 so I can't see if this resolves it.

rfromthecastle commented 1 year ago

Closing as determined as not being a bug