Improve accuracy of target document after interval turnover

kennsippell commented 1 week ago

Is your feature request related to a problem? Please describe. Target documents have a monthly interval. Because target documents are not always written (max of once per day, and only when viewing targets tab) they can be stale. When the interval ends, there is some "lazy best effort" to ensure that the previous interval's data is updated.

The digital payment system relies on the data in the previous month's target document. This month, there was a lot of variance between the "ideal" result (what would actually be calculated by targets.js for the month) vs the "actual" data written to the target document (the result of this interval turnover).

This feature suggests improvements to target documents so at least the previous month's target document is guaranteed to be the result of some deterministic predictable calculation.

Describe the solution you'd like Can we recalculate all emissions during the interval turnover process before writing the values? Can we run the interval turnover function even if the previous month's target document doesn't exist? Or could we update the target document every time rules emissions are updated instead of daily when targets are viewed?

Additional context Some scenarios

Does activity on 24th of the month. Then views a different contact on 1st of next month without going to tasks tab. Their activity is not included in the target document.
User does activity on 30th of the month. Views the tasks tab. This will work great! Except user changes devices on the 1st of the month and interval turnover happens on the new device (actual scenario from telemetry)

dianabarsan commented 1 week ago

(max of once per day, and only when viewing targets tab)

One mention here that reloading the app will also write the doc to disk.

Can we recalculate all emissions during the interval turnover process before writing the values?

Unfortunately, because a lot of targets emit "now", the emission date for the target when we recalculate will be on the next interval, and this will result in a 0 target count for the past interval.

Can we run the interval turnover function even if the previous month's target document doesn't exist?

I think this scenario can only exist when the user has not used the device during the past interval. The interval turnover calculation uses the untouched stale rules engine cache to overwrite past month's target doc. If this cache does not exist, then nothing can be written. Is an empty target document desirable here?

Or could we update the target document every time rules emissions are updated instead of daily when targets are viewed?

I believe this is the best and maybe only viable option.

kennsippell commented 2 days ago

One mention here that reloading the app will also write the doc to disk

So interval turnover happens when the app loads, but the routine writing of target documents doesn't trigger then. I think that's what you mean and we're on the same page?

Unfortunately, because a lot of targets emit "now"

Do you mean the target has date: 'now'? If we consider both potential values for date:

date: 'reported' - As long as targets emit with an appropriate date, then it should be counted and filtered correctly. Digital Payments relies on two targets with this setting and so this is mostly the scenario I have been caring about during this investigation. AFAIK emissions (iff present) are counted properly even if refreshed after the interval turnover.
date: 'now' - I think this setting is intended to "count all emissions regardless of their date." The risk I see here is that a report from the new interval could be counted in the previous interval. The interval turnover happens a lot ... but (for example), if the interval turns over and then a user creates a report via the reports tab, then they could erroneously add to a previous interval's target value. There are some other scenarios like this like documents created server-side.

I think this scenario can only exist when the user has not used the device during the past interval.

~In Nairobi eCHIS Kenya, there were 400 users without an August target document. Of those, 51 had household visits or referrals in the month of August. I'm trying to understand this gap. According to Superset, some of these users had over 100 home visits and over 51 referrals. I'm hoping we can find why no target doc was written for these users. It seems to be responsible for about 15% of our digital payment data quality problems.~ My mistake. I thought this was a core issue, but was actually a result of multiple user accounts for the same facility_id and contact_id. Data is now consistent with the claim that this seems to only happen when the user is actually just totally not engaged.

I believe this is the best and maybe only viable option.

If memory serves me, the reason we didn't do this initially was that it may cause some noisey data synchronization and potentially conflicts if users login to multiple devices. If we are happy to ignore or address those concerns now, I think pursuing this would be a great solution for this issue.

dianabarsan commented 2 days ago

So interval turnover happens when the app loads, but the routine writing of target documents doesn't trigger then. I think that's what you mean and we're on the same page?

It does. https://github.com/medic/cht-core/blob/a2fe0b43d4161d80c93d7fd8980d56afcaea01b6/shared-libs/rules-engine/src/provider-wireup.js#L295

Do you mean the target has date: 'now'

Yes. When we "ask" the rules engine to give us emissions, targets with emission date of now will emit for the current date. Then, we filter out all of the emissions that are not within our desired interval - this is how all targets calculations work: https://github.com/medic/cht-core/blob/a2fe0b43d4161d80c93d7fd8980d56afcaea01b6/shared-libs/rules-engine/src/provider-wireup.js#L144 . So when the interval turns over, we will essentially filter out ALL targets that emit for now, because they're not in the current interval.

may cause some noisy data synchronization and potentially conflicts if users login to multiple devices

This is correct. But to get higher accuracy, we need to commit these target docs more frequently. I believe another issue would be the actual recalculation impact. For tasks we have a kind of "incremental" approach where we only recalculate tasks for a given contact, but when we recalculate targets, I don't think we have a way to reduce the calculation context to avoid doing a full run every time.

I'll be looking into this later today.

dianabarsan commented 1 day ago

I don't think we have a way to reduce the calculation context to avoid doing a full run every time.

We do indeed only recalculate targets for the updated context, not a full run every time. And the target state gets updated whenever we refresh rules-engine state, this includes every time we recalculate tasks. However, we only aggregate it when the user purposefully loads targets and on interval turnover.

dianabarsan commented 1 day ago

date: 'now' - I think this setting is intended to "count all emissions regardless of their date."

Correct, but the emission itself gets a timestamp as a date, the now is computed to someDate which happens to be the date when the emission is actually computed: https://github.com/medic/cht-conf/blob/main/src/nools/target-emitter.js#L30

So, if we're calculating on Jul 5th, all now emissions will emit for July 5th. And these will be excluded by a date filter that is for Jun 1 -> Jun 30 (previous interval).

kennsippell commented 5 hours ago

Or could we update the target document every time rules emissions are updated instead of daily when targets are viewed? I believe this is the best and maybe only viable option.

Curious if your plan is to remove interval turnover as part of this change?

the emission itself gets a timestamp as a date

True, but also seems solvable. Could we add a flag to the emission like countAlways: true (or now: true etc) and only set the date for backward compatibility? Obv projects would need both a core and conf update to fix the bug, but it's maybe a clearer way to indicate the intent to "count all emissions regardless of their date".

If interval turnover is still required, this could maybe provide an alternate path to implementing this where-by we refresh emissions as a step during interval turnover.

medic / cht-core

Improve accuracy of target document after interval turnover #9431