Closed craigappl closed 4 years ago
@craigappl - one strange property of the csv export -
"WHERE zambia_irs_structures.business_status NOT IN ('Not Visited');" - this doesn't really line up with any of the other reporting logic the web views use. It seems more in line with the intended use to just do a full structure dump, whatever the business status, so that all the metrics could be compared and re-computed if desired. Right now, for example, it's not possible to know if the "found" structure count is accurate because of this, but generally I just think it adds complexity to filter this way?
@gstuder-ona , The idea is to get an exact list of events in this export that match what's presented in the table counts in the Web UI dashboards. I don't know the underlying logic, nor have access to the queries. Ultimately, if the dashboard says 23,415 structures were sprayed for that province, the CSV export should have 23,415 rows of events (unless events are missing)
@craigappl oh, the export is structures, but each structure has a "latest" event attached, basically, which is how all the stats get configured. It's not so much a match/not match issue as the current export is a seemingly arbitrary subset. Doesn't seem necessary, as the subset isn't all that interesting afaik.
But this is more a side question, I actually don't think the count discrepancy comes from this.
So finally got some view updates - the core issue I believe is that views are updating on an arbitrary schedule, which can lead to (temporary) inconsistency in counts between the export and the web view. Eventually the two will converge, but while data is incoming there can be mismatches - the issue isn't an inconsistent calculation. Note that, for example, the current Luapula sprayed count is 23830 at 2PM EST 2019-11-14.
It is important to do the aggregation exactly the same way though - these two queries give the same results:
SELECT jurisdiction_name,
plan_id,
totstruct,
sprayedstruct
FROM zambia_irs_jurisdictions
WHERE (plan_id = '9f1e0cfa-5313-49ff-af2c-f7dbf4fbdb9d')
AND jurisdiction_name = 'Luapula';
-- from the export
SELECT COUNT(*)
FROM zambia_irs_export
LEFT JOIN jurisdictions_materialized_view AS jurisdictions_ex ON jurisdictions_ex.jurisdiction_id = zambia_irs_export.structure_jurisdiction_id
WHERE (plan_id = '9f1e0cfa-5313-49ff-af2c-f7dbf4fbdb9d')
AND jurisdiction_parent2 = 'Luapula'
AND business_status IN ('Partially Sprayed',
'Complete')
AND jurisdictions_ex.jurisdiction_depth >= 4;
@craigappl the best fix here I believe should be to better-order and time our view refreshes - it's unavoidable to have transient inconsistency in the way that we're doing this computation. I'll mark the ticket for review when I change the NiFi flows.
whoops didn't mean to close.
The fix for this is in-place in NiFi (cascading view updates vs round-robin), but it is currently blocked by onaio/canopy#591
Fix is in place, under review by Akros
@gstuder-ona @craigappl can we close this now?
This is superseded by onaio/canopy#611.
Akros has requested a data export so they can perform the final reports for VectorLink. The team is comparing the CSV export against the Zambia dashboards and the numbers are not matching. Here's what we are experiencing (in order of importance)
CSV Export as of 3:23PM PST 13/11/19 (Total Sprayed = 23372) https://drive.google.com/open?id=1OrSkfqHFboBTFWWwY8rxxZCs6off-qDe
Links