socialappslab / denguechat

Using mobile phones and gaming tactics to engage citizens in reporting public health risks
https://www.denguechat.org/

Finalize graphic representation of data #416

Closed dman7 closed 9 years ago

dman7 commented 9 years ago

The current time series unfortunately obscures how many locations were positive or potential before being eliminated. This commonly happens when a location is eliminated on the same day it is identified. From my correspondence with Marco:

"I'm starting to understand what Harold wants, and I think we're converging on the same solution. To make sure we're on the same page, I see the following problem:

My initial thought is that what we need here is to track the time of the day that a location was identified positive, potential, and time of day it was eliminated. Doing so will help us because:

This changes our time series of locations slightly: suppose a location was identified POSITIVE on day T. There are three scenarios:

  1. Location was not eliminated. In this case, the location remains POSITIVE for [T, T+7], including day T+7.
  2. Location was eliminated the same day (day T). In this case, the location remains POSITIVE for day T and becomes ELIMINATED starting day T+1.
  3. Location was eliminated on day T+7. In this case, the location remains POSITIVE for [T, T+7). Note that the location is labeled ELIMINATED starting day T+7. This is open to debate: should it remain POSITIVE on day T+7, since the visit took place on day T+7?"

The solution is to introduce a time element to location_statuses that

We should also deprecate the status column as the "status" will now be calculated by comparing identification_type, identified_at, and cleaned_at.
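
A minimal sketch of how the status could be derived for a given day under the three scenarios above. This is only an illustration in Python, not the application's actual code: the function name `status_on` is hypothetical, and `identified_on` / `cleaned_on` stand in for the date parts of `identified_at` and `cleaned_at`. The thread does not say what happens after day T+7 when a location is never eliminated, so the sketch simply keeps the identification type.

```python
from datetime import date, timedelta
from typing import Optional


def status_on(day: date,
              identification_type: str,
              identified_on: date,
              cleaned_on: Optional[date] = None) -> Optional[str]:
    """Return the status of one location on `day` (hypothetical helper).

    identification_type: "POSITIVE" or "POTENTIAL".
    identified_on / cleaned_on: assumed to be the date parts of
    identified_at / cleaned_at.
    """
    if day < identified_on:
        return None  # not identified yet, nothing to plot
    if cleaned_on is None:
        # Scenario 1: never eliminated, keeps its identification type.
        return identification_type
    if cleaned_on == identified_on:
        # Scenario 2: eliminated the same day. Still counted as
        # identified on day T, then ELIMINATED from day T+1.
        return identification_type if day == identified_on else "ELIMINATED"
    # Scenario 3: eliminated on a later day, ELIMINATED starting that day.
    return identification_type if day < cleaned_on else "ELIMINATED"


T = date(2015, 1, 1)
week_later = T + timedelta(days=7)
print(status_on(week_later, "POSITIVE", T))              # scenario 1
print(status_on(T, "POSITIVE", T, cleaned_on=T))         # scenario 2, day T
print(status_on(week_later, "POSITIVE", T, week_later))  # scenario 3, day T+7
```

If scenario 3 should instead stay POSITIVE on the day of the elimination visit, the last comparison would become `day <= cleaned_on`, which would also remove the need for the same-day special case.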

dman7 commented 9 years ago

What's left to do:

dman7 commented 9 years ago

More work needs to be done per this Saturday's meeting. See issue #425 for talking points.

dman7 commented 9 years ago

Based on #425, here is the proposed algorithm update:

The last bullet point raises an interesting question: do the brigade members record a follow-up visit in the "identification date" column or only in the "elimination date" column? @brujonildo , can you shed light on this issue? If an identification visit occurs on "2015-01-27", and 3 sites are identified, then there will be 3 row entries in the CSV report (that's agreed upon). Now suppose a follow-up visit takes place on "2015-01-28". Is there a new row entry with "2015-01-28" in "inspection date", or does that visit just update "elimination date" for the existing 3 rows?

dman7 commented 9 years ago

Consider the following chart of Francisco Meza on 2015-02-02:

[Screenshot: 2015-02-02 at 12:00:48 PM]

The 2% on 2015-01-11 is the percentage of houses that were identified as positive on the inspection date, relative to the total number of houses in the neighborhood. This is a misleading metric. It only tells us how that day's work compares to the total, which is not very useful. Instead, we should add a toggle with these metrics:

For instance, consider the following example: Suppose we have 100 houses. On 2015-01-11, you find 5 positive houses. On 2015-01-12, you find 5 more positive houses (different from the first 5). Naturally, we expect 2015-01-11 to show 5% positive and 2015-01-12 to show 10% positive. The above graph and the underlying algorithm instead return 5% on both 2015-01-11 and 2015-01-12, which is confusing.
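
The difference between the two readings in this example can be sketched as follows. This is an illustrative Python snippet under stated assumptions, not the chart's actual algorithm: the function name `daily_and_cumulative` and the house IDs are made up, and the cumulative figure ignores eliminations for simplicity.

```python
from datetime import date


def daily_and_cumulative(total_houses, positives_by_day):
    """Return (day, daily %, cumulative %) rows (hypothetical helper).

    positives_by_day maps an inspection date to the IDs of the houses
    found positive on that date. Eliminations are ignored here.
    """
    seen = set()  # every house found positive so far
    rows = []
    for day in sorted(positives_by_day):
        found_today = set(positives_by_day[day])
        seen |= found_today
        daily = 100 * len(found_today) / total_houses
        cumulative = 100 * len(seen) / total_houses
        rows.append((day, daily, cumulative))
    return rows


# The example from this comment: 100 houses, 5 positive on 2015-01-11,
# 5 different houses positive on 2015-01-12.
positives = {
    date(2015, 1, 11): [1, 2, 3, 4, 5],
    date(2015, 1, 12): [6, 7, 8, 9, 10],
}
for day, daily, cumulative in daily_and_cumulative(100, positives):
    print(day, f"daily={daily}%", f"cumulative={cumulative}%")
# 2015-01-11 daily=5.0% cumulative=5.0%
# 2015-01-12 daily=5.0% cumulative=10.0%
```

The daily column matches what the current graph shows (5% and 5%), while the cumulative column matches the expected 5% and 10%.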

And we should add the following labels:

Finally, I also want to test the different filters when we don't have cookies set:

dman7 commented 9 years ago

There is one final thing to check:

dman7 commented 9 years ago

After receiving Harold's data, I think it's best to remove most of the filters and keep only "Positive", "Potential" and "Daily metric". Why? Harold's graphs contain only those, with no hint of initial-versus-follow-up visits. The graphs also do not contain a cumulative metric.

dman7 commented 9 years ago

We've reached a satisfactory checkpoint with the charts. Here is the 1-month view:

[Screenshot: 2015-02-13 at 5:01:03 PM]

And here is the 6-month view:

[Screenshot: 2015-02-13 at 5:01:14 PM]