Closed dman7 closed 9 years ago
What's left to do:
identification_type, identified_at and cleaned_at of existing LocationStatus, location_statuses to visits
More work needs to be done per this Saturday's meeting. See issue #425 for talking points.
Based on #425, here is the proposed algorithm update:
- Add visit_type to visits table. This allows us to separate an "inspection" visit from a "follow-up" visit; the two types of visit serve different purposes.
- status column, identified_at column and cleaned_at column. These should be replaced by visited_at column, identified_at and cleaned_at. We're no longer using these columns to identify the status of a visit.
- csv_reports/index.html page with new statistics conventions, larvae and protected columns.
- Visit model,
The last bullet point raises an interesting question: do the brigade members record a follow-up visit in the "identification date" column or only in the "elimination date" column? @brujonildo, can you shed light on this? If an identification visit occurs on "2015-01-27" and 3 sites are identified, then there will be 3 row entries in the CSV report (that's agreed upon). Now suppose a follow-up visit takes place on "2015-01-28". Is there a new row entry with "2015-01-28" in "inspection date", or does that visit just update "elimination date" for the existing 3 rows?
Consider the following chart of Francisco Meza on 2015-02-02:
The 2% on 2015-01-11 is the number of houses that were identified as positive on the inspection date relative to the total number of houses in the neighborhood. This is a misleading metric: it only tells us how much that single day's work contributed relative to the total, not how positive the neighborhood is. Instead, we should add a toggle with these metrics:
For instance, suppose we have 100 houses. On 2015-01-11, you find 5 positive houses. On 2015-01-12, you find 5 more positive houses (different from the first 5). Naturally, we expect 2015-01-11 to be 5% positive and 2015-01-12 to be 10% positive. The above graph and the underlying algorithm instead return 5% on 2015-01-11 and 5% on 2015-01-12, which is confusing.
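The 100-house example above can be sketched in plain Ruby to make the difference between the two readings concrete. This is only an illustration: the method names `daily_percentages` and `cumulative_percentages` are hypothetical and are not the project's `calculate_cumulative` implementation, and the cumulative version assumes no house is eliminated between the two dates.

```ruby
require "date"

# Figures from the example: 100 houses, 5 new positive houses on each of two days.
TOTAL_HOUSES = 100
new_positives = {
  Date.new(2015, 1, 11) => 5,
  Date.new(2015, 1, 12) => 5
}

# Daily metric: each day's newly identified positives relative to all houses.
# (Hypothetical helper, for illustration only.)
def daily_percentages(counts, total)
  counts.sort.map { |date, count| [date, 100.0 * count / total] }.to_h
end

# Cumulative metric: running total of positives up to and including each date.
# Assumes no eliminations happen in between. (Hypothetical helper.)
def cumulative_percentages(counts, total)
  running = 0
  counts.sort.map do |date, count|
    running += count
    [date, 100.0 * running / total]
  end.to_h
end

daily = daily_percentages(new_positives, TOTAL_HOUSES)
cumulative = cumulative_percentages(new_positives, TOTAL_HOUSES)

puts daily.values.inspect       # [5.0, 5.0]
puts cumulative.values.inspect  # [5.0, 10.0]
```

The daily series reproduces what the chart currently shows, while the cumulative series matches the 5% then 10% that the example expects.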
And we should add the following labels:
Finally, I also want to test the different filters when we don't have cookies set:
There is one final thing to make sure of:
calculate_cumulative method).
After receiving Harold's data, I think it's best to simplify the filters by removing most of them and keeping only "Positive", "Potential" and "Daily metric". Why? Harold's graphs contain only those, with no distinction between initial and follow-up visits. The graphs also do not contain a cumulative metric.
We've reached a satisfying checkpoint with the charts. Here is the 1-month view:
And here is the 6-month view:
The current time series unfortunately obscures how many locations were positive or potential before being eliminated. This happens whenever a location is eliminated on the same day it is identified. In my correspondence with Marco:
"I'm starting to understand what Harold wants, and I think we're converging on the same solution. To make sure we're on the same page, I see the following problem:
My initial thought is that we need to track the time of day at which a location was identified as positive or potential, and the time of day at which it was eliminated. Doing so will help us because:
This changes our time series of locations slightly: suppose a location was identified POSITIVE on day T. There are three scenarios:
The solution is to introduce a time element to location_statuses through the following columns:
- identification_type
- identificated_at
- eliminated_at
We should also deprecate the status column, as the "status" will now be calculated by comparing identification_type, identified_at, and cleaned_at.
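As a minimal sketch of how a status could be derived from those three values instead of being stored, consider the plain-Ruby example below. Everything in it is an assumption made for illustration: `VisitRecord` is a hypothetical stand-in for a row (not the app's model), `status_at` is a hypothetical helper, and the symbols `:positive`, `:clean` and `:unknown` are placeholder status values.

```ruby
# Hypothetical stand-in for a row; the field names follow the columns discussed above.
VisitRecord = Struct.new(:identification_type, :identified_at, :cleaned_at)

# Derive the status of a location at a given moment by comparing timestamps.
# The status symbols returned here are assumptions, not the project's values.
def status_at(record, moment)
  # Nothing is known before the location was identified.
  return :unknown if record.identified_at.nil? || moment < record.identified_at
  # Once the cleaning time has passed, the location counts as clean.
  return :clean if record.cleaned_at && moment >= record.cleaned_at
  # Otherwise it still carries the type it was identified with.
  record.identification_type
end

# A location identified as positive at 09:00 and eliminated at 16:00 the same day.
record = VisitRecord.new(
  :positive,
  Time.utc(2015, 1, 27, 9, 0),
  Time.utc(2015, 1, 27, 16, 0)
)

puts status_at(record, Time.utc(2015, 1, 27, 12, 0)).inspect # :positive
puts status_at(record, Time.utc(2015, 1, 27, 18, 0)).inspect # :clean
```

Because the comparison uses full timestamps rather than dates, a location that is identified and eliminated on the same day is still visible as positive for the hours in between.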