nmalkin / kpi-dashboard

dashboard for visualizing key performance indicators for Mozilla Persona

Use mean instead of median for report #1 #29

Closed by nmalkin 12 years ago

nmalkin commented 12 years ago

Report #1 is the median number of sites a user logs into with Persona.

As part of migrating to CouchDB as the backend (#27), finding the median of the data series becomes a significantly harder technical challenge. (Doing it in a map/reduce framework requires a quickselect-style algorithm, and there doesn't seem to be a good way to implement one in CouchDB.)
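To illustrate why the median does not fit a reduce step, here is a minimal Python sketch (not CouchDB code). The two partitions and their values are made up for illustration. It shows that combining per-partition medians does not, in general, give the median of the whole series, so a reduce function cannot simply merge partial medians.

```python
from statistics import median

# Hypothetical data: sites logged into per user, split across two partitions
# the way a map/reduce framework might split the work.
chunk_a = [1, 1, 1, 1, 10]
chunk_b = [2, 3, 4]

# Median over the full series: sorted values are 1,1,1,1,2,3,4,10.
true_median = median(chunk_a + chunk_b)

# Naive "reduce": take the median of each partition, then the median of those.
median_of_medians = median([median(chunk_a), median(chunk_b)])

print(true_median, median_of_medians)  # the two values differ
```

Getting the exact median requires access to the full set of values (or a selection algorithm over them), which is what makes it awkward here.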

Alternatively, the median value for each day could be precalculated when data arrives and then stored in the database. However, this would require either a new database (cumbersome) or a change to the data format and code of the current one (very undesirable).

Calculating the mean of the dataset, however, is much easier.
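As a rough sketch of why the mean is easier, the Python below (again, not CouchDB code) keeps a running (sum, count) pair. Pairs from different partitions can be merged in any order, which is exactly what a reduce step needs. The function names and sample values are hypothetical. In CouchDB itself, the built-in `_stats` reduce returns a sum and a count, among other values, so something along these lines should be possible without custom code, though that is an assumption about how the view would be set up.

```python
from functools import reduce

def to_partial(sites_logged_into):
    """Map step: turn one user's value into a (sum, count) pair."""
    return (sites_logged_into, 1)

def combine(a, b):
    """Reduce step: merge two (sum, count) pairs. Order does not matter."""
    return (a[0] + b[0], a[1] + b[1])

def mean_from_partials(partials):
    """Final step: divide the total by the count (None for empty input)."""
    total, count = reduce(combine, partials, (0, 0))
    return total / count if count else None

# Hypothetical data split across two partitions.
chunk_a = [1, 1, 1, 1, 10]
chunk_b = [2, 3, 4]

# Each partition is reduced independently, then the results are merged.
partial_a = reduce(combine, map(to_partial, chunk_a), (0, 0))
partial_b = reduce(combine, map(to_partial, chunk_b), (0, 0))
overall_mean = mean_from_partials([partial_a, partial_b])
```

Merging the two partial results gives the same mean as computing it over the full series at once.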

While the median is a more sensible value to look at (it is less sensitive to outliers), we have previously agreed that this entire report is not hugely meaningful. The median value itself doesn't really say anything. The only way we'd use it is to watch the number and hope it trends up. In that case, however, the mean is just about as good: we can look at it and watch its trend.
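For anyone wondering how much the outlier sensitivity matters, here is a small Python illustration with made-up numbers: a single heavy user moves the mean a lot and leaves the median unchanged.

```python
from statistics import mean, median

# Hypothetical data: sites logged into per user on a typical day.
typical = [1, 1, 2, 2, 3]
# The same day with one unusually active user added.
with_outlier = typical + [50]

mean_before, median_before = mean(typical), median(typical)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)  # mean and median are close
print(mean_after, median_after)    # mean jumps, median stays put
```

Since we only care about the trend over time, this distortion is tolerable as long as the outliers are roughly stable from day to day.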

Therefore, with @jedp, we have resolved to use the mean, instead of the median, for this report.

If anyone cares, we can discuss it here, and revisit this when there's more time.