kmcurry / property-buddy

Discover useful data about a location
https://www.spotfax.us
GNU General Public License v3.0
3 stars 2 forks

Discard outliers in data, ex. response time of Null, 0, or very large number #32

Open kmcurry opened 7 years ago

kmcurry commented 7 years ago

At National Day of Civic Hacking 2017, a City of Virginia Beach data scientist told me that average response time is a bad indicator due to situational factors. For example, in an emergency situation an officer might have to choose between logging the arrival time and responding.

jamjohns commented 6 years ago

I'm assuming he meant the trimmed mean, or some filter based on standard deviation. Can you describe the formula used vs. what was proposed?
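
For reference, a trimmed mean could look roughly like the sketch below. This is only an illustration of the general technique, not what the project does and not necessarily what was proposed; the function name `trimmed_mean`, the 10% cut, and the sample response times are all made up.

```python
from typing import Sequence


def trimmed_mean(values: Sequence[float], proportion: float = 0.1) -> float:
    """Mean after dropping `proportion` of the sorted values from each end.

    Hypothetical helper for illustration; the 10% default is an assumption.
    """
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # how many values to cut from each tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Made-up response times in seconds: one 0 and one day-long entry.
sample = [0, 240, 250, 260, 270, 280, 290, 300, 310, 86400]

plain = sum(sample) / len(sample)    # plain average, pulled up by the 86400
trimmed = trimmed_mean(sample, 0.1)  # drops the 0 and the 86400 first
print(plain, trimmed)
```

With this sample the plain average is dominated by the single very long entry, while the trimmed version stays close to the typical values.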

kmcurry commented 6 years ago

I'm just summing the values and dividing by the number of values. Natasha Singh-Miller suggested a different method, but I don't know which one. We can ask her. We should filter outliers for sure.

kmcurry commented 6 years ago

I've just realized I've been using the wrong terms here. The current method is the sum of values divided by the number of values (the average, or mean). We need a better method that discards outliers such as extremely long response times and possibly instantaneous (0) response times.
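
One possible shape for that, as a rough sketch only: drop missing and zero values first, then discard anything above an upper fence based on the interquartile range, and report the median of what is left. The IQR fence and the 1.5 multiplier are a common convention that I am assuming here, not something decided in this thread, and `usable_response_times`, `drop_zero`, and the sample data are hypothetical.

```python
import statistics
from typing import Iterable, List, Optional


def usable_response_times(
    raw: Iterable[Optional[float]],
    drop_zero: bool = True,
    fence: float = 1.5,
) -> List[float]:
    """Filter out missing, negative, (optionally) zero, and very large values.

    Hypothetical helper; the IQR-based upper fence is an assumed rule.
    """
    present = [t for t in raw if t is not None and t >= 0]
    if drop_zero:
        present = [t for t in present if t > 0]
    if len(present) < 4:
        return present  # too few points to estimate quartiles sensibly
    q1, _, q3 = statistics.quantiles(present, n=4)
    upper = q3 + fence * (q3 - q1)  # values above this count as outliers
    return [t for t in present if t <= upper]


# Made-up response times in seconds, including None, 0, and a huge value.
raw_times = [None, 0, 240, 250, 260, 270, 280, 290, 300, 310, 86400]

cleaned = usable_response_times(raw_times)
typical = statistics.median(cleaned)
print(cleaned, typical)
```

Reporting the median instead of the mean is a further assumption; either could be computed from the filtered list.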

jamjohns commented 6 years ago

Yeah, that's what I thought you meant. I wasn't sure if you had a specific method in mind, or just something better than the average.