hsuominen closed 9 years ago
So you mean you make a histogram of the entire map and then omit the tails? How do you translate the histogram back to a heat map?
We have a discrete colorscale that maps easily onto some 'smart bins' that we calculate. However, you made a good point: finding a smart bin size from the data (or from an overbinned histogram) is non-trivial. One option would be to use a Gaussian mixture model (GMM) to extract an effective bin width (effectively a fit of a sum of Gaussians), see http://www.astroml.org/book_figures/chapter4/fig_GMM_1D.html or http://scikit-learn.org/stable/modules/density.html. Buuut this is a bit too involved for what I was thinking (I tried the GMM; it works, but definitely not inline and in JavaScript...)
I don't get what you are suggesting actually :)
— Giulio Ungaretti
OK, the simplest method that captures my idea: compute 5 bins of constant area (variable width) and use their boundaries as the colorscale boundaries. For fun reading see http://pubs.research.avayalabs.com/pdfs/ALR-2007-003-paper.pdf
I will post a demo later.
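Until the demo is up, here is a rough sketch of the equal-area idea in plain JavaScript. It is only an illustration under assumptions: the helper name `equalAreaBoundaries`, the sample scores, and the choice of linear interpolation between sorted values are made up for this example and are not taken from the project code.

```javascript
// Hypothetical helper: returns the k-1 inner boundaries that split `values`
// into k bins holding roughly the same number of points (equal area).
function equalAreaBoundaries(values, k) {
  // Drop non-numeric entries and sort ascending (on a copy).
  const sorted = values.filter(Number.isFinite).sort((a, b) => a - b);
  if (sorted.length === 0 || k < 2) return [];
  const boundaries = [];
  for (let i = 1; i < k; i++) {
    // Fractional position of the i/k quantile in the sorted array.
    const pos = (sorted.length - 1) * (i / k);
    const lo = Math.floor(pos);
    const hi = Math.min(lo + 1, sorted.length - 1);
    // Linear interpolation between the two neighbouring values.
    boundaries.push(sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo));
  }
  return boundaries;
}

// Made-up index scores: mostly small values plus one extreme outlier.
const scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100];
const boundaries = equalAreaBoundaries(scores, 5);
console.log(boundaries);
```

For these sample scores the boundaries come out at approximately 2.8, 4.6, 6.4 and 8.2, so the single outlier at 100 no longer stretches the colorscale.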
But that's pretty much a quantile, isn't it?
— Giulio Ungaretti
Closed with the nice d3 quantile function, which implements equal-area bins.
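The thread does not show the final code. In d3 this would presumably be the quantile scale (`d3.scale.quantile()` in the 3.x API of that time, `d3.scaleQuantile()` in later versions). As a dependency-free sketch of what such a scale does once the bin boundaries are known, the snippet below maps a value to one of five colors. The function name, the boundary values, and the palette are assumptions made up for illustration.

```javascript
// Hypothetical stand-in for a quantile scale: `thresholds` are the inner bin
// boundaries in ascending order, `colors` has one entry more than `thresholds`.
function makeThresholdColorScale(thresholds, colors) {
  if (colors.length !== thresholds.length + 1) {
    throw new Error("need exactly one more color than thresholds");
  }
  return function (value) {
    // Count how many thresholds are <= value; that count is the bin index.
    let bin = 0;
    while (bin < thresholds.length && value >= thresholds[bin]) bin++;
    return colors[bin];
  };
}

// Example boundaries and a made-up 5-step palette (light to dark).
const thresholds = [2.8, 4.6, 6.4, 8.2];
const palette = ["#ffffcc", "#a1dab4", "#41b6c4", "#2c7fb8", "#253494"];
const colorFor = makeThresholdColorScale(thresholds, palette);

console.log(colorFor(1));   // lowest bin
console.log(colorFor(5));   // middle bin
console.log(colorFor(100)); // highest bin, however extreme the value
```

The colors stay fixed while only the thresholds need to be recomputed whenever the slider configuration changes.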
To make the data stand out more and look better I propose:
We make a quick histogram of the index score for a given slider configuration (each bin covers some range of index scores). Then we select the 5 bins that best represent the data and use them for the color map (the colors can stay the same, but the ranges are changed dynamically).
Whatcha think?
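To make the proposal concrete, here is a minimal sketch of the "quick histogram" step in plain JavaScript. The function name `fixedWidthHistogram` and the sample scores are hypothetical and only serve as an illustration; the project's real index scores would come from the current slider configuration.

```javascript
// Hypothetical helper: counts how many values fall into each of `nBins`
// equally wide bins spanning the range [min, max] of the data.
function fixedWidthHistogram(values, nBins) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  // Fall back to width 1 when all values are identical (range of zero).
  const width = (max - min) / nBins || 1;
  const counts = new Array(nBins).fill(0);
  for (const v of values) {
    // Clamp so that the maximum value lands in the last bin.
    const idx = Math.min(Math.floor((v - min) / width), nBins - 1);
    counts[idx]++;
  }
  return counts;
}

// Made-up index scores: mostly small values plus one extreme outlier.
const sampleScores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100];
console.log(fixedWidthHistogram(sampleScores, 5));
```

With these sample scores the counts are `[9, 0, 0, 0, 1]`: nine of the ten points share one bin (and hence one color), which is the kind of washed-out map that choosing the bin ranges from the data is meant to avoid.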