Another topic I discussed with Matt Watson (@sporty81) over e-mail was generating histograms, which are essentially just another way to visualize the CDF approximated by the t-digest. I've hacked together two simple SQL functions in pull request #5 which calculate equi-width and equi-height histograms (there are probably more types, but I think those are the most common).
I think the functions are mostly fine, but I wonder how accurate the histograms can be. They are probably good enough on the tails, but t-digests are intentionally constructed with lower accuracy in the middle of the distribution, so the histograms inherit the same issue.
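
To make the idea concrete, here is a rough Python sketch of how both histogram types can be derived from an approximate CDF. This is not the SQL from the pull request: the function names, the centroid list and the min/max values are all made up for illustration, and the CDF is a simple piecewise-linear interpolation through the centroid means, which is cruder than what a real t-digest implementation does.

```python
from bisect import bisect_right


def cdf_points(centroids, vmin, vmax):
    """Build (xs, ps) for a piecewise-linear CDF.

    centroids: list of (mean, count) sorted by mean (hypothetical toy input).
    Half of each centroid's weight is assumed to lie on either side of its mean.
    """
    total = sum(count for _, count in centroids)
    xs, ps, seen = [vmin], [0.0], 0.0
    for mean, count in centroids:
        xs.append(mean)
        ps.append((seen + count / 2.0) / total)
        seen += count
    xs.append(vmax)
    ps.append(1.0)
    return xs, ps, total


def interp(x, xs, ys):
    """Linear interpolation of ys over sorted xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # xs[i - 1] <= x < xs[i]
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def equi_width(xs, ps, total, nbuckets):
    """Buckets of equal width; the count is the CDF difference times total."""
    lo, hi = xs[0], xs[-1]
    width = (hi - lo) / nbuckets
    buckets = []
    for i in range(nbuckets):
        a = lo + i * width
        b = hi if i == nbuckets - 1 else lo + (i + 1) * width
        buckets.append((a, b, total * (interp(b, xs, ps) - interp(a, xs, ps))))
    return buckets


def equi_height(xs, ps, total, nbuckets):
    """Buckets of equal count; the boundaries are the i/n quantiles."""
    bounds = [interp(i / nbuckets, ps, xs) for i in range(nbuckets + 1)]
    return [(bounds[i], bounds[i + 1], total / nbuckets) for i in range(nbuckets)]


# Toy digest: made-up centroids, with min and max tracked separately.
centroids = [(1.0, 2), (2.0, 5), (5.0, 10), (8.0, 5), (9.5, 2)]
xs, ps, total = cdf_points(centroids, vmin=0.0, vmax=10.0)

width_hist = equi_width(xs, ps, total, 5)
height_hist = equi_height(xs, ps, total, 4)

for lo, hi, count in width_hist:
    print(f"[{lo:5.2f}, {hi:5.2f})  ~{count:.2f}")
```

Both histograms are just two ways of reading the same curve: equi-width evaluates the CDF at fixed values, equi-height evaluates its inverse at fixed fractions. Any error in the approximated CDF therefore carries over directly into the bucket counts or bucket boundaries.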