ukoethe / vigra

a generic C++ library for image analysis
http://ukoethe.github.io/vigra/

Implement a histogram class #38

Open ukoethe opened 13 years ago

ukoethe commented 13 years ago

Besides the usual functionality, it should also support overlapping bins (i.e. sampling prefilters that are better than the simple box function of a naive histogram class).
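To make the idea of overlapping bins concrete, here is a minimal sketch, assuming a triangular (linear interpolation) prefilter as one possible choice of kernel. The class name `SoftHistogram` and its methods are hypothetical and not part of VIGRA. Each sample is split between the two nearest bin centers instead of being dropped into a single box-shaped bin.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not VIGRA API: a 1D histogram with overlapping bins.
// Each sample is spread over the two nearest bin centers with a triangular
// kernel, rather than counted in exactly one bin (box kernel).
class SoftHistogram
{
  public:
    SoftHistogram(double minValue, double maxValue, std::size_t binCount)
    : min_(minValue),
      binWidth_((maxValue - minValue) / binCount),
      bins_(binCount, 0.0)
    {
        assert(binCount > 0 && maxValue > minValue);
    }

    void add(double value, double weight = 1.0)
    {
        // Position in units of bins, measured from the center of bin 0.
        double pos = (value - min_) / binWidth_ - 0.5;
        double lower = std::floor(pos);
        double frac = pos - lower;  // share that goes to the upper neighbor
        deposit(static_cast<long>(lower), weight * (1.0 - frac));
        deposit(static_cast<long>(lower) + 1, weight * frac);
    }

    // Read-only array-like access to the bin contents.
    double operator[](std::size_t i) const { return bins_[i]; }
    std::size_t size() const { return bins_.size(); }

  private:
    // Clamp to the first/last bin so the total weight is preserved at the borders.
    void deposit(long index, double weight)
    {
        long last = static_cast<long>(bins_.size()) - 1;
        if (index < 0) index = 0;
        if (index > last) index = last;
        bins_[static_cast<std::size_t>(index)] += weight;
    }

    double min_;
    double binWidth_;
    std::vector<double> bins_;
};
```

With four bins on [0, 4], a sample at 1.5 (a bin center) lands entirely in bin 1, while a sample at 2.0 (a bin edge) contributes 0.5 each to bins 1 and 2. Other kernels (e.g. B-splines) would follow the same pattern with a wider support.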

hmeine commented 13 years ago

Oh, yeah, great!

Let's make clearer what one conceptually expects from this:

Those are the minimum methods AFAICS. Now it becomes more interesting; maybe we need different (sub-)classes for the following:

Other stuff supported by the histogram class in MeVisLab, which is however ill-designed:

IMO, these do not belong in the histogram class itself, but maybe a histogram should allow random r/w access to its array data, which would make it easy to implement them as helper functions.

hmeine commented 13 years ago

As I wrote in issue #45, I just discovered that the boost::accumulators library also contains a histogram functor. (boost::accumulators offers nice, well-designed, numerically stable functors for deriving various statistics in a flexible yet efficient manner.) Anyhow, it is definitely worth a look, even if it does not seem to solve our API problem.

(I wonder if there is another histogram class in boost or similar libraries.)

The histogram feature is hidden behind the term "density": http://www.boost.org/doc/libs/1_47_0/doc/html/boost/accumulators/tag/density.html

There has been some discussion in 2008 about a simpler way to specify min/max/binCount, but I am not sure if that was ever committed (I don't see it in the docs): http://lists.boost.org/Archives/boost/2008/01/132789.php + follow-ups

ukoethe commented 13 years ago

The boost density class is quite nice. However, there are two drawbacks:

hmeine commented 12 years ago

I just found a new requirement: It makes sense to offer functionality for increasing / decreasing the range of existing histograms (while retaining bin contents). I assume that we only want this without resampling, although I can picture other people even wanting the latter. Actually, if we assume that our histogram allows array-like access for writing as well as reading, resampling could easily be done with external functions. The same goes for smoothing and the like, which is why I think an is-an-array approach would be promising.
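As a rough illustration of range extension without resampling, here is a sketch written as an external function over array-like bin storage. The struct `SimpleHistogram` and the function `extendRange` are hypothetical names for this example only. The assumption is that the range may only grow by whole bins, so the existing bin grid and bin contents stay untouched.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal histogram with equally sized bins and
// directly accessible bin storage.
struct SimpleHistogram
{
    double min;                // lower edge of bin 0
    double binWidth;           // width of every bin
    std::vector<double> bins;  // bin contents

    double max() const { return min + binWidth * bins.size(); }
};

// Grow the range so that it covers [newMin, newMax] by adding whole empty
// bins on either side. Existing bins keep their edges and contents, so no
// resampling takes place; the resulting range may be slightly larger than
// requested because it is rounded up to full bins.
void extendRange(SimpleHistogram & h, double newMin, double newMax)
{
    assert(h.binWidth > 0.0 && !h.bins.empty());
    std::size_t below = 0, above = 0;
    if (newMin < h.min)
        below = static_cast<std::size_t>(std::ceil((h.min - newMin) / h.binWidth));
    if (newMax > h.max())
        above = static_cast<std::size_t>(std::ceil((newMax - h.max()) / h.binWidth));
    h.bins.insert(h.bins.begin(), below, 0.0);
    h.bins.insert(h.bins.end(), above, 0.0);
    h.min -= below * h.binWidth;
}
```

Shrinking the range could be handled the same way by erasing whole bins, at the cost of either discarding their contents or folding them into the new border bins; which of the two is wanted is an open design question.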

Update: @ukoethe’s comment on "Estimating min and max from the first n data elements …" refers to boost::accumulators’ implementation of the histogram, density_impl, which is only suitable for IID (independent, identically distributed) samples: "The positions and sizes of the bins are determined using a specifiable number of cached samples…"

ukoethe commented 11 years ago

Support for a major histogram use case has been implemented in the Feature Accumulator framework (histograms and quantiles over the intensities in labeled regions). Histograms over local windows, as in a channel representation, remain open (cf. issue #39).