go-hep / hep

hep is the mono repository holding all of go-hep.org/x/hep packages and tools
https://go-hep.org
BSD 3-Clause "New" or "Revised" License

hplot: adding a generic binned graph type? #755

Open rmadar opened 4 years ago

rmadar commented 4 years ago

It is fairly common to compute bin-by-bin quantities from various histograms, and the result is not always a histogram, i.e. (weighted) event counts. The most typical example is a bin-by-bin ratio of histograms, but there are more (bin-by-bin composition, bin-by-bin significance, etc.).

The most natural option, I think, would be to write a new generic plotter, say BinnedGraph, to deal with binned information. This would mainly be a copy-paste of hplot.H1D, except that the Hist field would be replaced by another type (I think all the plotting aspects would stay similar):

type BinnedQuantity struct {
  binning hbook.Range      // I am not sure which is the best type.
  Values  plotter.Values   // y values
  Errors  plotter.YErrors  // y errors
}

Such a new type could have arbitrary Values and Errors and would allow storing the result of any operation between several histograms, via syntax like:

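// h1 and h2 are assumed to share the same binning.
bg := NewBinnedGraph(h1.Binning)
for i := 0; i < h1.Len(); i++ {
  v1, e1 := h1.Value(i), h1.Error(i)
  v2, e2 := h2.Value(i), h2.Error(i)

  bg.SetValue(i, val(v1, v2))         // e.g. v1/v2 for a ratio
  bg.SetError(i, err(v1, e1, v2, e2)) // matching error propagation
}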
rmadar commented 4 years ago

Putting back the discussion from https://github.com/go-hep/hep/pull/756 here

I must say I am a bit hard-pressed to see exactly the difference w/, say, a hplot.S2D. but it wouldn't be the first time something escaped me :)

I think the practical difference is not that large: the only real one is the ability to stack different BinnedGraph values, while this requires some "tricks" with hplot.S2D (you need to either have the same x's or interpolate between x's). This would be needed to produce the (in)famous composition plots (randomly chosen example ;-) ).

From a conceptual point of view, one has discrete values as x's while the other has intervals (and so basically shares everything with usual histograms, except for the relation between val = sum(w) and err = sqrt(sum(w^2))).

In the end, I am not sure which is best between:

I hope that's clearer!

rmadar commented 4 years ago

I am not 100% sure the introduction of the Count type closes this issue. Indeed, right now, bins with arbitrary contents and errors can only be plotted via BinnedErrorBand. For example, one cannot plot arbitrary bin contents without an error band, or with y-error bars.

Ideally, one would need to have the entire plot machinery working in the same way for either []hbook.Count or hbook.H1D.
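
One possible way to get there, sketched under loud assumptions: the plotting code could depend on a small interface rather than on a concrete type. In the sketch below, Binned, Count, Counts, Hist and StepPoints are all hypothetical stand-ins written for this example. They are not the real hbook.Count or hbook.H1D, and the real types may expose their data differently.

```go
package main

import (
	"fmt"
	"math"
)

// Binned is a hypothetical interface: anything exposing per-bin
// edges, value and error could be handed to the same plotting code.
type Binned interface {
	Len() int
	Bin(i int) (xmin, xmax, y, yerr float64)
}

// Count is a stand-in for a bin with arbitrary content and error.
type Count struct{ XMin, XMax, Val, Err float64 }

// Counts adapts a slice of Count to Binned.
type Counts []Count

func (cs Counts) Len() int { return len(cs) }
func (cs Counts) Bin(i int) (float64, float64, float64, float64) {
	c := cs[i]
	return c.XMin, c.XMax, c.Val, c.Err
}

// Hist is a stand-in for a 1D histogram storing sum(w) and sum(w^2).
type Hist struct {
	Edges []float64 // len(Edges) == len(SumW)+1
	SumW  []float64
	SumW2 []float64
}

func (h Hist) Len() int { return len(h.SumW) }
func (h Hist) Bin(i int) (float64, float64, float64, float64) {
	return h.Edges[i], h.Edges[i+1], h.SumW[i], math.Sqrt(h.SumW2[i])
}

// StepPoints returns the (x, y) vertices of the step outline,
// two per bin, for any Binned input.
func StepPoints(b Binned) [][2]float64 {
	pts := make([][2]float64, 0, 2*b.Len())
	for i := 0; i < b.Len(); i++ {
		xmin, xmax, y, _ := b.Bin(i)
		pts = append(pts, [2]float64{xmin, y}, [2]float64{xmax, y})
	}
	return pts
}

func main() {
	h := Hist{Edges: []float64{0, 1, 2}, SumW: []float64{3, 5}, SumW2: []float64{3, 5}}
	cs := Counts{{0, 1, 0.6, 0.1}, {1, 2, 0.4, 0.1}}
	fmt.Println(StepPoints(h))
	fmt.Println(StepPoints(cs))
}
```

With such an interface, error bars, bands and step outlines would each be written once against Binned, and both kinds of input would go through the same code path.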

Does that make sense?

sbinet commented 4 years ago

right. I was a bit overenthusiastic.