Closed: benkrikler closed this 4 years ago
Merging #26 into master will not change coverage. The diff coverage is n/a.
@@ Coverage Diff @@
## master #26 +/- ##
======================================
Coverage 0.00% 0.00%
======================================
Files 7 7
Lines 707 772 +65
======================================
- Misses 707 772 +65
The under/overflow bins can now be enabled or disabled from the config file. Adding:
no_over_underflow: False
to the config will draw the bins; setting it to True will hide them (which is the default).
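As a rough illustration of what this option controls, the sketch below keeps or drops bins whose edges are infinite. It is not the fast_plotter implementation: the function name `select_bins` and the (low, high, count) tuple layout are hypothetical, chosen only to make the behaviour concrete.

```python
import math


def select_bins(bins, no_over_underflow=True):
    """Return the bins that would be drawn.

    `bins` is a list of (low_edge, high_edge, count) tuples (hypothetical layout).
    With no_over_underflow=True, any bin with an infinite edge is dropped.
    """
    if not no_over_underflow:
        return list(bins)
    return [b for b in bins if math.isfinite(b[0]) and math.isfinite(b[1])]


bins = [(-math.inf, 0.0, 2), (0.0, 350.0, 40), (350.0, math.inf, 5)]
hidden = select_bins(bins)                           # default: under/overflow hidden
shown = select_bins(bins, no_over_underflow=False)   # under/overflow drawn
```

With the default setting only the finite bin survives; with `no_over_underflow=False` all three bins are kept.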
The latest push adds the functionality to give each dataset a colour directly. In the config file, add a section called:
dataset_colours:
# <dataset_name>: <colour_spec>
ttbar: red # use a named colour
wz: "#3355c5" # Use a Hex string
dy: [0.3, 0.4, 0.8] # Use a 3-tuple for RGB with numbers between 0 and 1
The colour specifications are handled directly by matplotlib, so any of the formats described in https://matplotlib.org/3.1.0/tutorials/colors/colors.html can be used.
Any datasets that are not given a colour will get their colour from the default colour map mechanism that has been used so far; this new method just allows us to override those colours.
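The override behaviour can be sketched in a few lines. This is an assumption about the logic, not the code in this PR: `assign_colours` and `default_palette` are hypothetical names, and the palette is only a stand-in for the existing colour map mechanism.

```python
from itertools import cycle


def assign_colours(datasets, dataset_colours=None,
                   default_palette=("C0", "C1", "C2", "C3")):
    """Map each dataset name to a colour spec.

    Entries in `dataset_colours` take priority; every other dataset takes the
    next colour from `default_palette` (stand-in for the default colour map).
    """
    overrides = dataset_colours or {}
    fallback = cycle(default_palette)
    return {
        name: overrides[name] if name in overrides else next(fallback)
        for name in datasets
    }


colours = assign_colours(
    ["ttbar", "wz", "qcd"],
    dataset_colours={"ttbar": "red", "wz": "#3355c5"},
)
```

Here `ttbar` and `wz` keep their configured colours, while `qcd`, which has no entry, falls back to the first default colour.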
The under/overflow bins can now be enabled or disabled from the config file. Adding:
no_over_underflow: False
to the config will draw the bins; setting it to True will hide them (which is the default).
This option doesn't seem stable. If I run over the dataframe tbl_dataset.ht--ht.csv.zip (zipped since csv files aren't supported in comments) with no_over_underflow: False, I get the following error message:
fast_plotter.plotting - INFO - Making 1D Projection: ht
fast_plotter.plotting - ERROR - Couldn't plot 1D projection: ht
Traceback (most recent call last):
File "/home/hep/ebhal/chip_software/src/fast-plotter/fast_plotter/plotting.py", line 56, in plot_all
figsize=figsize, **kwargs
File "/home/hep/ebhal/chip_software/src/fast-plotter/fast_plotter/plotting.py", line 335, in plot_1d_many
colourmap=colourmap, dataset_order=dataset_order)
File "/home/hep/ebhal/chip_software/src/fast-plotter/fast_plotter/plotting.py", line 182, in actually_plot
vals.apply(filler, axis=0, step="mid")
File "/home/hep/ebhal/miniconda3/envs/chip_env/lib/python3.7/site-packages/pandas/core/frame.py", line 6928, in apply
return op.get_result()
File "/home/hep/ebhal/miniconda3/envs/chip_env/lib/python3.7/site-packages/pandas/core/apply.py", line 186, in get_result
return self.apply_standard()
File "/home/hep/ebhal/miniconda3/envs/chip_env/lib/python3.7/site-packages/pandas/core/apply.py", line 292, in apply_standard
self.apply_series_generator()
File "/home/hep/ebhal/miniconda3/envs/chip_env/lib/python3.7/site-packages/pandas/core/apply.py", line 321, in apply_series_generator
results[i] = self.f(v)
File "/home/hep/ebhal/miniconda3/envs/chip_env/lib/python3.7/site-packages/pandas/core/apply.py", line 112, in f
return func(x, *args, **kwds)
File "/home/hep/ebhal/chip_software/src/fast-plotter/fast_plotter/plotting.py", line 148, in __call__
color=color, linewidth=width, where="mid", label=label, linestyle=style)
File "/home/hep/ebhal/chip_software/src/fast-plotter/fast_plotter/plotting.py", line 435, in draw
fill_val=fill_val, expected_xs=expected_xs)
File "/home/hep/ebhal/chip_software/src/fast-plotter/fast_plotter/plotting.py", line 218, in standardize_values
x, y_values = add_missing_vals(x, expected_xs, y_values=y_values, fill_val=fill_val)
File "/home/hep/ebhal/chip_software/src/fast-plotter/fast_plotter/plotting.py", line 256, in add_missing_vals
new[insert] = y
ValueError: ('NumPy boolean array indexing assignment cannot assign 20 input values to the 18 output values where the mask is true', 'occurred at index VH')
fast_plotter.plotting - ERROR - None
fast_plotter.plotting - ERROR - ('NumPy boolean array indexing assignment cannot assign 20 input values to the 18 output values where the mask is true', 'occurred at index VH')
The dataset "VH" doesn't exist in the dataframe but is in the dataset_order list in my plotting config (since it is general, and not all datasets will have entries in every dataframe). This hasn't been a problem before, and when I remove the no_over_underflow: False line from my config, it works fine.
Thanks for letting me know. I'll take a look at the DFs tonight. Will also add the option to control the error calculation from the config.
This should now be fixed. The issue was really subtle: when we were replacing the infs of the under/overflow bins, it was also modifying the list of values we expected to see (which is used to add in missing bins). The traceback you saw was an indirect consequence of this.
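This kind of side effect can be reproduced in miniature. The sketch below is a simplified illustration, not the actual fast_plotter code: both helper functions are hypothetical, and plain lists stand in for the bin values.

```python
import math


def replace_infs_in_place(xs, low, high):
    """Buggy variant: overwrites the list that the caller passed in."""
    for i, x in enumerate(xs):
        if x == -math.inf:
            xs[i] = low
        elif x == math.inf:
            xs[i] = high
    return xs


def replace_infs_copy(xs, low, high):
    """Safe variant: builds a new list and leaves the input untouched."""
    return [low if x == -math.inf else high if x == math.inf else x for x in xs]


# Buggy path: the list of expected values is silently changed as well.
expected_buggy = [-math.inf, 0.0, 100.0, math.inf]
replace_infs_in_place(expected_buggy, -50.0, 150.0)

# Safe path: the expected values still contain the infinite edges.
expected_safe = [-math.inf, 0.0, 100.0, math.inf]
drawn = replace_infs_copy(expected_safe, -50.0, 150.0)
```

Once the expected values no longer match the bins that are really present, any later step that fills in missing bins against them can end up with mismatched lengths, which is consistent with the shape mismatch in the traceback.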
Last few changes:
Changed the no_over_underflow option to be a show_over_underflow option instead.
Add err_from_sumw2: True to the config to enable using the sum of squared weights as the variance; otherwise the variance will be given by (sum w)^2 / n.
I think there's a bug when plotting over/underflow bins. Background processes plot fine but data does not. For example, the plot contains the overflow bin [350.0, inf), which is drawn for background but not for data. Checking the dataframe that was used to plot it, tbl_dataset.leadLepton_pt--lead_lepton_pt.csv.zip (zipped because I can't include .csv files inline), shows that there are overflow bins for data (SingleElectron*) which contain events.
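For reference, the two variance choices behind the err_from_sumw2 option listed under "Last few changes" can be written out explicitly. This is a minimal sketch under stated assumptions: the function `variance` is hypothetical, and n is taken to be the number of entries in the bin, which the thread does not spell out.

```python
def variance(weights, err_from_sumw2=False):
    """Variance of a bin's yield from its per-event weights.

    err_from_sumw2=True  -> sum of squared weights
    err_from_sumw2=False -> (sum of weights)^2 / n, with n = number of entries
                            (assumed meaning of n)
    """
    n = len(weights)
    if n == 0:
        return 0.0
    if err_from_sumw2:
        return sum(w * w for w in weights)
    return sum(weights) ** 2 / n


weights = [0.5, 0.5, 2.0]
var_sumw2 = variance(weights, err_from_sumw2=True)   # 0.25 + 0.25 + 4.0
var_default = variance(weights)                      # 3.0 ** 2 / 3
```

When all weights are equal the two expressions agree; they differ only when the weights vary from event to event, as in the example above.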
Thanks for spotting that issue; it should now be solved in both the absolute yield plot and the ratio.
Several improvements:
Still to come: