fact-project / read_mars

Python library to read MARS output (e.g. ganymed or star files)
MIT License

add StatusDisplay class, that can open a root, which contains a single #6

Closed dneise closed 7 years ago

dneise commented 7 years ago

add StatusDisplay class:

This can open a ROOT file that contains a single MStatusDisplay and gives access to the contents of that StatusDisplay.

dneise commented 7 years ago

But still, this ipynb might be interesting: https://github.com/fact-project/read_mars/blob/4d79a3667ba51e34a7ba2f978ee35191cf9e555d/read_mars/tests/resources/Untitled.ipynb

dneise commented 7 years ago

I am not quite happy with the keys to the contents member.

Have a look at this list of keys. The keys are tuples of strings of this form:

('class_name', 'object_name', 'maybe_pad_name', 'canvas_name')

In my opinion the most important designators by which I think about the objects are:

So I'd like to say something like: I want the TH1F with the name "Gain". In case there are multiple things named Gain, I'd like to specify the kind of thing, in this case "TH1F". However, if there is only one thing called "Gain", why should I bother to remember its type?

Anyway, let's have a look at the complete list:

[('TH1F', 'Sum1', 'SumHist'),
 ('TF1', 'spektrum', 'SumHist'),
 ('TH1F', 'SumC1', 'SumHist'),
 ('TH1D', 'Pix0', 'Pix0'),
 ('TF1', 'spektrum', 'Pix0'),
 ('TH1D', 'Pix5', 'Pix5'),
 ('TF1', 'spektrum', 'Pix5'),
 ('MHCamera', 'Rate', 'Cams1_1', 'Cams1'),
 ('MHCamera', 'Gain', 'Cams1_2', 'Cams1'),
 ('MHCamera', 'Baseline', 'Cams1_3', 'Cams1'),
 ('MHCamera', 'RelSigma', 'Cams1_4', 'Cams1'),
 ('MHCamera', 'Crosstalk', 'Cams1_5', 'Cams1'),
 ('MHCamera', 'Noise', 'Cams1_6', 'Cams1'),
 ('MHCamera', 'FitProb', 'Cams2_1', 'Cams2'),
 ('MHCamera', 'Chi2', 'Cams2_2', 'Cams2'),
 ('MHCamera', 'CoeffR', 'Cams2_4', 'Cams2'),
 ('MHCamera', 'Pxtalk', 'Cams2_5', 'Cams2'),
 ('TH1F', 'Rate1', 'Hists1_1', 'Hists1'),
 ('TH1F', 'Rate2', 'Hists1_1', 'Hists1'),
 ('TH1F', 'Gain1', 'Hists1_2', 'Hists1'),
 ('TH1F', 'Gain2', 'Hists1_2', 'Hists1'),
 ('TH1F', 'Baseline1', 'Hists1_3', 'Hists1'),
 ('TH1F', 'Baseline2', 'Hists1_3', 'Hists1'),
 ('TH1F', 'RelSigma1', 'Hists1_4', 'Hists1'),
 ('TH1F', 'RelSigma2', 'Hists1_4', 'Hists1'),
 ('TH1F', 'Crosstalk1', 'Hists1_5', 'Hists1'),
 ('TH1F', 'Crosstalk2', 'Hists1_5', 'Hists1'),
 ('TH1F', 'Noise1', 'Hists1_6', 'Hists1'),
 ('TH1F', 'Noise2', 'Hists1_6', 'Hists1'),
 ('TH1F', 'FitProb1', 'Hists2_1', 'Hists2'),
 ('TH1F', 'FitProb2', 'Hists2_1', 'Hists2'),
 ('TH1F', 'ChiSq1', 'Hists2_2', 'Hists2'),
 ('TH1F', 'ChiSq2', 'Hists2_2', 'Hists2'),
 ('TH1F', 'CoeffR1', 'Hists2_4', 'Hists2'),
 ('TH1F', 'CoeffR2', 'Hists2_4', 'Hists2'),
 ('TH1F', 'Pxtalk1', 'Hists2_5', 'Hists2'),
 ('TH1F', 'Pxtalk2', 'Hists2_5', 'Hists2'),
 ('MHCamera', 'NormGain', 'NormGain_1', 'NormGain'),
 ('TH1F', 'NormGain1', 'NormGain_2', 'NormGain'),
 ('TH1F', 'NormGain2', 'NormGain_2', 'NormGain'),
 ('TH1F', 'SumC1', 'CleanHist1'),
 ('TF1', 'spektrum', 'CleanHist1'),
 ('TH1F', 'SumC2', 'CleanHist2'),
 ('TF1', 'spektrum', 'CleanHist2'),
 ('TH1F', 'SumScale1', 'GainHist1'),
 ('TF1', 'spektrum', 'GainHist1'),
 ('TH1F', 'SumScale2', 'GainHist2'),
 ('TF1', 'spektrum', 'GainHist2'),
 ('TH1F', 'Time', 'ArrTime'),
 ('TH1F', 'Puls', 'Pulse')]

dneise commented 7 years ago

And for another kind of file:

[('TH2F', 'Baseline', 'MHBaseline'),
 ('MHCamEvent', 'MHCamEvent', 'Baseline'),
 ('MHCamera', '', 'Baseline_1', 'Baseline'),
 ('TH1D', 'proj', 'Baseline_2', 'Baseline'),
 ('MHCamera', '', 'Baseline_3', 'Baseline'),
 ('MHCamera', 'err', 'Baseline_4', 'Baseline'),
 ('TProfile', 'rad', 'Baseline_5', 'Baseline'),
 ('TProfile', 'az', 'Baseline_6', 'Baseline'),
 ('TH2F', 'Signal', 'MHSingles_1', 'MHSingles'),
 ('TH2F', 'Time', 'MHSingles_2', 'MHSingles'),
 ('TProfile2D', 'Pulse', 'MHSingles_3', 'MHSingles'),
 ('TH1D', 'Time_py', 'Time'),
 ('TH1D', 'Pulse_py', 'Pulse')]

dneise commented 7 years ago

Now, don't get me wrong. I find it great that I can look at this list and get a complete overview of the contents of this file, without needing an X connection to wherever the file is.

But it's cumbersome to type these tuples. Maybe some kind of select statement would help, like:

file.select(type='TH1', name='signal')

which returns a list of the contents that match my selection. If I'm lucky, all I need to say is f.select(name='Gain'), but ...

Is that a useful interface?
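
To make the idea concrete, here is a minimal sketch of such a select function. It is not the read_mars implementation: the Key namedtuple, its field names, the canvas parameter and the prefix matching on the type are all assumptions made for illustration. The parameter name type follows the call above, although it shadows the Python builtin inside the function.

```python
from collections import namedtuple

# Hypothetical key layout; the field names are assumptions for this sketch.
Key = namedtuple("Key", ["class_name", "name", "pad_name", "canvas_name"])


def select(keys, type=None, name=None, canvas=None):
    """Return all keys that match every criterion that is not None.

    `type` is matched as a prefix, so type='TH1' finds TH1F and TH1D alike.
    """
    result = []
    for key in keys:
        if type is not None and not key.class_name.startswith(type):
            continue
        if name is not None and key.name != name:
            continue
        if canvas is not None and key.canvas_name != canvas:
            continue
        result.append(key)
    return result


# A few keys taken from the list above (pad_name is None without a pad).
keys = [
    Key("MHCamera", "Gain", "Cams1_2", "Cams1"),
    Key("TH1F", "Gain1", "Hists1_2", "Hists1"),
    Key("TH1F", "Gain2", "Hists1_2", "Hists1"),
    Key("TH1F", "Time", None, "ArrTime"),
]

only_gain = select(keys, name="Gain")
th1_in_hists1 = select(keys, type="TH1", canvas="Hists1")
```

Returning a list even for a single match keeps the return type predictable; the caller decides what to do when the selection is ambiguous or empty.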

jebuss commented 7 years ago

My 50 cents: So far we were trying to extract certain plots from a MarsStatusDisplay because we knew they were there and we knew which information we could gain from them. With this, let's say, new approach, we are able to extract whatever there may be in a ROOT file with a StatusDisplay, which I really like. However, by doing so we might lose a bit of context, since we do not visually see which plots belong together and what they could mean. I think knowing which tab a certain plot came from carries some information. Usually you can assume that plots in the same tab share some sort of context. Thus, I think keeping the tab information might sometimes (not always) help. The TPad information, in my opinion, seems pretty useless, because the only thing I learn e.g. from ('MHCamera', 'Baseline', 'Cams1_3', 'Cams1') is that Baseline is the third plot in the Cams1 canvas. But when looking at e.g.

('TH1F', 'Rate1', 'Hists1_1', 'Hists1'),
('TH1F', 'Rate2', 'Hists1_1', 'Hists1'),

I learn that there are two TH1F plots with almost the same name in the same pad of the same tab. Damn it, the pad names are not useless! However, I had to open Mars to figure out that Rate2 is the model fit to Rate1, and I am not sure how to tackle this.

Bottom line: Keeping the canvas name makes sense to me. The pad name can possibly be reduced to the pad number, because the canvas-name part of the pad name is redundant. The select statement approach (with possible keys type, name, canvas, tab_nr) seems to me a good way to go. However, it does not solve the two-plots-with-almost-the-same-name issue. But in this case you would see that they are from the same tab and could simply plot them to get insight.

dneise commented 7 years ago

Now it looks like this:

[StatusDisplayKey(name='Baseline', class_name='TH2F', canvas_name='MHBaseline', pad_number=None),
 StatusDisplayKey(name='MHCamEvent', class_name='MHCamEvent', canvas_name='Baseline', pad_number=None),
 StatusDisplayKey(name='', class_name='MHCamera', canvas_name='Baseline', pad_number=1),
 StatusDisplayKey(name='proj', class_name='TH1D', canvas_name='Baseline', pad_number=2),
 StatusDisplayKey(name='', class_name='MHCamera', canvas_name='Baseline', pad_number=3),
 StatusDisplayKey(name='err', class_name='MHCamera', canvas_name='Baseline', pad_number=4),
 StatusDisplayKey(name='rad', class_name='TProfile', canvas_name='Baseline', pad_number=5),
 StatusDisplayKey(name='az', class_name='TProfile', canvas_name='Baseline', pad_number=6),
 StatusDisplayKey(name='Signal', class_name='TH2F', canvas_name='MHSingles', pad_number=1),
 StatusDisplayKey(name='Time', class_name='TH2F', canvas_name='MHSingles', pad_number=2),
 StatusDisplayKey(name='Pulse', class_name='TProfile2D', canvas_name='MHSingles', pad_number=3),
 StatusDisplayKey(name='Time_py', class_name='TH1D', canvas_name='Time', pad_number=None),
 StatusDisplayKey(name='Pulse_py', class_name='TH1D', canvas_name='Pulse', pad_number=None),
]
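
For illustration, the step from the old tuple keys to keys of this form can be sketched as follows. This is only an assumption about how such a conversion might look, not the code in read_mars: key_from_tuple is a hypothetical helper, and StatusDisplayKey is rebuilt here as a namedtuple with the field names shown in the list above.

```python
from collections import namedtuple

# Rebuilt for this sketch with the field names shown in the key list.
StatusDisplayKey = namedtuple(
    "StatusDisplayKey", ["name", "class_name", "canvas_name", "pad_number"]
)


def key_from_tuple(raw):
    """Convert an old-style tuple key into a StatusDisplayKey.

    Old keys are (class_name, name, canvas_name) or
    (class_name, name, pad_name, canvas_name), where pad_name
    looks like '<canvas_name>_<number>'.
    """
    if len(raw) == 3:
        class_name, name, canvas_name = raw
        pad_number = None
    elif len(raw) == 4:
        class_name, name, pad_name, canvas_name = raw
        # The canvas part of the pad name is redundant; keep only the number.
        pad_number = int(pad_name.rsplit("_", 1)[1])
    else:
        raise ValueError("unexpected key length: {!r}".format(raw))
    return StatusDisplayKey(name, class_name, canvas_name, pad_number)


old_keys = [
    ("TH2F", "Baseline", "MHBaseline"),
    ("MHCamera", "err", "Baseline_4", "Baseline"),
]
new_keys = [key_from_tuple(k) for k in old_keys]
```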

dneise commented 7 years ago

Okay, so here is another interface I was playing around with.

I would call it the "DataFrame" interface, since I am internally (mis)using a pandas.DataFrame to generate this.

Have a look at the pad_number: it's converted to float, and missing pad_numbers turn up as NaN. Not sure we want that.
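
The float conversion is ordinary pandas behaviour for an integer column with missing entries, and it can be reproduced outside of read_mars. The following sketch uses a small hand-made frame (the rows are taken from the key list, the frame itself is an assumption) and also shows one possible way around it, the nullable "Int64" dtype, which needs pandas 0.24 or newer.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Baseline", "proj", "err"],
        "class_name": ["TH2F", "TH1D", "MHCamera"],
        "canvas_name": ["MHBaseline", "Baseline", "Baseline"],
        "pad_number": [None, 2, 4],
    }
)

# With a missing entry, the integer column is inferred as float64 with NaN.
float_dtype = str(df["pad_number"].dtype)

# The nullable integer dtype keeps the integers and marks the gap as <NA>.
df["pad_number"] = df["pad_number"].astype("Int64")
nullable_dtype = str(df["pad_number"].dtype)

print(float_dtype, nullable_dtype)
print(df)
```

Whether a nullable integer column is preferable to the explicit pad_number=None of the key list is a design choice; the sketch only shows that the NaN is not forced on us.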

dneise commented 7 years ago

I think I am going too fast for myself here. This is somehow faking a DataFrame, but not really, and one does not know what to expect from it, so I guess it's shit.

Let's say this is maybe an outlook. But at the moment I recommend using the get() method, where one needs to explicitly provide all 4 parameters to get a certain object out of the file. In order to learn what the parameters are, one can print the keys() and simply copy and paste the parameters from there into the get() call.

That is still fairly good.
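
As a rough sketch of that workflow: the class below is a hypothetical stand-in backed by a plain dict, not the real StatusDisplay, and the parameter names of get() are assumed to match the fields of the keys printed above. The stored strings stand in for the actual ROOT objects.

```python
from collections import namedtuple

StatusDisplayKey = namedtuple(
    "StatusDisplayKey", ["name", "class_name", "canvas_name", "pad_number"]
)


class FakeStatusDisplay:
    """Hypothetical stand-in for StatusDisplay, backed by a plain dict."""

    def __init__(self, contents):
        self._contents = dict(contents)

    def keys(self):
        return list(self._contents)

    def get(self, name, class_name, canvas_name, pad_number):
        # All four parameters are required, so a lookup is never ambiguous.
        key = StatusDisplayKey(name, class_name, canvas_name, pad_number)
        return self._contents[key]


display = FakeStatusDisplay(
    {
        StatusDisplayKey("Signal", "TH2F", "MHSingles", 1): "signal histogram",
        StatusDisplayKey("Time_py", "TH1D", "Time", None): "time projection",
    }
)

# Step 1: print the keys to see what is in the file.
print(display.keys())

# Step 2: copy the parameters of one key into the get() call.
signal = display.get(
    name="Signal", class_name="TH2F", canvas_name="MHSingles", pad_number=1
)
```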