alcap-org / AlcapDAQ


Calibration discussion #60

Open benkrikler opened 10 years ago

benkrikler commented 10 years ago

Phill and I briefly talked about this during the discussion written up on #44, but I wanted to tie down our ideas as it's obviously important to the analysis and I want to start implementing things.

The point is, the set of calibration constants needed to produce an energy measurement depends not only on the detector, but also on the reconstruction algorithm. Once a generator is mature enough, we will want to run it over some set of calibration data (test pulse, radioactive sources etc) and then use the outputs of this to form a set of constants that will be used to convert a particular property of a TAP (integral, amplitude or otherwise) into an energy.

Given the variety of this task, how do we envisage performing it and how can we store the values in a consistent way? I don't think I have enough experience to know what it really requires.

One thought is that each TAP generator has a corresponding calibrator. We give it the energies we expect for a given calibration data-set and it uses this to produce the calibration constants in some archival format that we can read back in within the generators on subsequent runs. Does something like that sound reasonable?

jrquirk commented 10 years ago

This seems reasonable. We also had the SQL database discussion, could it fit in there?

benkrikler commented 10 years ago

That's a good point, it would make sense to store and retrieve the outputs there. We need the interface for that written up though, and I'm not sure what the progress has been so far.

benkrikler commented 10 years ago

In issue #70, we have agreed to keep an energy field in the base class for a TAnalysedPulse. This implies that the TAP generators know about the calibration of a TAP.

The current flow for calibration then looks something like:

  1. Get a generator working
  2. Run TPIs from calibration data-sets through generator (making TAPs)
  3. Run TAPs through the corresponding calibrator
  4. Study resultant histograms etc and produce calibration data
  5. Calibration constants stored in some run-time accessible format (SQL, text files etc)
  6. Run TPIs from physics dataset through generator which accesses calibration data and now uses it to fill the TAPs with real, physical values.

That seems manageable to me, but I have a feeling that it might be better to split step 6 in two:

  1. Run TPIs from physics dataset through generator (makes TAPs)
  2. Another module applies a calibration dataset to the TAPs (possibly making CalibratedTAPs or just filling in extra fields).

The reason I prefer this is that steps 2 and 6 are now the same; only the input data has changed. That means fewer flags to set internally (are we doing calibration, etc.) as well as keeping each generator ignorant of the calibration database (however it's implemented). By splitting the 'apply calibration' stage out into step 7, in a separate module, we're sticking more closely to the 'one class, one function' rule.

So that's where I'm at. I prefer the second option, but I want a bit of a reality check. Are we going too far with this and should we just be getting things done whichever way at this point? Thoughts anyone?

AndrewEdmonds11 commented 10 years ago

should we just be getting things done whichever way at this point?

Certainly, this is something I'm constantly worrying about (but then I'm getting closer and closer to having to finish...). But, for this, having an extra module is no more work, right? We still have to write the code; it's just a question of whether it lives in the generator or in this module.

Another question: will each generator have its own calibration module, or will it be different parameters to the same one?

thnam commented 10 years ago

I think the calibration was done poorly; we don't have much information. I'm not sure a fully automatic generator is worth it.

benkrikler commented 10 years ago

having an extra module is no more work, right? We still have to write the code; it's just a question of whether it lives in the generator or in this module.

I suppose you're right on that one. We need to do the work anyway, so we might as well do it in a separate module at this point. In step 7 above, though, I think I'll just fill in otherwise-blank fields in the TAPs rather than write a specialised CalibratedTAP class.

Another question, will each generator have it's own calibration module or will it be different parameters to the same one?

I think in general (though this could be my inexperience showing) it's not so simple as to just give different parameters. To start with one generator might try to use a pulse's amplitude to get the energy, another might use the integral. Also, I'm not sure if we can always assume linearity of the response from a generator or from the detector.

In reality, though, we may need to assume linearity if we don't have enough calibration data. And if only two candidate properties will be used to determine the energy for a pulse (i.e. only either amplitude or integral), one single module is probably all we would need. For each TAP source we would then just need the calibration database (or text file) to have three values:

  1. Whether to use amplitude or the integral
  2. Intercept of calibration fit
  3. Gradient of calibration fit

@thnam (@litchfld and @AndrewEdmonds11, I suppose), how much calibration data do we have, then? If it's really not much, a single calibration results file might be good enough, rather than the effort of setting up the SQL interface for this.

thnam commented 10 years ago

@benkrikler We have several runs with the sources for Si and Ge, and some 20-30 runs from the momentum-scanning process. I am examining those runs, trying to understand the detectors' response.

benkrikler commented 10 years ago

Do we not have any test pulse data, then? Would it be meaningful to try and take some now, somewhere?

thnam commented 10 years ago

@benkrikler Oh, yes, there are several of them; I completely forgot. I took about 10 runs, but have to check again what they are for.

litchfld commented 10 years ago

@thnam (@litchfld and @AndrewEdmonds11, I suppose), how much calibration data do we have, then? If it's really not much, a single calibration results file might be good enough, rather than the effort of setting up the SQL interface for this.

Yes, I'd have thought a text file is probably sufficient. SQLite would be neat, but I think you have to rebuild ROOT with all the right options to be able to use it.

litchfld commented 10 years ago

In issue #70, we have agreed to keep an energy field in the base class for a TAnalysedPulse. This implies that the TAP generators know about the calibration of a TAP.

Not necessarily. It's still reasonable for them to store their best estimate of the "energy" in whatever unit makes most sense to them. It requires a bit more intelligence from the TAP consumer, but intelligence is probably going to be needed anyway (not grandmothers, etc, etc).

I don't think there's any absolute need for energy to be filled in calibrated units (BTW, what are the calibrated units? MeV/um? MeV? Multiples of MIPs?) until we get to detector pulses, where you need to reconcile fast and slow channels anyway. So we could defer this to the detector pulse. But OTOH, if you want to calibrate both the fast and slow pulses, you might as well store the result.

benkrikler commented 10 years ago

@litchfld, I completely agree with everything you're saying there. But putting my more goal-oriented hat on, could you make more of a recommendation out of it?

litchfld commented 10 years ago

As for implementation, I think a free-floating calibration class is probably the best thing. On instantiation it connects to the relevant file, and can then answer queries like:

Calibrator Cali;
// or, if we have more than one calibration period:
Calibrator Cali(run);
// later that day
double charge = Cali.CalibratedCharge(source, raw); // signature: (IDs::Source, double)

The free-floating class should be light: It can have a static pointer to a DB-like class which presents the DB-file in a meaningful way, and it may need one data member to keep track of which calibration period it is associated with.

Then it doesn't matter much where the calibration is actually done; it should only be about two lines of code for the user.

benkrikler commented 10 years ago

I think I disagree there, Phill. Writing it as a Module gives it all the framework structure we've built up, including access to the modules file for configuration. It would also define where and when the calibration gets done, as well as implementing how.

Of course, there's the middle-ground implementation where we call a free-standing class from a dedicated module, but then I'm not sure I see that class being called anywhere else in practice.

litchfld commented 10 years ago

Sure, you can put the code that chunks over the data and saves the calibrated values in its own module, in the TDetectorPulse module, or in the TAnalysedPulse module. But wrapping it in that kind of cheap-to-instantiate object just means that you can defer [revisit] that decision until [when] we have some better understanding of what the calibrations will look like.

benkrikler commented 10 years ago

Ok, so I want to get this going again. I'd like to start implementing things, but before I do anything I need a better sense of how we'll do it. So mostly @thnam, since you've been through this process once already:

  1. Do we have a list of calibration runs available somewhere (elog / dropbox)? Does it / could it contain the detectors that were studied in each calibration run and what we should expect to see. For example, if the calibration was done using a radioactive source, we'd need the energy of the key peaks, or for test pulse calibration data we'd want the equivalent charge deposited I'd guess.
  2. Have you done anything more complicated than transforming the x-axis of an amplitude distribution with an offset and an ADC-to-energy scale factor? In other words, are we just going to want to build up a distribution of the uncalibrated energy for each pulse, which we then stretch and shift until it matches the expected energy distribution, or will we need something more elaborate?
  3. Will we need to provide different calibration constants depending on the run number or period? I'm not sure if we need this; we changed things around so much that I think we probably do, but on the other hand, I'm not sure we can do it if we don't have enough calibration data.