Closed lmoureaux closed 6 years ago
The code is now ready for initial review, but please check the TODO items in the description.
Bad scan removal
Scans that pass any of the following cuts are removed:
- No channel is masked
- The fraction of masked channels is above 7% [TODO: add option]
Please add clarification here. It is not clear what you are removing or why. e.g. why would you remove a scan that has no masked channels? This would ideally represent a perfect detector. The fraction of masked channels here is supposed to serve as what? Why is it not being considered?
Detection of masked ranges
The code finds ranges of masked channels in time. A range meets the following criteria:
- Starts with a "bad" scan
- The channel wasn't "bad" in the previous scan
- Has at least 4 "bad" scans [TODO: add option]
- At most 5 not-"bad" scans between two "bad" scans [TODO: add option]
What is meant by a range?
Three definitions of "bad" are currently available:
- mask: the channel under consideration is masked
- maskReason: the channel under consideration has a non-zero maskReason
- noise: the channel under consideration has a low noise (< 0.05 fC)
What is this noise criterion supposed to be identifying? Why introduce a new term that is not present in maskReason? Additionally it is partially in the charge range of a channel with zero input capacitance: [4.14E-02, 1.09E-01] fC which depending on how it is used could cause issues in identifying them. Please provide additional details above.
This is somewhat rough and not exactly what was asked for in #88, but it finds problematic ranges correctly, including some with sparse bad scans; see e.g. channels 88-89 below.
If you are trying to address #88 it would be expected that what is asked for in #88 be addressed.
Tested on GEMINIm30L2, with all S-curves taken before 1st of June. Ranges found match existing features in data.
What do you mean by "Ranges found match existing features in data"? What are "ranges" here? And what is being searched for?
| Channel | Known good | Range begins | Range ends | #scans | Masked% | Initial maskReason | Other subsequent maskReasons |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 88 | 2017.04.07.15.46 | 2017.04.09.15.16 | 2017.05.31.15.09 | 26 | 57 | HotChannel,HighNoise | |
The column headers here are not described, what are you trying to illustrate here?
Initial maskReason summary
The table below shows the distribution of the initial maskReason for ranges found in each VFAT. Note that a single range is counted as many times as it has maskReasons.
This table is what is being requested in #88, e.g. initial maskReason was the first instance that was assigned? You mention you have not done what was requested in #88; it is hard to follow because so little description is included with this new tool.
Additionally in the table for this spot can channels have multiple initial maskReasons? If not why not? For example I see VFAT3 has 7 channels marked as HotChannel, FitFailed, and DeadChannel. Are these the same 7 dead channels in each case?
Add useless 'pass' everywhere … They're useless but match the style of the existing codebase
Be professional in your commit messages; you may find them "useless" but they improve readability.
There should also be a corresponding update to README.md that gives a description of input arguments, outputs, and use case examples. That seems to be what's missing here in this description. If an expert cannot understand this easily, a USER will not be able to.
What is meant by a range?
A set of successive scans for a given VFAT and channel. I'm not happy with the wording either; please provide a better, preferably short, name if you have one.
What is this noise criterion supposed to be identifying? Why introduce a new term that is not present in maskReason?
That's supposed to be what was requested in #88 -- or am I mistaken on the purpose of the noise cuts? If there's not enough detail for me to understand what's asked for, please update the request. It's in particular relevant because of this sentence in #88:
Caveats is that this needs to be modified to not be sensitive to transient effects
So the algo needs to be modified, which is what I'm trying to achieve here. But it's hard not to break it if I don't understand what it's meant to do.
This table is what is being requested in #88, e.g. initial maskReason was the first instance that was assigned?
What actually happens is more complicated than channels dying over time (even though the end result is the same). They can appear as dead for some time, be recovered, and then die for good. Such cases exist. Every detected "good->bad" (1) transition is recorded in the table.
(1) What exactly "good" and "bad" mean here depends on the value passed to the --ranges option.
Additionally in the table for this spot can channels have multiple initial maskReasons? If not why not?
When a transition happens because of several maskReasons, it gets an entry in each. It's not clear from the context info in #88 what information you want to extract from this table.
Channels with multiple maskReasons appear in multiple bins.
For example I see VFAT3 has 7 channels marked as HotChannel, FitFailed, and DeadChannel. Are these the same 7 dead channels in each case?
To answer this question one should look at the per-VFAT lists. In this case there are 7 DeadChannel and 7 HotChannel,FitFailed. For info, all of them were already dead at the time of the first scan and none was recovered afterwards.
you may find them "useless" but they improve readability.
It's a matter of taste. IMHO they add unnecessary visual clutter and in most cases they actually reduce readability (the only exception being deeply nested loops, which are to be avoided anyway).
Expanded the documentation in the first post; it will eventually be included in the repo
Description
Add a new macro, plotTimeSeries.py, as requested in #88. The macro works in three steps:
Bad scan removal
Scans that pass any of the following cuts are removed:
- The average noise of the scan is below a minimum (default: 0.1; option: --minScanAvgNoise). This cuts scans with no or very few channels responding. (Figure: distribution of noise for all scans of GEMINIm30L2.) 0.1 was chosen as the default instead of, say, 0.22 because other detectors may have lower noise, and maybe we'll manage to lower the noise. It's not a problem if some "bad" scans remain as long as most of them are removed.
- The fraction of masked channels is above a maximum (option: --maxScanMaskedFrac)

Range detection
The time evolution of each channel is searched for successive scans with consistent behavior. A set of such "bad" scans for a given channel is called a (time) range; the definition of "bad" is user-defined (see below).
Range finding starts with a list of scans, where each scan is marked as "good" or "bad". The definition of "bad" depends on what's being searched for (and "good" is always defined as "not bad"). The start of a range is determined by:
Then the range continues, and its end is determined by the appearance of 5 consecutive "good" scans (option: --numEndScans). To prevent the printing of spurious ranges due to transient effects, ranges with fewer than 4 "bad" scans in total are suppressed (option: --minBadScans). A "range" found by this algorithm can include some "good" scans.

As a side effect, channels with sparse "bad" behavior, such as 88 in GEMINIm30L2, are also extracted. This can be controlled by tightening the cuts in the algorithm above.
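
For reviewers who prefer code to prose, here is a minimal sketch of the range finding described above. It is not the code in plotTimeSeries.py. The names `find_ranges` and `bad_flags` are hypothetical, and two points are assumptions: a range opens at the first "bad" scan seen while no range is open, and the reported end is the last "bad" scan of the range. The keyword arguments mirror the --numEndScans and --minBadScans options.

```python
def find_ranges(bad_flags, num_end_scans=5, min_bad_scans=4):
    """Find ranges of "bad" scans for one channel.

    bad_flags: sequence of booleans, one per scan in time order
               (True means the scan is "bad" for this channel).
    Returns a list of (first_bad_index, last_bad_index) pairs.
    """
    ranges = []
    start = None      # index of the first "bad" scan of the open range
    last_bad = None   # index of the most recent "bad" scan in the open range
    n_bad = 0         # number of "bad" scans in the open range
    good_run = 0      # number of consecutive "good" scans seen so far

    for i, bad in enumerate(bad_flags):
        if bad:
            if start is None:
                # Assumption: a range opens at the first "bad" scan
                start, n_bad = i, 0
            last_bad = i
            n_bad += 1
            good_run = 0
        elif start is not None:
            good_run += 1
            if good_run >= num_end_scans:
                # Enough consecutive "good" scans: close the range,
                # keeping it only if it is not a transient
                if n_bad >= min_bad_scans:
                    ranges.append((start, last_bad))
                start = None

    # A range still open at the latest scan is kept if it is long enough
    if start is not None and n_bad >= min_bad_scans:
        ranges.append((start, last_bad))
    return ranges
```

Note that a returned range can contain "good" scans, as long as fewer than `num_end_scans` of them occur in a row.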
Three definitions of "bad" are currently available:
- mask: the channel under consideration is masked
- maskReason: the channel under consideration has a non-zero maskReason
- zeroInputCap: the channel under consideration has an S-curve width that is consistent with zero input capacitance (4.14E-02 < scurveWidth < 1.09E-01 fC). The precise values can be controlled using the --minNoise and --maxNoise options.

This is somewhat rough and not exactly what was asked for in #88, but it finds problematic ranges correctly, including some with sparse bad scans; see e.g. channels 88-89 below.
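
The three definitions of "bad" can be pictured as a single predicate, sketched below. This is an illustration only, not the macro's code: the function name `is_bad` and the dictionary keys `mask`, `maskReason` and `noise` are hypothetical, and the default bounds are assumed to be the values quoted above for --minNoise and --maxNoise.

```python
def is_bad(channel, definition, min_noise=4.14e-2, max_noise=1.09e-1):
    """Decide whether one channel is "bad" in one scan.

    channel: dict with hypothetical keys
             'mask' (0 or 1), 'maskReason' (integer), 'noise' (fC).
    definition: one of 'mask', 'maskReason', 'zeroInputCap'.
    """
    if definition == "mask":
        return bool(channel["mask"])
    if definition == "maskReason":
        return channel["maskReason"] != 0
    if definition == "zeroInputCap":
        # S-curve width consistent with zero input capacitance
        return min_noise < channel["noise"] < max_noise
    raise ValueError("unknown definition of 'bad': %r" % (definition,))
```

"Good" is then simply `not is_bad(...)`, matching the statement that "good" is always defined as "not bad".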
Analysis
For every "range" found in each of the VFATs, the following properties are computed and printed:
- The channel, as ROBstr or vfatCH
- #scans
- The percentage of scans where the channel is "masked", not to be confused with "bad" (useful to investigate channels that behave badly once in a while)
- The initial maskReason: the maskReason for the first scan in the range
- Other maskReasons: any maskReason not present for the first scan but found in a later scan in the same range

A summary table of initial maskReason vs VFAT is also printed at the end.

Types of changes
Motivation and Context
See issue #88
How Has This Been Tested?
Initially tested on GEMINIm30L2, with all S-curves taken before 1st of June. Ranges found matched existing features in data:
- mask definition: they correspond to consecutive scans with the corresponding channel masked
- maskReason definition: they correspond to consecutive scans where the corresponding channel has non-zero maskReason
- noise definition: they correspond to consecutive scans with the corresponding channel blank in the plot (i.e. with noise below 0.05 fC)

Now testing on GEMINIm29L1 data provided by @bdorney and compared to slides 22-31 here. The following configuration was used:
- maskReason ranges
- --onlyCurrent to select ranges that span until the latest scan
- DeadChannel in either initial or later maskReasons

Full output. Conclusions:
- DeadChannel straight away
- HotChannel have DeadChannel in the additional maskReasons. This is also observed in GEMINIm30L2. Reporting because it's not observed for other initial maskReasons.

Excerpt from the output:
VFAT 23
(Table: ranges found in VFAT 23; columns include ROBstr, initial maskReason and other maskReasons.)

Initial maskReason summary
The table below shows the distribution of the initial maskReason for ranges found in each VFAT. Note that a single range is counted as many times as it has maskReasons.
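
To make the counting rule concrete, here is a small sketch of how such a summary could be built. It is not the macro's code: the function name and the input layout (pairs of VFAT number and a comma-separated initial maskReason string, as in "HotChannel,HighNoise") are assumptions made for illustration.

```python
from collections import Counter


def summarize_initial_mask_reasons(ranges):
    """Count ranges per (VFAT, initial maskReason).

    ranges: iterable of (vfat, initial_reasons) pairs, where
            initial_reasons is a comma-separated string.
    A range with several initial maskReasons is counted once per reason.
    """
    counts = Counter()
    for vfat, initial_reasons in ranges:
        for reason in initial_reasons.split(","):
            if reason:  # skip empty strings (no maskReason)
                counts[(vfat, reason)] += 1
    return counts


# The VFAT3 case discussed in this thread: 7 ranges starting as DeadChannel
# and 7 ranges starting as HotChannel,FitFailed
example = [(3, "DeadChannel")] * 7 + [(3, "HotChannel,FitFailed")] * 7
summary = summarize_initial_mask_reasons(example)
```

With this input each of the three bins for VFAT3 holds 7 entries, even though only 14 ranges exist, which is why the bins alone cannot tell whether the same channels are involved.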
Checklist: