cms-gem-daq-project / gem-plotting-tools

Repository for GEM commissioning plotting tools
GNU General Public License v3.0

Feature: time history analyzer #112

Closed lmoureaux closed 6 years ago

lmoureaux commented 6 years ago

Description

Add a new macro, plotTimeSeries.py, as requested in #88.

The macro works in three steps:

Bad scan removal

Scans that pass any of the following cuts are removed:

Range detection

The time evolution of each channel is searched for successive scans with consistent behavior. A set of such successive "bad" scans for a given channel is called a (time) range; the definition of "bad" is user-defined (see below).

Range finding starts with a list of scans, where each scan is marked as "good" or "bad". The definition of "bad" depends on what's being searched for (and "good" is always defined as "not bad"). The start of a range is determined by:

The range then continues until 5 consecutive good scans appear, which marks its end (option: --numEndScans). To prevent the printing of spurious ranges due to transient effects, ranges with fewer than 4 "bad" scans in total are suppressed (option: --minBadScans). A "range" found by this algorithm can include some "good" scans.

As a side effect, channels with sparse "bad" behavior, such as channel 88 in GEMINIm30L2, are also extracted. This can be controlled by tightening the cuts in the algorithm above.
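As a rough illustration of the range finding described above, here is a minimal Python sketch. It is not the code in plotTimeSeries.py: the function name `find_ranges` and its arguments are hypothetical, and two details are assumptions made for the example, namely that a range opens at the first "bad" scan outside any open range and that its end is reported as the last "bad" scan before the closing run of good scans. The defaults mirror the --numEndScans and --minBadScans values quoted above.

```python
def find_ranges(is_bad, num_end_scans=5, min_bad_scans=4):
    """Find ranges in a time-ordered list of good/bad flags for one channel.

    Returns a list of (start, end) scan indices. `end` is None when the
    range is still open at the latest scan.
    """
    ranges = []
    start = None      # index of the first bad scan of the open range
    last_bad = None   # index of the most recent bad scan in the open range
    bad_count = 0     # number of bad scans in the open range
    good_run = 0      # consecutive good scans since the last bad one

    for i, bad in enumerate(is_bad):
        if bad:
            if start is None:
                # Assumption: any bad scan outside a range opens a new one
                start, bad_count = i, 0
            last_bad = i
            bad_count += 1
            good_run = 0
        elif start is not None:
            good_run += 1
            if good_run >= num_end_scans:
                # Enough consecutive good scans: the range is over
                if bad_count >= min_bad_scans:
                    ranges.append((start, last_bad))
                start = None

    # A range that was never closed runs until the latest scan
    if start is not None and bad_count >= min_bad_scans:
        ranges.append((start, None))
    return ranges
```

With the default cuts, a short gap of good scans stays inside the range, while an isolated bad scan at the end is suppressed because it has fewer than 4 bad scans.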

Three definitions of "bad" are currently available:

This is somewhat rough and not exactly what was asked for in #88, but it finds problematic ranges correctly, including some with sparse bad scans; see e.g. channels 88-89 below.
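For illustration, the three definitions quoted in the review comments further down (mask, maskReason, and noise below 0.05 fC) could be expressed as a single predicate. This is only a sketch: the function name `scan_is_bad` and the dictionary keys are hypothetical, and the real macro may store these quantities differently. The `definition` argument plays the role of the value passed to the --ranges option.

```python
NOISE_CUT_FC = 0.05  # fC, the low-noise threshold quoted in this thread


def scan_is_bad(scan, definition):
    """Decide whether one channel is "bad" in one scan.

    `scan` is a dict with the hypothetical keys 'mask' (bool),
    'maskReason' (int, 0 meaning no reason) and 'noise' (float, in fC).
    """
    if definition == "mask":
        return bool(scan["mask"])
    if definition == "maskReason":
        return scan["maskReason"] != 0
    if definition == "noise":
        return scan["noise"] < NOISE_CUT_FC
    raise ValueError("unknown definition of bad: %r" % (definition,))
```

Keeping "good" as the plain negation of this predicate matches the convention stated above.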

Analysis

For every "range" found in each of the VFATs, the following properties are computed and printed:

Column header Meaning
ROBstr or vfatCH Strip number and VFAT channel, respectively
Last known good Date and time of the last good scan before the range ("never" if the range starts at the first scan)
Range begins Start date and time
Range ends End date and time ("never" if the range includes the latest scan)
#scans Total number of scans (good and bad)
masked% Percentage of #scans where the channel is "masked", not to be confused with "bad" (useful to investigate channels that behave badly once in a while)
Initial maskReason maskReason for the first scan in the range
Other subsequent maskReasons maskReason not present for the first scan but found in a later scan in the same range
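The first columns of this table can be derived from a range's boundaries alone. The sketch below shows one way to do it; the function name `summarize_range`, its arguments, and the dictionary keys are hypothetical, and it assumes the range is given as (start, end) scan indices with end set to None for a range that is still open. It also assumes the scan just before the range is good, which holds if ranges start on a good-to-bad transition.

```python
def summarize_range(times, masked, start, end):
    """Compute the per-range quantities for one channel.

    times:  time-ordered list of scan timestamps (e.g. "2017.10.13.12.53")
    masked: list of booleans, True where the channel is masked
    start:  index of the first scan of the range
    end:    index of the last scan of the range, or None if still open
    """
    last = len(times) - 1 if end is None else end
    n_scans = last - start + 1  # good and bad scans alike
    n_masked = sum(1 for m in masked[start:last + 1] if m)
    return {
        # Assumption: the scan right before the range is the last good one
        "last_known_good": times[start - 1] if start > 0 else "never",
        "begins": times[start],
        "ends": "never" if end is None else times[end],
        "n_scans": n_scans,
        "masked_pct": round(100.0 * n_masked / n_scans),
    }
```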

A summary table of initial maskReason vs VFAT is also printed at the end.

Types of changes

Motivation and Context

See issue #88

How Has This Been Tested?

Initially tested on GEMINIm30L2, with all S-curves taken before 1st of June. Ranges found matched existing features in data:

Now testing on GEMINIm29L1 data provided by @bdorney and comparing with slides 22-31 here. The following configuration was used:

Full output. Conclusions:

Excerpt from the output:


VFAT 23

ROBstr Last known good Range begins Range ends #scans Masked% Initial maskReason Other subsequent maskReasons
18 2017.10.11.11.24 2017.10.13.12.53 never 127 100 HotChannel,FitFailed
19 2017.10.11.11.24 2017.10.13.12.53 never 127 0 DeadChannel
20 2017.10.11.11.24 2017.10.13.12.53 never 127 0 DeadChannel
31 2017.10.11.11.24 2017.10.13.12.53 never 127 0 DeadChannel
96 2017.06.15.15.10 2017.06.16.14.35 never 156 58 HotChannel,HighNoise DeadChannel

Initial maskReason summary

The table below shows the distribution of the initial maskReason for ranges found in each VFAT. Note that a single range is counted as many times as it has maskReasons.

VFAT HotChannel FitFailed DeadChannel HighNoise HighEffPed
0 0 0 2 0 0
1 0 0 1 0 0
2 3 3 0 0 0
3 2 1 0 1 0
4 0 0 0 0 0
5 3 5 13 0 0
6 2 34 0 3 0
7 25 4 26 14 1
8 1 1 9 0 0
9 2 2 1 0 0
10 5 5 2 0 0
11 4 4 0 0 0
12 0 0 0 0 0
13 2 2 0 0 0
14 0 0 0 0 0
15 9 4 15 0 0
16 2 3 2 0 0
17 4 2 0 0 0
18 2 2 0 0 0
19 6 4 2 0 0
20 0 0 0 0 0
21 3 3 0 0 0
22 6 3 0 0 0
23 2 1 3 1 0
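The counting rule behind this table (a range contributes once to every maskReason it starts with) can be sketched as follows. The function name `summarize_initial_reasons` and the input layout are hypothetical, and the sketch assumes the initial maskReason is available as the comma-separated string shown in the per-VFAT lists. The example input is the five VFAT 23 ranges from the excerpt above.

```python
from collections import Counter


def summarize_initial_reasons(ranges):
    """Count initial maskReasons per VFAT.

    ranges: iterable of (vfat, reasons) pairs, where `reasons` is a
    comma-separated string such as "HotChannel,FitFailed".
    """
    table = {}
    for vfat, reasons in ranges:
        counts = table.setdefault(vfat, Counter())
        for reason in reasons.split(","):
            if reason:  # skip empty strings from ranges without a reason
                counts[reason] += 1
    return table


# Initial maskReason of the VFAT 23 ranges listed in the excerpt
vfat23 = [
    (23, "HotChannel,FitFailed"),  # ROBstr 18
    (23, "DeadChannel"),           # ROBstr 19
    (23, "DeadChannel"),           # ROBstr 20
    (23, "DeadChannel"),           # ROBstr 31
    (23, "HotChannel,HighNoise"),  # ROBstr 96
]
summary = summarize_initial_reasons(vfat23)
```

Here `summary[23]` holds 2 HotChannel, 1 FitFailed, 3 DeadChannel and 1 HighNoise, which agrees with row 23 of the table even though only five ranges were found.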

Checklist:

lmoureaux commented 6 years ago

The code is now ready for initial review, but please check the TODO items in the description.

bdorney commented 6 years ago

Bad scan removal

Scans that pass any of the following cuts are removed:

  • No channel is masked
  • The fraction of masked channels is above 7% [TODO: add option]

Please add clarification here. It is not clear what you are removing or why. For example, why would you remove a scan that has no masked channels? This would ideally represent a perfect detector. What is the fraction of masked channels supposed to serve as here? Why is it not being considered?

Detection of masked ranges

The code finds ranges of masked channels in time. A range meets the following criteria:

  • Starts with a "bad" scan
  • The channel wasn't "bad" in the previous scan
  • Has at least 4 "bad" scans [TODO: add option]
  • At most 5 not-"bad" scans between two "bad" scans [TODO: add option]

What is meant by a range?

Three definitions of "bad" are currently available:

  • mask: the channel under consideration is masked
  • maskReason: the channel under consideration has a non-zero maskReason
  • noise: the channel under consideration has a low noise (< 0.05 fC)

What is this noise criterion supposed to be identifying? Why introduce a new term that is not present in maskReason? Additionally, the cut lies partially within the charge range of a channel with zero input capacitance, [4.14E-02, 1.09E-01] fC, which, depending on how it is used, could cause issues in identifying such channels. Please provide additional details above.

This is somewhat rough and not exactly what was asked for in #88, but it finds problematic ranges correctly, including some with sparse bad scans; see e.g. channels 88-89 below.

If you are trying to address #88, it would be expected that what is asked for in #88 be addressed.

Tested on GEMINIm30L2, with all S-curves taken before 1st of June. Ranges found match existing features in data.

What do you mean by "Ranges found match existing features in data"? What are ranges here? And what is being searched for?

Channel Known good Range begins Range ends #scans Masked% Initial maskReason Other subsequent maskReasons
88 2017.04.07.15.46 2017.04.09.15.16 2017.05.31.15.09 26 57 HotChannel,HighNoise

The column headers here are not described; what are you trying to illustrate here?

Initial maskReason summary

The table below shows the distribution of the initial maskReason for ranges found in each VFAT. Note that a single range is counted as many times as it has maskReasons.

This table is what is being requested in #88, e.g. initial maskReason was the first instance that was assigned? You mention you have not done what was requested in #88; it is hard to follow because so little description is included with this new tool.

Additionally in the table for this spot can channels have multiple initial maskReasons? If not why not? For example I see VFAT3 has 7 channels marked as HotChannel, FitFailed, and DeadChannel. Are these the same 7 dead channels in each case?

Add useless 'pass' everywhere … They're useless but match the style of the existing codebase

Be professional in your commit messages; you may find them "useless" but they improve readability.

bdorney commented 6 years ago

There should also be a corresponding update to README.md that gives a description of input arguments, outputs, and usage case examples. That seems to be what's missing from this description. If an expert cannot understand this easily, a USER will not be able to either.

lmoureaux commented 6 years ago

Quick answers to select initial comments on the PR description

What is meant by a range?

A set of successive scans for a given VFAT and channel. I'm not happy with the wording either; please provide a better, preferably short, name if you have one.

What is this noise criterion supposed to be identifying? Why introduce a new term that is not present in maskReason?

That's supposed to be what was requested in #88 -- or am I mistaken on the purpose of the noise cuts? If there's not enough detail for me to understand what's asked for, please update the request. It's in particular relevant because of this sentence in #88:

Caveats is that this needs to be modified to not be sensitive to transient effects

So the algo needs to be modified, which is what I'm trying to achieve here. But it's hard not to break it if I don't understand what it's meant to do.

This table is what is being requested in #88, e.g. initial maskReason was the first instance that was assigned?

What actually happens is more complicated than channels dying over time (even though the end result is the same). They can appear as dead for some time, be recovered, and then die for good. Such cases exist. Every detected "good->bad" (1) transition is recorded in the table.

(1) What exactly "good" and "bad" mean here depends on the value passed to the --ranges option.

Additionally in the table for this spot can channels have multiple initial maskReasons? If not why not?

When a transition happens because of several maskReasons, it gets an entry in each. It's not clear from the context info in #88 what information you want to extract from this table.

Channels with multiple maskReasons appear in multiple bins.

For example I see VFAT3 has 7 channels marked as HotChannel, FitFailed, and DeadChannel. Are these the same 7 dead channels in each case?

To answer this question one should look at the per-VFAT lists. In this case there are 7 DeadChannel and 7 HotChannel,FitFailed. For info, all of them were already dead at the time of the first scan and none was recovered afterwards.

you may find them "useless" but they improve readability.

It's a matter of taste. IMHO they add unnecessary visual clutter and in most cases they actually reduce readability (the only exception being deeply nested loops, which are to be avoided anyway).

lmoureaux commented 6 years ago

Expanded the documentation in the first post; it will eventually be included in the repo.

lmoureaux commented 6 years ago

Link to the documentation