pershint / DCMacs

Scripts for taking SNO+ zdabs, processing them, and outputting ROOT files with events filtered using data cleaning

need to have two different bitmasks #1

Open knapik opened 7 years ago

knapik commented 7 years ago

In config/config.py, there should be a processing mask option and an analysis mask option.

We need to be able to specify two different bitmasks for the various macros

"default_apply" should be used for the 2nd pass processing mac

"analysis_flag" or a list of cuts should be used to clean the data.

The new bitmasks will be in DATACLEANING.ratdb after https://github.com/snoplus/rat/pull/1645 gets merged. In the meantime, you can just update your local file.
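As a rough illustration of the request, the two options could look something like the sketch below. The names `PROCESSING_MASK`, `ANALYSIS_MASK`, `processing_lines` and `analysis_lines` are hypothetical and are not taken from the actual config/config.py; only the mask names and the `/rat/proc` and `/rat/procset` commands appear in this thread.

```python
# Hypothetical config/config.py options (names are illustrative assumptions).
PROCESSING_MASK = "default_apply"  # mask for the 2nd pass processing macro
ANALYSIS_MASK = "analysis_mask"    # mask used to split data into clean/dirty


def processing_lines(mask=PROCESSING_MASK):
    """Macro lines that run the datacleaning processor with `mask`."""
    return ["/rat/proc datacleaning", '/rat/procset mask "{0}"'.format(mask)]


def analysis_lines(mask=ANALYSIS_MASK):
    """Macro lines that open a dataCleaningCut conditional on `mask`."""
    return ["/rat/proc/if dataCleaningCut", '/rat/procset flag "{0}"'.format(mask)]


if __name__ == "__main__":
    print("\n".join(processing_lines() + analysis_lines()))
```

Keeping the two masks as separate options means the analysis selection can be changed (for example to "muon_study") without touching the processing pass.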

pershint commented 7 years ago

This makes sense, I'll add in the different masks today.

I have a follow-up question regarding implementing this. In the 2nd pass processing macro, I would expect you would want to specify the bitmask in the lines:

/rat/proc datacleaning
/rat/procset mask "default"

The example macro data_cleaning_cut.mac also has this line. But, it really seems that the cleaning of data with data_cleaning_cut.mac is performed with the lines:

/rat/proc dataCleaningCut
/rat/procset flag "ringoffire" #"Dirty" data will have the ringoffire bit set

Does the "default_apply" mask need to be changed in only the 2nd processing macro? Or should I also change it in data_cleaning_cut.mac where /rat/proc datacleaning shows up a second time?

knapik commented 7 years ago

The example macro data_cleaning_cut.mac assumes that no data cleaning has been applied yet (the example ran on generated data, so there was no way to apply it beforehand). Therefore, for the conditional dataCleaningCut processor to work, the macro had to run /rat/proc datacleaning first. Since you will be inroot-ing a processed file that has already had the datacleaning applied, there is no need to run (and you probably shouldn't run):

/rat/proc datacleaning
/rat/procset mask "default"

OK, the whole processing chain is still not well documented and not 100% figured out, but here is my interpretation of the "best" way to do things. I think 3 macros are needed (you might be able to get away with only 2, but for clarity 3 is the way to go).

Macro 1: Nothing but PMT calibration and tpmuonfollower pass 1. It acts on the raw zdab; first_pass_data_cleaning.mac is what this macro should be.

Macro 2: Here things get a bit more user-specific. We still run on the raw zdab, but now we output a processed ROOT file. The full set of datacleaning should be run here, so the mask should be "default_apply".

For our purposes, we probably do not need to run all of the dataQuality processors, or even the reconstruction, for many of our early studies. The bare minimum in this macro needs to be:

/rat/proc calibratePMT
/rat/proc datacleaning
/rat/procset mask "default_apply"
/rat/procset add "tpmuonfollowercut"
/rat/procset pass 2
/rat/proclast outroot
/rat/procset file "SNOP_00000XXX_YYY_processed.root"

But including all the processors from processing.mac is something we would eventually want to do as well. It might be nice to be able to set an option in config/config.py to either run just the bare minimum in this macro, or to run the full suite (the reconstruction takes forever!).
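One possible shape for that option is sketched below. The function `build_macro2` and its `extra_processors` argument are hypothetical; the macro lines are the bare-minimum set listed above, and placing any extra processors just before the output stage is an assumption rather than something confirmed in this thread.

```python
def build_macro2(out_file, mask="default_apply", extra_processors=()):
    """Return the text of a second-pass macro (sketch only).

    extra_processors: optional macro lines, e.g. copied from processing.mac,
    for when the full suite is wanted instead of the bare minimum.
    """
    lines = [
        "/rat/proc calibratePMT",
        "/rat/proc datacleaning",
        '/rat/procset mask "{0}"'.format(mask),
        '/rat/procset add "tpmuonfollowercut"',
        "/rat/procset pass 2",
    ]
    # Assumption: extra processors run after datacleaning, before output.
    lines.extend(extra_processors)
    lines += ["/rat/proclast outroot", '/rat/procset file "{0}"'.format(out_file)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_macro2("SNOP_00000XXX_YYY_processed.root"))
```

A config flag could then choose between an empty `extra_processors` and the list taken from processing.mac.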

Macro 3: This macro acts on the ROOT file output by Macro 2. Here you run the DataCleaningCut to filter the data into "clean" and "dirty" sets. The "analysis_mask" should be used here (or "muon_study", or "flasher_study", or whatever bitmask is desired). You can, and should, also do the DCA processing in this macro.
```
# EVENT LOOP
#/rat/proc frontend             # only needed for simulation
#/rat/proc trigger              # only needed for simulation
#/rat/proc eventbuilder         # only needed for simulation
#/rat/proc calibratePMT         # already run in Macro 2
#/rat/proc datacleaning         # already run in Macro 2
#/rat/procset mask "default"    # already run in Macro 2

/rat/proc dcaProc
/rat/procset type "timediff"
/rat/procset type "flagged"
/rat/procset file "SNOP_0000013332_000_processed_dcaProc.root"

/rat/proc/if dataCleaningCut
/rat/procset flag "analysis_mask"
/rat/proc outroot
/rat/procset file "SNOP_0000013332_000_processed_analysis_mask_clean.root"
/rat/proc/else
/rat/proc outroot
/rat/procset file "SNOP_0000013332_000_processed_analysis_mask_dirty.root"
/rat/proc/endif
# END EVENT LOOP
```

Here I have explicitly commented out the processors that are not needed in this macro.
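The clean/dirty split in Macro 3 could be generated per input file and per mask with something like the following sketch. The function name `build_macro3` is hypothetical; the output file naming simply mirrors the example file names in the macro above, and only the conditional block is produced.

```python
import os


def build_macro3(in_file, mask="analysis_mask"):
    """Return the dataCleaningCut conditional block for one processed file."""
    # Strip the directory and the .root extension to get the base name.
    base = os.path.splitext(os.path.basename(in_file))[0]
    clean = "{0}_{1}_clean.root".format(base, mask)
    dirty = "{0}_{1}_dirty.root".format(base, mask)
    lines = [
        "/rat/proc/if dataCleaningCut",
        '/rat/procset flag "{0}"'.format(mask),
        "/rat/proc outroot",
        '/rat/procset file "{0}"'.format(clean),
        "/rat/proc/else",
        "/rat/proc outroot",
        '/rat/procset file "{0}"'.format(dirty),
        "/rat/proc/endif",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_macro3("SNOP_0000013332_000_processed.root"))
```

Swapping the mask argument for "muon_study" or "flasher_study" changes both the selection and the output file names in one place.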

Hopefully this is understandable, if not, question me some more.

pershint commented 7 years ago

These changes make sense, thanks for the input. I've added them and should merge the new changes by the end of the day.

One note, though: when running the datacleaning macro (CleanData.mac) on ROOT files that were processed on the grid, I run into errors if I don't have the /rat/proc datacleaning and /rat/procset mask "default_apply" lines in. Specifically, each event gives me:

ConditionalProcBlock::DSEvent: Conditional processor DataCleaningCutProc did not return OKTRUE or OKFALSE. Processor execution will skip this ConditionalProcBlock and continue.

Is this expected for data that has been processed on the grid?

knapik commented 7 years ago

I have seen this problem too and I am trying to track down why it is happening. I am pretty sure it is happening because the applied data cleaning flags are not being set properly in all events. I think I have narrowed it down to a problem with TPMuonFollower, but I am still looking at it.

If you take the tpmuonfollower-short flag out of "analysis_mask", the problem should go away. Can you verify this? It may or may not have something to do with Morgan's addition to the code (i.e. the muon at the start of the run). I will hopefully track things down today ...

pershint commented 7 years ago

I just went and removed the "tpmuonfollower-short" flag from the analysis_mask and the flood of ConditionalProcBlock messages still happens.

As a note, this is the case with files that I have processed with the DCMacs code as well. If I have the /rat/proc datacleaning and /rat/procset mask "default_apply" lines in, the files split into clean and dirty subsets fine.

Have you had any more luck figuring out why CleanData.mac needs the call to /rat/proc datacleaning even if you're running it on a file that has already been processed?

knapik commented 7 years ago

Yes, I have figured out what the underlying bug is (see https://github.com/snoplus/rat/issues/1664 and the references within). I will be making a pull request today to fix it. A quick fix that you can use for testing is to add the following before line 62 in src/analysis/DataCleaningCutProc.cc:

fPass=fPass-1;

The problem is that when you run the macro on an already processed file, the passNumber gets incremented, but the datacleaning bits are stored under the old pass number (i.e. fPass-1). When you run the datacleaning before the DataCleaningCut conditional statement, the datacleaning bits get set for the current pass, so it works.
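A toy model of this bookkeeping, written only to illustrate the explanation above: it is not RAT code, and the class `ToyEvent`, the function `cut_result`, the pass numbers and the bit values are all made up for the example.

```python
class ToyEvent(object):
    """Stand-in for an event that stores data-cleaning bits per pass number."""

    def __init__(self):
        self.flags_by_pass = {}

    def set_flags(self, pass_number, flags):
        self.flags_by_pass[pass_number] = flags

    def get_flags(self, pass_number):
        # Returns None when no bits were ever stored for this pass.
        return self.flags_by_pass.get(pass_number)


def cut_result(event, current_pass, mask):
    """True = clean, False = dirty, None = no answer for this pass."""
    flags = event.get_flags(current_pass)
    if flags is None:
        # Analogous to "did not return OKTRUE or OKFALSE".
        return None
    return (flags & mask) == 0


if __name__ == "__main__":
    event = ToyEvent()
    event.set_flags(1, 0b0100)             # bits stored during processing
    print(cut_result(event, 2, 0b0100))    # incremented pass: no bits found
    print(cut_result(event, 1, 0b0100))    # old pass (fPass-1): bits found
```

In this picture, rerunning datacleaning in the same macro stores bits under the current pass number, which is why the conditional then gets an answer.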

The whole fPass thing is a bit confusing and I am not sure it is actually a good feature to have in the code base. But it is there, so we will have to live with it.

pershint commented 7 years ago

Ah, that makes sense. I may add that line just so I can check the code I have written all works as expected once that pull request goes in.

Thanks for the help Rob! I'll test all of this one last time this afternoon on some of the newer data and close the issue once I've pushed the changes and it all works.

knapik commented 7 years ago

Sounds great. Thank you for making this tool! It will be very useful in the coming weeks and months. Did you see this document: https://docs.google.com/document/d/1l7AZeDYmGmfniMwh6YUD7MU8l7xyerpqB42lG3sbql4/edit#

Comparing the datacleaning from runs 15143 - 15146 to runs 15153 - 15159 might be interesting.

pershint commented 7 years ago

Wanted to give a quick update; I tried adding in the fPass=fPass-1; line, and now I cannot successfully run CleanData.mac whether or not the /rat/proc datacleaning, /rat/procset mask "default_apply" lines are there.

I may just run the code with the lines included in CleanData.mac to get the job done until your permanent fix is put in. I can check again once that pull request is merged.

And I did see those; I will run the cleaning and occupancy proc on them today.

knapik commented 7 years ago

I just made the pull request (https://github.com/snoplus/rat/pull/1665), but it also includes other changes that will not really work well until Ian makes his changes to the blindness processor (https://github.com/snoplus/rat/pull/1602).

I would go ahead and just work with the hacked up code you have and start making some plots.

pershint commented 7 years ago

Wanted to give an update on this after seeing your e-mail about the DC Blindness issue. I pulled RAT in its current state, and I still cannot successfully run the processor that splits data into "clean" and "dirty" subsets without the /rat/proc datacleaning and /rat/procset mask "default_apply" lines in the macro.

I do have a follow-up question. In the processing.mac script (after generating the tpmuonfollower.json file), the macro reads:

/rat/proc calibratePMT
/rat/proc datacleaning
/rat/procset mask "default_apply"
/rat/procset add "tpmuonfollowercut"
/rat/procset pass 2
/rat/proclast outroot
...

Is it possible that we have to redo the /rat/proc datacleaning and /rat/procset mask "default_apply" lines in CleanData.mac because we have moved to "pass 2"?

pershint commented 7 years ago

Actually, I talked with Morgan about this and he said that he has a fix that will address this issue soon. I'll get back to you after Morgan pushes that hotfix.

knapik commented 7 years ago

I think once the changes in https://github.com/snoplus/rat/pull/1681 finally go in this week, we will have a much more transparent way to do things. The macros will then, hopefully, be more or less fixed for all of waterFill and then we will just be modifying the bitmasks and parameters.