GRIFFINCollaboration / detectorSimulations

GEANT4 simulation code for the GRIFFIN array and its suite of ancillary detection systems.

Output Format #90


bkatiemills commented 10 years ago

So - the core topic of today's meeting was what data format(s) our simulations should be outputting. Key points:

Things We Agreed On

Some points were (more or less) unanimous, so we'll take them as final decisions to start working from:

There is still some debate on what information we can and/or should put where:

Debate is still completely open on this topic, but here is one possible scheme we can start perturbing from until we get an optimal solution:

I believe this scheme covers all users without carrying around extra dead weight. @carlu, the step ntuple contains all the information you need for detector development, in a format that should be familiar. @evan012345, I believe the TigFragment tree will ultimately be the most usable for students (and everyone else): this is what comes out of our sort code, so users will have to be able to write analyses that operate on this structure if they ever want to analyze real data.

Also keep in mind that a standard postprocessor that chews up TigFragments and spits out a simpler flat ntuple of meaningful physics parameters is a very viable product we could provide to help users interpret the output of both GRSISpoon and detectorSimulations.
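To make that concrete, here is a minimal sketch of what such a postprocessor could look like, reading a fragment tree and writing a flat ntuple. The tree and branch names (`FragmentTree`, `energy`, `detector`, `timestamp`) are hypothetical placeholders, not the actual GRSISpoon output:

```cpp
// Sketch only: the tree name "FragmentTree" and the branches "energy",
// "detector", and "timestamp" are hypothetical placeholders for whatever
// the sort code actually writes.
#include "TFile.h"
#include "TTree.h"
#include "TNtuple.h"

int main() {
   TFile* in = TFile::Open("simulated_fragments.root", "READ");
   TTree* frags = (TTree*)in->Get("FragmentTree");

   double energy, timestamp;
   int detector;
   frags->SetBranchAddress("energy", &energy);
   frags->SetBranchAddress("detector", &detector);
   frags->SetBranchAddress("timestamp", &timestamp);

   TFile* out = TFile::Open("flat_physics.root", "RECREATE");
   TNtuple nt("physics", "flat physics parameters", "energy:detector:time");

   for (Long64_t i = 0; i < frags->GetEntries(); ++i) {
      frags->GetEntry(i);
      // derived physics quantities (addback, Doppler correction, ...)
      // would be computed here before filling the flat ntuple
      nt.Fill(energy, detector, timestamp);
   }

   out->Write();
   out->Close();
   in->Close();
   return 0;
}
```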

Other auxiliary points that we discussed to include in this first-order approximation are:

This is a high-priority issue that needs to be resolved before we can move much further. I expect the collaboration to be able to reach consensus by the end of April at the very latest, whereupon we will adopt the best plan and move forward with it. Please propose and debate all changes in the comments, so we can keep track of everyone's input.

cc @AdamGarnsworthy @pcbend @damiller @christinaburbadge @moukaddam @evitts @r3dunlop

carlu commented 10 years ago

Hi everyone.

First off, I'm really sorry I missed the meeting. I was gain-matching BGOs and lost track of time. I don't enjoy gain-matching nearly as much as that suggests. Sorry to come in late with this, but I have an issue with one of your agreed points: "triggers should be downstream of the simulation package".

Consider the stable-beam TIP experiment S1232, which is about to begin. We have tons of beam and will be applying a really selective trigger to get the rate down to something the DAQ can handle; probably 2 CsI and 1 HPGe in coincidence will be required to trigger data readout. If we simulate this experiment without modelling the trigger, the output file will be 90+% full of events we would never see in the experiment. This would fill disk space and ultimately waste CPU and disk-access time while we parse the file and strip out all of those events to reveal the simulated events we're interested in. I think the generation of coincidences before the output file is written could be a valuable tool.
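For illustration, a minimal sketch of what such an in-simulation coincidence gate might look like, using the S1232 example condition above. The `Hit` record and detector-type enum are hypothetical stand-ins for whatever the simulation actually stores per event:

```cpp
// Sketch of an in-simulation trigger gate. The Hit record and detector
// types are hypothetical; the multiplicity thresholds follow the S1232
// example (2 CsI + 1 HPGe in coincidence).
#include <vector>

struct Hit { int detectorType; double energy; };  // hypothetical hit record
enum DetectorType { kCsI = 0, kHPGe = 1 };

// Return true if the event satisfies the example trigger:
// at least 2 CsI hits and at least 1 HPGe hit above threshold.
bool PassesTrigger(const std::vector<Hit>& hits,
                   double csiThreshold, double geThreshold) {
   int nCsI = 0, nGe = 0;
   for (const Hit& h : hits) {
      if (h.detectorType == kCsI  && h.energy > csiThreshold) ++nCsI;
      if (h.detectorType == kHPGe && h.energy > geThreshold)  ++nGe;
   }
   return nCsI >= 2 && nGe >= 1;
}
```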

I'm less concerned about adding energy resolution, and I agree that this could be added later without any hit on performance. One point, though: if we are going to want to look at Doppler broadening of gamma lines in TIGRESS runs, the peaks there will have a natural width imposed on them anyway. So even without random broadening there will be some width to the peaks; why not make it the correct width?
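For reference, the kinematic effect in question is just the relativistic Doppler shift; a minimal sketch, assuming `beta` and `theta` come from the simulated reaction kinematics:

```cpp
// Sketch: relativistic Doppler shift of a gamma ray emitted in flight.
// e0 is the rest-frame energy, beta = v/c of the emitting nucleus, and
// theta is the lab emission angle relative to the velocity; all three
// would come from the simulated reaction kinematics.
#include <cmath>

double DopplerShifted(double e0, double beta, double theta) {
   return e0 * std::sqrt(1.0 - beta * beta) / (1.0 - beta * std::cos(theta));
}
// The spread of theta across a detector face (and any spread in beta)
// is what produces the natural Doppler width of the peak.
```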

carlu commented 10 years ago

I agree that the ntuple output described by @BillMills will satisfy every application I can think of. The particular tasks I have in mind for the "true" G4 data are:

bkatiemills commented 10 years ago

@carlu the problem with applying cuts in simulation is that you create systematics that you are by definition blind to. You said the trigger for S1232 was 'probably' going to be xyz, but if those cuts are applied in Geant4 we will never know whether xyz was a good choice of trigger or not. Trigger studies such as this are (should be!) one of the main points of doing simulations for every experiment; other analysis goals can be carried out on reduced data after the trigger is applied, but not examining this major systematic for every experiment seems like a big mistake to me.

Also, while in principle users can just 'be careful' not to throw away data they might want to look at, in practice this just creates a loaded gun that people are going to shoot themselves with over and over again. Filtering data by a per-event trigger is by definition an O(N) process that ROOT is very good at and that is trivial to parallelize (see the sketch below); the high risk of needing to rerun simulations seems like too high a price to pay to avoid it.
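As a concrete example of that downstream filter, applying the trigger as a ROOT selection after the fact can be a one-liner over the tree. The tree name (`FragmentTree`) and counter branches (`nCsI`, `nGe`) are hypothetical:

```cpp
// Sketch of the downstream O(N) filter: apply the trigger as a ROOT
// selection after the simulation instead of inside Geant4. The tree name
// "FragmentTree" and branches "nCsI"/"nGe" are hypothetical placeholders.
#include "TFile.h"
#include "TTree.h"

int main() {
   TFile* in = TFile::Open("simulated_fragments.root", "READ");
   TTree* all = (TTree*)in->Get("FragmentTree");

   TFile* out = TFile::Open("triggered_fragments.root", "RECREATE");
   // keep only events passing the example S1232 condition; changing the
   // trigger means rerunning this filter, not the whole simulation
   TTree* triggered = all->CopyTree("nCsI >= 2 && nGe >= 1");

   triggered->Write();
   out->Close();
   in->Close();
   return 0;
}
```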