star-bnl / star-sw

Core software for STAR experiment
26 stars 63 forks source link

Introduce FXT chain option to enable anything FXT-specific #161

Closed genevb closed 2 years ago

genevb commented 2 years ago

FXT chain option and database flavor will facilitate calibrations of fixed target datasets which were interspersed between other non-FXT dataset in time during BES-II data-taking.

Note: the TPC group has already been using this flavor, triggered by particular test on a beam conditions database. This chain option will provide a more robust switch to ensure the use of desired database tables and any other actions of interest in the code.

plexoos commented 2 years ago

Does BFChain support multivalue options? It could be more natural for the user to specify something like flavor=fxt or flavor=sim (or maybe even flavor=[fxt,sim] when it makes sense)

genevb commented 2 years ago

Hi, Dmitri

If one uses both the "simu" and "fxt" chain options, then both flavors (sim+fxt) of database entries will be enabled.

-Gene

On Oct 4, 2021, at 7:44 PM, Dmitri Smirnov @.**@.>> wrote:

Does BFChain support multivalue options? It could be more natural for the user to specify something like flavor=fxt or flavor=sim (or maybe even flavor=[fxt,sim] when it makes sense)

klendathu2k commented 2 years ago

Hi Gene, I think that Dmitri may be asking whether the chain options support keyword=value type chain options. That would be a nice feature for a future release, but it's not implemented at present. Cheers, Jason

Hi, Dmitri If one uses both the "simu" and "fxt" chain options, then both flavors (sim+fxt) of database entries will be enabled.

plexoos commented 2 years ago

Yes, Jason is correct. My concern is about introducing another 567-th option without a very clear intent (i.e. "enable anything FXT-specific"). The existing collection of options is already too "flat" and impossible for a user to grasp. Using a multi-value or multi-choice options where appropriate is one way to bring structure and be more explicit about the intended usage. I believe the syntax keyword=value (or keyword:value?) is supported to some extent as I think something like this was used for the PicoDst maker

genevb commented 2 years ago

Hi,

We certainly have things like "DbV20210927" which are essentially saying "DbV" = "20210927", and there are at least a couple other things like this, but they are specifically coded as such.

As for the unclear intent, I've tried to provide a chain option that would be an umbrella for FXT data, just as we have a "ppOpt" that covers most pp options. We could name it "FXTOpt" if you want to try to make it more similar to that. But I do think that "FXT" is a name with a pretty clear purpose: processing FXT data. If you want to rename this chain option to be something like "FXTflavor" and specifically only for turning on the "FXT" database flavor, then that implies introducing yet more chain options later if there are other desired switches needed specifically for FXT. My preference is to have a single switch for all things FXT.

BFC chain options are a compromise between being overt about every switch that is turned on and reducing things to a small set of aliases that control many switches. There is no perfect place in such a landscape: neither extreme is appropriate, and no middle-ground compromise will appease everyone.

Personally, I'd prefer that the data tell the code that it is FXT and codes could react accordingly, and thus no chain option would be needed. Unfortunately, I'm not aware of a robust way to do this for existing data. Jeff Landgraf noted to me that the trigger setup name indicates that it is fixed target, but that information is not in the DAQ data from what I can tell, nor in the offline database, and instead would involve a query to a backup of an online database, which I'm not sure is available outside of BNL for remote processing such as embedding. It's also preferable to set database flavors before any database queries during production so that there's no risk of an improperly flavored query sneaking through before flavors are set. So for now I think we're reliant on a chain option for a robust approach.

-Gene

On Oct 5, 2021, at 12:14 PM, Dmitri Smirnov @.**@.>> wrote:

Yes, Jason is correct. My concern is about introducing another 567-th option without a very clear intent (i.e. "enable anything FXT-specific"). The existing collection of options is already too "flat" and impossible for a user to grasp. Using a multi-value or multi-choice options where appropriate is one way to bring structure and be more explicit about the intended usage. I believe the syntax keyword=value (or keyword:value?) is supported to some extent as I think something like this was used for the PicoDst maker

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/star-bnl/star-sw/pull/161#issuecomment-934555082, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AUK2OBOAZ5JAGLYO36RZY7DUFMP5PANCNFSM5FKCQT4Q. Triage notifications on the go with GitHub Mobile for iOShttps://urldefense.com/v3/__https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675__;!!P4SdNyxKAPE!QpWinbWHI3uV5bArHzUZ1DQfnZnbBXWKgEA1wkeVR1_JYzqc0RtDfMH2Vcze$ or Androidhttps://urldefense.com/v3/__https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign*3Dnotification-email*26utm_medium*3Demail*26utm_source*3Dgithub__;JSUlJSU!!P4SdNyxKAPE!QpWinbWHI3uV5bArHzUZ1DQfnZnbBXWKgEA1wkeVR1_JYzqc0RtDfO9RBcpk$.

genevb commented 2 years ago

I do want to state clearly that I recognize that introducing the FXT chain option is NOT the ideal thing to do. Yuri brought up at today's S&C meeting that introducing yet another chain option is introducing another point of possible failure if someone running a chain improperly includes or excludes the chain option (which is along the lines of what @plexoos was arguing that more chain options just expands the mess of chain options), and I think that's a valid concern. I also think that a database-query approach to turn on FXT flavors has its possibilities of failure as well, which is why determining that it is fixed target data from the DAQ stream data is really what we want.

So I'll spend another day or two looking into that possibility. Please don't merge this PR before I report back on it.

klendathu2k commented 2 years ago

Just thinking out loud...

I am not convinced that introducing FXT causes a large mess. Perhaps just a small one. But I may be overlooking something obvious, or have an assumption built in that is wrong.

In fixed target data sets, I assume the tracker must be aware of the z ~ 200 cm target position for track finding. Likewise, in collider data sets, I assume the tracker uses z ~ 0 for track finding. (The Sti kalman track finder dives towards the interaction point... I assume CA has cuts when linking neighboring hits that assume the IP as well.)

I would naturaly think that there would be no cases where FXT should be run without the tracker configured for z ~ 200 cm. And likewise, if FXT is absent the tracker should always be configured for z ~ 0 cm. So FXT should imply fixed target tracking behavior as well as database tables. Assuming these are behaviours are tied together in one FXT_MODE chain option...

If you include FXT_MODE when producing a collider data set, you should see a significant reduction in tracking efficiency. Likewise if you omit FXT_MODE when processing fixed target data, the efficiency should be pretty bad as well. I would like to think that this would be noticed when doing QA on test runs before scaling up to production.

On 2021-10-06 14:44, Gene Van Buren wrote:

I do want to state clearly that I recognize that introducing the FXT chain option is NOT the ideal thing to do. Yuri brought up at today's S&C meeting that introducing yet another chain option is introducing another point of possible failure if someone running a chain improperly includes or excludes the chain option (which is along the lines of what @plexoos [1] was arguing that more chain options just expands the mess of chain options), and I think that's a valid concern. I also think that a database-query approach to turn on FXT flavors has its possibilities of failure as well, which is why determining that it is fixed target data from the DAQ stream data is really what we want.

So I'll spend another day or two looking into that possibility. Please don't merge this PR before I report back on it.

-- You are receiving this because your review was requested. Reply to this email directly, view it on GitHub [2], or unsubscribe [3]. Triage notifications on the go with GitHub Mobile for iOS [4] or Android [5].

Links:

[1] https://github.com/plexoos [2] https://github.com/star-bnl/star-sw/pull/161#issuecomment-936885386 [3] https://github.com/notifications/unsubscribe-auth/ANL4LVAI7I2QP2TEZQ2ZLB3UFSKKJANCNFSM5FKCQT4Q [4] https://urldefense.com/v3/__https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675__;!!P4SdNyxKAPE!T0L_Q5NqvysJGB_d6dgx-XVxoqeGvAci_52pMoWRENRCRSbyu4vznYDxYfnfPdM$ [5] https://urldefense.com/v3/__https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign*3Dnotification-email*26utm_medium*3Demail*26utm_source*3Dgithub__;JSUlJSU!!P4SdNyxKAPE!T0L_Q5NqvysJGB_d6dgx-XVxoqeGvAci_52pMoWRENRCRSbyu4vznYDxM4dVMVA$

genevb commented 2 years ago

Hi, Jason

In fixed target data sets, I assume the tracker must be aware of the z ~ 200 cm target position for track finding. Likewise, in collider data sets, I assume the tracker uses z ~ 0 for track finding. (The Sti kalman track finder dives towards the interaction point... I assume CA has cuts when linking neighboring hits that assume the IP as well.)

I am unaware of any such configuration of the track-finding for expected trajectories. Sounds like a reasonable suggestion for something to switch to with FXT processing (could be enabled by the proposed chain option from this PR), but I doubt it is implemented.

-Gene

starsdong commented 2 years ago

Dmitri,

could you help take a review and comment/approve this PR? Thanks

/xin

plexoos commented 2 years ago

Gene wrote:

Please don't merge this PR before I report back on it.

Was there a follow up?

genevb commented 2 years ago

Thanks for the ping....I haven't yet looked harder, but I will try to give it some time soon. -Gene

On Oct 25, 2021, at 12:55 PM, Dmitri Smirnov @.**@.>> wrote:

Gene wrote:

Please don't merge this PR before I report back on it.

Was there a follow up?

genevb commented 2 years ago

For alternatives to a chain option handles in StBFChain.cxx, I think there are two things to resolve:

1) How else to determine when the data is FXT? 1a) From a database query? 1b) From the DAQ file?

2) When else can the global FXT database flavor be set?


Focusing on the more straightforward question 2 first, we want to do it as early in reconstruction processes as possible, but it would need to be after starting the first event because run-specific information cannot be determined until that first event is read. So this cannot be in the constructor nor Init() of any maker, but could well be in the InitRun() of a maker early in the chain. From the suite of nightly tests, I see that reconstruction chains always begin with these 3 or 4 makers:

simulations: QA :INFO - StBFChain::bfc QA :INFO - St_geant_Maker::geant QA :INFO - St_db_Maker::db QA :INFO - StMagFMaker::MagField QA :INFO - StDetectorDbMaker::detDb

real data: QA :INFO - StBFChain::bfc QA :INFO - StIOMaker::inputStream QA :INFO - St_db_Maker::db QA :INFO - StMagFMaker::MagField QA :INFO - StDetectorDbMaker::detDb

St_geant_Maker and StIOMaker shouldn't be executing global database settings; St_db_Maker should be independent of such information; and StMagFMaker is too specific to the magnetic field. StDetectorDbMaker currently has no InitRun() function, but it does seem like an appropriate place to involve global database settings. So if we can resolve question 1, this would be where I would introduce the relevant code.


Yuri has advocated a method he implemented for question 1a as St_beamInfoC::IsFixedTarget() . I executed this query in the offline database for the RunLog_onl/beamInfo table and found 23512 matching runs scattered across all years of STAR data-taking. Some of these are likely to be harmless, such as this pedAsPhys run: https://online.star.bnl.gov/RunLogRun21/index.php?r=22189002

...but some of these are clearly runs for which we don't want to mistake them as FXT, such as this dAu run: https://online.star.bnl.gov/RunLogRun21/index.php?r=22188007

In order to claim this method is robust, one needs to sift through this mountain of runs and clearly understand for which ones it is harmless to call them FXT, plus resolve the mistaken false positive, plus ensure that there were no false negatives (FXT runs that weren't found to be FXT by this method), and establish confidence that it would hold true for any future data-taking. I do not have high confidence in this particular approach.

I did make on attempt to invent a possibly more robust DB query in the offline RunLog_onl database: select runNumber from L0TriggerInfo where name like "%ixed%" group by runNumber order by runNumber desc;

...but I found myself digging through numerous cases of false positives and false negatives again. There were even production FXT runs with no such-named triggers, such as: https://online.star.bnl.gov/RunLogRun18/index.php?r=19148046

In so doing, I came upon the production_15GeV_2014 runs, where I learned that we acquired FXT triggers not only in the same runs that we acquired beam-beam (non-FXT) triggers, but both were mixed together in the same DAQ stream files! Was this the only time we did this? And how can I be sure? Run-by-run information won't help in this case, and processing such files with different database queries for the non-FXT and FXT triggers within really does demand separate passes over the files for FXT and non-FXT events (there is event-filtering-by-triggers machinery during production that would serve this explicit need).

This last bit of information solidifies for me that we just don't have a robust mechanism available to objectively determine whether we are processing files specifically for FXT or non-FXT across all years' data other than through the BFC chain. The chain would, for example, also be the indication of whether we are filtering on FXT or non-FXT triggers in processing production_15GeV_2014 runs.

So after all that, I am back to supporting this PR, with the "FXT" chain option, as the best method I know of for handling FXT-specific processing.

-Gene

starsdong commented 2 years ago

Hi Gene,

Thank you for looking into this in great details. If indeed there is no robust way to extract the FXT run information from either DB or other source, I agree that the chain option seems to be a practical way to proceed. I share the concern about the complications in the chain options. But I see so far there are only limited scenarios here. FXT is particularly special as they are usually short and scattered during the data taken.

Dan Cebra has been following closely through the run on the change of beam energies. He put together nice summary reports on the run periods (first to last runs) for Run19-21 in his QA/triggerboard reports. I include them in the following link which may be useful https://drupal.star.bnl.gov/STAR/blog/dongx/run19-21-summary-dan-cebra

/xin

starsdong commented 2 years ago

Is there any further concern about this PR? I think we should merge this to main asap.

genevb commented 2 years ago

As I explained in my previous post in this PR, other methods are a game of chasing false positives and false negatives (#180 ).

starsdong commented 2 years ago

Dmitri, can you review/approve this PR and we can move forward? We certainly will need to check the nightly test and QA the data later.

nigmatkulov commented 2 years ago

Do we have a proposal of what exactly should be modified in which way? If not, IMO we should move on and then come back to it later if a solution which will be reached in some agreement will appear.

genevb commented 2 years ago

Hi, Grigory

Only Yuri is choosing not to follow the procedure set in place by this PR. For everyone except Yuri there is nothing else necessary to do - it is "solved", and we can move forward. The single downside with what we are doing is that people need to include "FXT" in the chain when processing FXT datasets, but this is not the only special chain option for FXT processing (e.g. "picoVtxFXT").

I have demonstrated how Yuri's choice is flawed, and, right on time, Yuri proceeded to demonstrate how its flaws lead to eternal maintenance by submitting another PR to deal with some of them (he seems unaware or uninterested that there are many more things to fix).

Hope that helps explain the situation, -Gene

On Nov 2, 2021, at 2:47 PM, nigmatkulov @.**@.>> wrote:

Do we have a proposal of what exactly should be modified in which way? If not, IMO we should move on and then come back to it later if a solution which will be reached in some agreement will appear.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/star-bnl/star-sw/pull/161#issuecomment-958043522, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AUK2OBIKSMBH53ZJBNNXS5TUKAW3JANCNFSM5FKCQT4Q. Triage notifications on the go with GitHub Mobile for iOShttps://urldefense.com/v3/__https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675__;!!P4SdNyxKAPE!Sp1EtQPT5eoKsaWe2JUNzDq7J7kRJXGqSh2v1P1WjREPjPYk3iZ_liB6GWJR$ or Androidhttps://urldefense.com/v3/__https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign*3Dnotification-email*26utm_medium*3Demail*26utm_source*3Dgithub__;JSUlJSU!!P4SdNyxKAPE!Sp1EtQPT5eoKsaWe2JUNzDq7J7kRJXGqSh2v1P1WjREPjPYk3iZ_lnCvRts_$.