eic / eic-spack

Spack packages for the Electron Ion Collider

PYTHIA6 (pythiaeRHIC) Support in ePIC Containers #598

Open bspage912 opened 10 months ago

bspage912 commented 10 months ago

Is your feature request related to a problem? Please describe.

Currently, generating PYTHIA6 events with the pythiaeRHIC MCEG (used, for example, for minBias samples for low-Q2 and background studies) requires code and an environment maintained at BNL. For reproducibility's sake, it would be good to be able to run this more readily from the ePIC environment.

Describe the solution you'd like

One should be able to run pythiaeRHIC, and to read and manipulate the resulting text files / ROOT trees (for example, convert them to HepMC), using the ePIC environment, independent of having access to the BNL EIC computing environment.

Describe alternatives you've considered

Maybe pythiaeRHIC could be maintained as a separate container instead of within the ePIC containers?

Additional context

This must be an issue for other MCEGs used by ePIC as well. It is certainly not practical to include all MCEGs in the ePIC container, but how do we both keep track of and vet what is being generated as official ePIC simulation, and ensure the generators are available to all?

bspage912 commented 10 months ago

@kkauder @wdconinc

wdconinc commented 10 months ago

Apart from the technical challenges of supporting 20-year-old software inside containers, you don't get reproducibility simply by sticking the local directory of pythia6eRHIC in a container, just like uploading that directory to gitlab didn't magically solve a decade of poor practices (even despite active efforts by the EICUG SWG to make sure event generators are properly managed).

Both of those approaches will give you merely a snapshot that's invalid or suspect from the moment anything in that local directory is changed again, which means that you might as well just take a copy of that directory when you need it or use what's there on gpfs: it's the only way to run simulations that will be deemed acceptable. This will only change if the proponents of this pythia6eRHIC event generator demonstrate a greater interest in wanting this to be reproducible outside of gpfs, and they actively support or even participate in changing the actual reference location to elsewhere.

Until that happens as a first step, I don't think it makes sense to install 700MB of cernlib and pythia6 static libs inside the containers that are not validated and not likely to be used.

Other event generators are better managed from the start, with reference source repositories, clear licensing, versions, bug reports, and developer communities online. That applies to pythia8, Sherpa, Herwig, starlight, estarlight, lager, Sartre, dpmjet, synrad, etc.

kkauder commented 10 months ago

Just as a partial counterpoint, Sylvester's nanocernlib https://github.com/sly2j/nanocernlib.git is completely sufficient.

bspage912 commented 10 months ago

Well, I can't comment on the feasibility of maintaining pythiaeRHIC within a container or just having it in a repository and compiling it within the environment. If it isn't feasible, so be it.

While not a guarantee of reproducibility, having this MC in a container or repository would make things more transparent and available to the rest of the collaboration. Also, this MC is not in active development, so I can't imagine it would be too hard to keep the repository and gpfs versions in sync (even if it is not ideal).

I (or whoever) can generate the new set of minbias / low-Q2 samples locally on gpfs, no problem; I'm just trying to think of ways to make this more standardized. Even if we had somewhere to store say the steering files used, that would help.
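One low-cost way to record which steering files went into a sample can be sketched in Python. It assumes the steering files sit in one directory and writes a manifest of relative file names and SHA-256 checksums that can be stored next to the generated events. The function names `sha256_of` and `write_manifest` and the JSON layout are hypothetical illustrations, not part of pythiaeRHIC or any ePIC tooling.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 64 KiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(steering_dir: Path, manifest_path: Path) -> dict:
    """Write {relative path: checksum} for every file under steering_dir.

    The JSON layout is an assumption made for this sketch.
    """
    entries = {
        p.relative_to(steering_dir).as_posix(): sha256_of(p)
        for p in sorted(steering_dir.rglob("*"))
        if p.is_file()
    }
    manifest_path.write_text(json.dumps(entries, indent=2, sort_keys=True))
    return entries
```

Keeping the manifest outside the steering directory avoids it being picked up on a later run. A manifest like this only records what was used; it does not say whether the steering files themselves are correct.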

wdconinc commented 10 months ago

> Even if we had somewhere to store say the steering files used, that would help.

That's really where this should start. If we give the entire collaboration just a Monte Carlo generator then we're going to get a lot of wrong samples. My feeling about this is that there are two ways to make this work:

Right now we're in the former, and just installing pythia6 doesn't get us to the latter, but puts us in the gap between the two where it is easy to think you have generated a valid sample, but you haven't.

kkauder commented 10 months ago

Just a comment because you make it sound like the code is somewhere hidden. It is not, it lives at https://gitlab.com/eic/mceg/PYTHIA-RAD-CORR

wdconinc commented 10 months ago

> you make it sound like the code is somewhere hidden

You'll note that I linked that in the first sentence in my first reply. The problem is, as I wrote from the start: just uploading something to gitlab is not a solution. Do we want everyone to use the 2-year-old STEER-FILES-Official files there that are using the CTEQ61 pdfs that give the wrong cross-section results (ignoring for a moment that lhapdf doesn't even distribute CTEQ5L anymore because it's been superseded for 15 years)? Who is maintaining this repository and vouching for its correctness? Importantly, are either Elke or Ralf using a local clone of this repository, or is this gitlab repository just a fork of one of their directories at some point in time, and has it potentially diverged since then? These are the questions that need to be addressed before we should give this to users to shoot themselves in the foot with.
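Whether a repository checkout has diverged from a working directory can at least be checked mechanically. The following is a minimal Python sketch, assuming both trees are available on the same filesystem; the function names `list_files` and `compare_trees` and the result keys are hypothetical. It compares file contents byte by byte rather than trusting timestamps, and ignores anything under `.git`.

```python
import filecmp
from pathlib import Path


def list_files(root: Path) -> set:
    """Relative POSIX paths of all regular files under root, skipping .git."""
    return {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and ".git" not in p.relative_to(root).parts
    }


def compare_trees(reference: Path, clone: Path) -> dict:
    """Report files missing on either side and files whose contents differ."""
    ref_files = list_files(reference)
    clone_files = list_files(clone)
    differing = sorted(
        name
        for name in ref_files & clone_files
        # shallow=False forces a content comparison instead of a stat check
        if not filecmp.cmp(reference / name, clone / name, shallow=False)
    )
    return {
        "only_in_reference": sorted(ref_files - clone_files),
        "only_in_clone": sorted(clone_files - ref_files),
        "differing": differing,
    }
```

An empty result in all three lists would show only that the two trees match at the moment of the check. It does not answer who maintains the reference or whether its contents are correct.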

wdconinc commented 10 months ago

Transferred to eic-spack.

wdconinc commented 10 months ago

The next step here is to write a package similar to https://github.com/eic/eic-spack/blob/develop/packages/pythia6m/package.py so we can install it in the container as soon as there are proper version tags in the gitlab repo.