eic / epic-analysis

General (SI)DIS analysis framework for the EIC
GNU Lesser General Public License v3.0

HPC rework #289

Closed · Gregtom3 closed this 7 months ago

Gregtom3 commented 9 months ago

Briefly, what does this PR introduce?

Parallel computing capabilities using JLab's ifarm cluster. Allows multiple ROOT files to be analyzed per job while maintaining proper Q2 weighting across the entire pipeline.

A new script, hpc/run-local-slurm-pipeline.rb, creates the full pipeline, which is run with a 'single button press'. Running the script generates a new "run" script in hpc/project_scripts that does the following for each campaign and beam energy:

  1. Creates a default S3 .config file for the chosen campaign, detector configuration, and beam energy
  2. Splits the .config file into multiple batches, with a user-specified number of ROOT files per batch (see the first sketch after this list)
  3. Calculates the number of events for each Q2 range studied, either by reading the TTrees directly or by looking the values up in the CSVs in hpc/nevents_database; new files have their event counts automatically added to the database for faster access in the future
  4. Manually calculates the Q2 weights for each range and inserts them into the batched .config files (see the second sketch after this list)
  5. Runs each .config file through the analysis macro specified by the user
  6. Merges the output ROOT files
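
As a rough illustration of steps 2 and 3, here is a minimal Ruby sketch of how the batching and event-count caching could work. The helper names, the CSV layout, and the TTree name `events` are assumptions for illustration, not the actual code in hpc/run-local-slurm-pipeline.rb:

```ruby
require 'csv'
require 'open3'

# Hypothetical CSV cache: one "path,nevents" row per ROOT file.
NEVENTS_DB = 'hpc/nevents_database/nevents.csv'

# Step 2 (sketch): split the ROOT files listed in a .config file into
# batches of `files_per_batch` files each.
def batch_files(root_files, files_per_batch)
  root_files.each_slice(files_per_batch).to_a
end

# Step 3 (sketch): return the event count for a ROOT file, preferring
# the CSV database; otherwise count TTree entries with ROOT in batch
# mode and append the result so future runs can skip the slow path.
def nevents(root_file, tree_name = 'events')
  if File.exist?(NEVENTS_DB)
    CSV.foreach(NEVENTS_DB) { |path, n| return n.to_i if path == root_file }
  end
  expr = %Q{std::cout << ((TTree*)TFile::Open("#{root_file}")->Get("#{tree_name}"))->GetEntries() << std::endl;}
  out, _status = Open3.capture2('root', '-l', '-b', '-q', '-e', expr)
  n = out.lines.last.to_i
  CSV.open(NEVENTS_DB, 'a') { |csv| csv << [root_file, n] }  # cache for next time
  n
end
```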

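For step 4, the essential point is that the weights must be computed from event totals summed over all batches, not per batch, or splitting the files would distort the relative normalization. A minimal sketch, assuming a simple cross-section-over-events weighting scheme (the PR's actual formula may differ):

```ruby
# Hypothetical stand-in for step 4: given each Q2 range's generator
# cross section (same units for all ranges) and its event count summed
# over ALL batches, weight every event in a range by sigma / nevents so
# ranges with boosted statistics are scaled to a common luminosity.
def q2_weights(ranges) # ranges: [{ q2min:, sigma:, nevents: }, ...]
  ranges.to_h { |r| [r[:q2min], r[:sigma] / r[:nevents].to_f] }
end

weights = q2_weights([
  { q2min: 1.0,   sigma: 4.5e5, nevents: 2_000_000 }, # made-up numbers
  { q2min: 100.0, sigma: 2.1e2, nevents: 1_500_000 },
])
# these weights would then be inserted into every batched .config file
```
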
What kind of change does this PR introduce?


Does this PR introduce breaking changes? What changes might users need to make to their code?

Does this PR change default behavior?

Yes. See the Q2 weighting changes described above.

cpecar commented 8 months ago

Testing it on ifarm now, but it might be nice to have an option similar to the energy settings to select only whatever Q2 ranges you're interested in (not sure if this would make the Q2 weighting a pain though)
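
A hypothetical sketch of what such an option could look like, mirroring the existing energy selection (the flag name and parsing are assumptions, not part of this PR):

```ruby
require 'optparse'

# Hypothetical --q2mins flag to restrict the pipeline to selected Q2
# ranges; the Q2 weights would then have to be recomputed over only the
# ranges that are actually kept.
options = { q2mins: nil } # nil => keep all ranges
OptionParser.new do |o|
  o.on('--q2mins 1,10,100', Array, 'comma-separated Q2min values to run') do |v|
    options[:q2mins] = v.map(&:to_f)
  end
end.parse!
```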

cpecar commented 8 months ago
[Screenshot, 2023-12-21 10:58 AM]

May want to increase the memory request for the parallel analysis jobs; they all failed for me due to running out of memory.
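
For reference, a hypothetical illustration of raising the request when the generated job script is written; `--mem` is a standard Slurm sbatch option, but the directive and default used by this PR may differ:

```ruby
# Hypothetical: scale the Slurm memory request with the batch size when
# writing the generated job script.
mem_gb_per_file = 2 # assumed baseline; tune until the jobs stop failing
files_per_batch = 3
File.open('job.slurm', 'w') do |f|
  f.puts '#!/bin/bash'
  f.puts "#SBATCH --mem=#{mem_gb_per_file * files_per_batch}G"
  # ... remaining #SBATCH directives and the analysis command ...
end
```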

cpecar commented 8 months ago

Tested this again while reading all your comments in the README more carefully, and it seems the default memory request should be okay for up to 3 files per job.