art-framework-suite / art-root-io


Output file produced by graceful shutdown is not readable #13

Open kutschke opened 2 years ago

kutschke commented 2 years ago

Describe the bug

Mu2e has a job that throws a cet::exception, and we have a fix for the underlying problem. While debugging it, we discovered that the output file is not readable. Our understanding is that the output file should be readable and should contain the events up to, but not including, the event that failed. The job log is at:

/mu2e/app/users/kutschke/Bugs/Dave_crash/out/ceSimReco.log

and shows the output expected from a graceful shutdown following the throw of a cet::exception. The output file is at:

/mu2e/app/users/kutschke/Bugs/Dave_crash/out/RootOutput-821d-e66d-eacd-2b9e.root

To Reproduce

On a machine that mounts

source /cvmfs/mu2e.opensciencegrid.org/setupmu2e-art.sh
git clone https://github.com/Mu2e/Offline
git clone https://github.com/Mu2e/Production
cd Offline
git checkout -b test  555416e
cd ../Production
git checkout -b test d7836d67
cd ..
muse setup
muse build -j 24  # or the right number of threads for your machine
mkdir out         # Or symlink to your data disk
cp /mu2e/app/users/kutschke/Bugs/Dave_crash/ceSimReco.fcl .
mu2e -c ceSimReco.fcl >& out/ceSimReco.log   # ~13 minutes real time on mu2ebuild01 to reach crash
eventCount out/RootOutput*.root              # This works
mu2e -c Offline/Print/fcl/events.fcl  -s out/RootOutput*.root   # This segfaults
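
When rerunning the last command, it can help to confirm from the exit status that the process really died on a signal rather than exiting with an ordinary error code. The following is a minimal sketch, assuming bash; `describe_status` is a hypothetical helper name and is not part of art or the Mu2e tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a shell exit status into a short description.
# Bash reports a process killed by signal N as status 128+N, so a
# segmentation fault (SIGSEGV, signal 11) shows up as 139.
# A program can also return a value above 128 on its own, so treat the
# "killed by signal" message as a hint, not proof.
describe_status() {
  local status=$1
  if [ "$status" -eq 0 ]; then
    echo "ok"
  elif [ "$status" -gt 128 ]; then
    echo "killed by signal $((status - 128))"
  else
    echo "exited with status $status"
  fi
}

# Usage, right after the command under test:
#   mu2e -c Offline/Print/fcl/events.fcl -s out/RootOutput*.root
#   describe_status $?
```

Running `describe_status $?` after both the `eventCount` and the `mu2e` commands gives a quick side-by-side record of which step succeeds and which one is terminated by a signal.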

If this is a high-priority issue

This is not a high priority for Mu2e.

knoepfel commented 1 year ago

@kutschke, can you provide us with the GitHub link to the events.fcl file?

kutschke commented 1 year ago

The file is:

https://github.com/Mu2e/Offline/blob/main/Print/fcl/events.fcl

kutschke commented 1 year ago

A possibly simpler example to reproduce the failure is:

process_name : HelloWorld

source : {
  module_type : RootInput
  maxEvents : 3
}

physics : {
  analyzers : {
    hello : {
      module_type : HelloWorld
    }
  }
  e1        : [hello]
  end_paths : [e1]
}

where the source for the module is: https://github.com/Mu2e/Offline/blob/main/HelloWorld/src/HelloWorld_module.cc
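
One way to run this configuration against the problematic file is sketched below. It assumes the configuration above has been saved as `helloWorld.fcl` (a hypothetical file name) and reuses the `-c` and `-s` options exactly as in the original reproduction steps; `hello_cmd` is a hypothetical helper that only prints the command.

```shell
#!/usr/bin/env bash
# Hypothetical file name for the configuration shown above.
fcl="helloWorld.fcl"

# Print the command that reads the given file(s) with that configuration.
hello_cmd() {
  printf 'mu2e -c %s -s %s\n' "$fcl" "$*"
}

# Show the command, then run it only where mu2e is actually set up.
hello_cmd out/RootOutput*.root
if command -v mu2e >/dev/null 2>&1; then
  mu2e -c "$fcl" -s out/RootOutput*.root
fi
```

Printing the command first keeps a record in the log of exactly which input file was used, which is useful when the run itself ends abruptly.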

knoepfel commented 1 year ago

Thanks, @kutschke. We'll investigate.