JeffersonLab / halld_recon

Reconstruction for the GlueX Detector

Issues with CPP geometry #669

Closed sdobbs closed 2 years ago

sdobbs commented 2 years ago

For some reason when I switch from the normal GlueX geometry to

JANA_GEOMETRY_URL = ccdb:///GEOMETRY/cpp_HDDS.xml

hd_root crashes about 9 times out of 10. This only happens when running multithreaded, and I've only tested it on the gluons. Maybe there's something wrong with my software environment, but I thought I would report this in case I'm not the only one.

An example command line that causes this:

hd_root --nthreads=20 -PTHREAD_TIMEOUT_FIRST_EVENT=300 -PTRKFIT:HYPOTHESES_POSITIVE=8 -PTRKFIT:HYPOTHESES_NEGATIVE=9 -PPLUGINS=HLDetectorTiming,monitoring_hists /gluonraid2/rawdata/volatile/RunPeriod-2022-05/rawdata/Run100411/hd_rawdata_100411_001.evio -PHLDETECTORTIMING:NO_START_COUNTER=1

nsjarvis commented 2 years ago

My jobs are completing. I'm using version set 5.1.0 (I hope that is not a mistake), CCDB from yesterday morning and the following:

JANA_GEOMETRY_URL=ccdb:///GEOMETRY/main_HDDS.xml

hd_root hd_rawdata_100373_001.evio -PPLUGINS=CDC_amp,CDC_TimeToDistance,CDC_Efficiency -PNTHREADS=32 -PHLDETECTORTIMING:NO_START_COUNTER=1 -o rootfile.root

staylorjlab commented 2 years ago

That geometry URL has the target in the wrong place. Instead of main_HDDS.xml, use cpp_HDDS.xml.

nsjarvis commented 2 years ago

Ah. Actually that's what I intended to use but I guess it got reset before the job started. Thanks!

nsjarvis commented 2 years ago

With cpp_HDDS.xml I get this:

JANA >> HLDETECTORTIMING:NO_START_COUNTER = 1 <-- NO DEFAULT! (TYPO?)

and many JANA errors about the start counter.

JANA ERROR>>Start counter paddle resolutions table has wrong number of entries:
JANA ERROR>>  loaded = 30  expected = 0
JANA ERROR>>Cannot load Start Counter geometry information!

and then about the target

src/JANA/JGeometryXML.cc:347 Node or attribute not found for xpath "//composition[@name='targetVessel']/posXYZ[@volume='targetTube']/@X_Y_Z".

but the file is processing ok.

sdobbs commented 2 years ago

Yeah, there are a lot of non-critical error messages in parts of the code that load the start counter - but the code works fine. We should change this so that those sections actually check to see if the SC is in the geometry or not...
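
A rough sketch of what such a check could look like, in C++. This is not the halld_recon or JANA code: ToyGeometry, kStartCounterPath, LoadStatus and LoadStartCounter are all hypothetical names made up for illustration, and the xpath string is a placeholder. The idea is only that the loader asks whether the start counter is in the geometry first, and prints the table-size error only when the detector is actually expected to be there.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a geometry lookup. NOT the JANA/HDDS interface;
// every name in this sketch is hypothetical.
struct ToyGeometry {
    std::set<std::string> present;  // xpaths that exist in this geometry

    bool Has(const std::string& xpath) const {
        return present.count(xpath) > 0;
    }
};

// Placeholder xpath marking "this geometry contains a start counter".
const std::string kStartCounterPath = "//hypothetical/startCounter";

enum class LoadStatus { Loaded, NotInGeometry, BadTable };

// Check for the detector first; validate the table only if it is present.
LoadStatus LoadStartCounter(const ToyGeometry& geom,
                            const std::vector<double>& paddle_resolutions,
                            std::size_t expected_paddles) {
    if (!geom.Has(kStartCounterPath)) {
        // Detector absent (as in a geometry without the SC): skip quietly.
        return LoadStatus::NotInGeometry;
    }
    if (paddle_resolutions.size() != expected_paddles) {
        // A real mismatch: the detector exists but the table does not fit.
        std::cerr << "Start counter paddle resolutions table has wrong "
                     "number of entries: loaded = "
                  << paddle_resolutions.size()
                  << "  expected = " << expected_paddles << '\n';
        return LoadStatus::BadTable;
    }
    return LoadStatus::Loaded;
}

// Usage sketch with made-up resolution values.
void Example() {
    const std::vector<double> table(30, 0.3);

    ToyGeometry without_sc;  // no start counter in the geometry
    assert(LoadStartCounter(without_sc, table, 0) == LoadStatus::NotInGeometry);

    ToyGeometry with_sc;
    with_sc.present.insert(kStartCounterPath);
    assert(LoadStartCounter(with_sc, table, 30) == LoadStatus::Loaded);
}
```

With a guard like this, the "loaded = 30  expected = 0" case would return NotInGeometry without printing anything, while a real size mismatch in a geometry that does contain the detector would still be reported.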

sdobbs commented 2 years ago

These crashes seem to be just related to HLDetectorTiming - good thing I'm planning on rewriting it anyway...

sdobbs commented 2 years ago

I will note that these crashes have mysteriously gone away, so I'm closing the issue.