cms-gem-daq-project / gem-light-dqm

GEM light DQM code

Long Processing Time For Events #15

Closed bdorney closed 6 years ago

bdorney commented 7 years ago

I've noticed that runs take a long time to process, which somewhat impedes rapid turnaround from data-taking to results. For example, run000029 finished yesterday at 21h41:

-rw-r-----. 1 GE11COSMICS gemuser 7.2M Nov 27 21:41 run000029_LatencyScan_TIF_2016-11-27_chunk_691.dat
-rw-r-----. 1 GE11COSMICS gemuser 3.6M Nov 27 21:41 run000029_LatencyScan_TIF_2016-11-27_chunk_692.dat
-rw-r-----. 1 GE11COSMICS gemuser 3.6M Nov 27 21:41 run000029_LatencyScan_TIF_2016-11-27_chunk_693.dat

However, this morning it is still processing, ~11 hours after the fact (current time is 8h21):

[GE11COSMICS@cosmicstandtif ~]$ ls -lhtr $GEM_DATA_DIR/run000029* | grep -v "chunk"
-rw-r--r--. 1 GE11COSMICS GE11COSMICS 11M Nov 28 08:21 /data/bigdisk/GEM-Data-Taking/GE11_QC8//run000029_LatencyScan_TIF_2016-11-27.analyzed.root
-rw-r--r--. 1 GE11COSMICS GE11COSMICS 98M Nov 28 08:21 /data/bigdisk/GEM-Data-Taking/GE11_QC8//run000029_LatencyScan_TIF_2016-11-27.raw.root
-rw-r--r--. 1 GE11COSMICS GE11COSMICS 790M Nov 28 08:21 /data/bigdisk/GEM-Data-Taking/GE11_QC8//run000029_LatencyScan_TIF_2016-11-27.dat

This is additionally troublesome if runs are taken together. Right now run000028 has finished but hasn't started processing yet:

[GE11COSMICS@cosmicstandtif ~]$ ls -lhtr $GEM_DATA_DIR/run000028*
...
...
...
-rw-r-----. 1 GE11COSMICS gemuser 495K Nov 27 21:07 /data/bigdisk/GEM-Data-Taking/GE11_QC8//run000028_LatencyScan_TIF_2016-11-27_chunk_555.dat

[GE11COSMICS@cosmicstandtif ~]$ ls -lhtr $GEM_DATA_DIR/run000028* | grep -v "chunk"
[GE11COSMICS@cosmicstandtif ~]$

I discussed this briefly with Jared: maybe the many file I/O operations are slowing things down? Could we see an improvement by either increasing the number of events in a "chunk" file or having the daemon merge X chunk files together and then start processing from this new "merged" set of files? Alternatively, for LS2, should we invest in a multicore machine dedicated to this purpose (say n_core >= 16)?
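To make the "merge X chunk files" idea concrete, here is a minimal sketch of how chunk files could be grouped into batches of X before processing. It is not the daemon's actual code: `batch_chunks` and `batch_size` are hypothetical names, and the only thing taken from the listings above is the `_chunk_<N>.dat` naming pattern. Sorting is done on the numeric chunk index, since a plain lexical sort would put chunk_10 before chunk_2.

```python
import re

# Chunk index as it appears in names like
# run000029_LatencyScan_TIF_2016-11-27_chunk_691.dat
CHUNK_RE = re.compile(r"_chunk_(\d+)\.dat$")


def batch_chunks(filenames, batch_size):
    """Group chunk file names into lists of at most batch_size, in chunk order.

    Hypothetical helper: names that do not look like chunk files are ignored.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    chunk_names = [name for name in filenames if CHUNK_RE.search(name)]
    # Sort by the integer chunk index, not alphabetically.
    ordered = sorted(chunk_names, key=lambda n: int(CHUNK_RE.search(n).group(1)))
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
```

Each returned batch could then be merged into one file and handed to the processing step, so the number of files opened per run drops by roughly a factor of X. Whether that actually removes the bottleneck would still need to be measured.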

In the meantime I will process the desired runs by hand (cat'ing together the "chunk" files).
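For anyone wanting to script the by-hand merge, a rough Python equivalent of cat'ing the chunks together might look like the following. It assumes, as the manual procedure implies, that chunk files can simply be concatenated byte for byte. The names `merge_chunks`, `chunk_index`, `run_prefix` and `output_path` are illustrative, not part of gem-light-dqm.

```python
import re
import shutil
from pathlib import Path

# Chunk index as it appears in names like
# run000029_LatencyScan_TIF_2016-11-27_chunk_691.dat
CHUNK_RE = re.compile(r"_chunk_(\d+)\.dat$")


def chunk_index(path):
    """Return the integer chunk number encoded in a chunk file name."""
    match = CHUNK_RE.search(Path(path).name)
    if match is None:
        raise ValueError(f"not a chunk file: {path}")
    return int(match.group(1))


def merge_chunks(data_dir, run_prefix, output_path):
    """Concatenate all chunk files of one run, in chunk order, into output_path.

    Returns the list of chunk paths that were merged.
    """
    candidates = Path(data_dir).glob(f"{run_prefix}*_chunk_*.dat")
    # Keep only well-formed chunk names and order them numerically,
    # so that chunk_10 comes after chunk_2.
    chunks = sorted(
        (p for p in candidates if CHUNK_RE.search(p.name)),
        key=chunk_index,
    )
    with open(output_path, "wb") as merged:
        for chunk in chunks:
            with open(chunk, "rb") as src:
                shutil.copyfileobj(src, merged)
    return chunks
```

A call such as `merge_chunks(data_dir, "run000028", merged_path)` would then produce a single file to pass to the processing step, with `data_dir` and `merged_path` chosen by the user.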

mexanick commented 6 years ago

Close as outdated