HEP-FCC / heppy

[deprecated] A python analysis framework for high energy physics

add script to diagnose the merged output heppy file validity #84

Closed davidjamin closed 6 years ago

fcc-bot commented 6 years ago

Can one of the admins verify this patch?

clementhelsens commented 6 years ago

Hi @cbernet,

David added this script because we found a large fraction of corrupted files after the heppy_hadd step. It flags corrupted merged files so that we can redo the merging for them. Have you ever experienced such problems? @davidjamin, could you please post the error that we observe when a file is corrupted?

Cheers, Clement

davidjamin commented 6 years ago

Hi @cbernet, as @clementhelsens said, here are a few lines of the error that occurs on a problematic event:

    ...
    R__unzip: error -5 in inflate (zlib)
    Error in <TBasket::ReadBasketBuffers>: fNbytes = 30287, fKeylen = 94, fObjlen = 31904, noutot = 0, nout=0, nin=30193, nbuf=31904
    file probably overwritten: stopping reporting error messages
    Error in <TBranch::GetBasket>: File: FCChhAnalyses/output/BDT_v02_mvaQCD/Chunk_backup/Zprime_tt/pp_Zprime_20TeV_ttbar.bad/heppy.FCChhAnalyses.Zprime_tt.TreeProducer.TreeProducer_1/tree.root at byte:90294230, branch:Jet1_trk02_Corr_MetCorr_e, entry:111658, badread=1, nerrors=10, basketnumber=13
    R__unzip: error -5 in inflate (zlib)
    Error in <TBasket::ReadBasketBuffers>: fNbytes = 30287, fKeylen = 94, fObjlen = 31904, noutot = 0, nout=0, nin=30193, nbuf=31904
    ...

Such errors can crash our code. The file that produced the above error is here:

/afs/cern.ch/user/d/djamin/fcc_work/heppy/FCChhAnalyses/output/BDT_v02_mvaQCD/Chunk_backup/Zprime_tt/pp_Zprime_20TeV_ttbar.bad/heppy.FCChhAnalyses.Zprime_tt.TreeProducer.TreeProducer_1/tree.root

Cheers, David
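
The error excerpt above already names the file, branch and entry that failed to read. A minimal sketch, assuming the log has been captured as text, of pulling those fields out of the `TBranch::GetBasket` line. The function name `find_bad_reads` is hypothetical and this is not the script proposed in this pull request; the field layout is taken only from the excerpt quoted above.

```python
import re

# Matches the "File: ... at byte:..., branch:..., entry:..." part of the
# TBranch::GetBasket error line quoted in this thread.
BASKET_ERROR = re.compile(
    r"File: (?P<file>\S+) at byte:(?P<byte>\d+), "
    r"branch:(?P<branch>\w+), entry:(?P<entry>\d+)"
)


def find_bad_reads(log_text):
    """Return one dict per basket read error found in log_text."""
    hits = []
    for match in BASKET_ERROR.finditer(log_text):
        hits.append({
            "file": match.group("file"),
            "byte": int(match.group("byte")),
            "branch": match.group("branch"),
            "entry": int(match.group("entry")),
        })
    return hits
```

An empty result only means that no such line was found in the text; it does not prove that the file is valid.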

cbernet commented 6 years ago

Hi David,

heppy_hadd is just using ROOT’s hadd. Can you try hadd directly on your root files?

Cheers,

Colin
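
For reference, a minimal sketch of calling ROOT's `hadd` directly from Python 3, assuming `hadd` is on the `PATH`. The helper names `build_hadd_command` and `run_hadd` are hypothetical and are not part of heppy; the only `hadd` option used is `-f`, which overwrites an existing target file.

```python
import subprocess


def build_hadd_command(target, sources, force=False):
    """Build the argument list for: hadd [-f] target source1 source2 ..."""
    if not sources:
        raise ValueError("hadd needs at least one input file")
    command = ["hadd"]
    if force:
        command.append("-f")  # overwrite the target file if it exists
    command.append(target)
    command.extend(sources)
    return command


def run_hadd(target, sources, force=False):
    """Run hadd and return (return code, combined stdout and stderr)."""
    result = subprocess.run(
        build_hadd_command(target, sources, force),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.returncode, result.stdout
```

Keeping the captured output makes it possible to inspect any messages printed during the merge afterwards.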

davidjamin commented 6 years ago

Hi Colin,

Yes, I did, and the output is OK, but I am not sure that such a test can reproduce or fix the issue.

When we use our script to merge the files, some of the merged files occasionally contain corrupted events; it does not happen every time. Our code then crashes later on, when we run on these files. The goal of my script is to find the corrupted merged files. I then re-run our merging script to produce these specific files again, and the new files are OK after a second try with the exact same merging script (a third run has never been needed).

I am not sure that using hadd directly prevents the issue: I ran the hadd command once and the output was fine, but that did not teach me anything.

On the other hand, I am not sure how to find out what is going wrong in our merging script. Cheers, David.
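
The merge, check and re-merge workflow described above can be sketched as a small retry loop. This is only an illustration: `merge` and `is_valid` are hypothetical callables standing in for the merging script and the diagnosis script, neither of which is shown in this thread, and the default of three attempts is an assumption.

```python
def merge_until_valid(merge, is_valid, max_attempts=3):
    """Call merge() until is_valid() accepts the merged output.

    Returns the number of attempts that were needed.
    Raises RuntimeError if the output is still invalid after max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        merge()          # produce (or re-produce) the merged file
        if is_valid():   # e.g. read every entry and look for read errors
            return attempt
    raise RuntimeError(
        "merged file still corrupted after %d attempts" % max_attempts
    )
```

Bounding the number of attempts keeps a persistently failing merge from looping forever and makes it visible as an error instead.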

cbernet commented 6 years ago

Hi David,

Can you simply keep the script somewhere else, such as the hh analysis package? This is the first time I have heard of this problem, and it is definitely related to hadd (and/or your input files).

Cheers,

Colin

davidjamin commented 6 years ago

OK, I will move it to the hh analysis package.