valassi opened this issue 1 month ago
(Then there is always the possibility that I am doing something really stupid...)
@valassi No, in the old version presented in the meetings only the mll = 50 cut is implied for CMS in the run_card - but since xqcut had been specified, there would be automatic cuts on ptj due to auto_ptj_mjj?
@choij1589 Do you have ickkw=1? (I am not sure of the support of xqcut if ickkw=0.) But yes, in that mode you do have sensible cuts.
Now, given the small statistical error, this is likely not related to the cuts (if you had a singularity, the error on the cross section should be bigger than that).
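For reference, the run_card entries under discussion would look roughly like this (values taken from this thread; the comment strings are indicative and vary between MG5aMC versions):

```
50.0  = mmll          ! min invariant mass of l+l- (same flavour) lepton pair
1     = ickkw         ! 0 = no matching, 1 = MLM matching
20.0  = xqcut         ! minimum kt jet measure between partons
True  = auto_ptj_mjj  ! automatic setting of ptj and mjj if xqcut > 0
```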
> @oliviermattelaer is this something you would expect because of problems covering the phase space with large vector sizes? Or does this sound like a bug?
I would not expect an issue at the cross-section level (more at the distribution level) when setting a large vector size (which is the issue that the "channelId" branch is fixing). So it sounds like you have identified a new bug here.
@oliviermattelaer yes, ickkw=1 is turned on for the CMS default (though merging is not being considered at this step).
So I have made some pure Fortran comparisons for the following script:
```
generate p p > l+ l- 3j
output
launch
set mmll 50
set ickkw 1
set xqcut 20
```
For different branches of MG5aMC (no plugin impact here, pure MG5 Fortran):
So conclusions here:
```
cluster.f: Error. Invalid combination.
error for clustering
At line 669 of file cluster.f
Fortran runtime error: Index '-1562495544' of dimension 1 of array 'imap' below lower bound of 1
```
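The huge negative index in that trace points to an uninitialised or corrupted clustering index rather than a genuinely out-of-range event. As a minimal sketch (illustrative names only, this is not the actual cluster.f code), a bounds guard of this kind would turn the crash into a deterministic, debuggable error:

```fortran
      program imap_guard_demo
c     Minimal sketch, not the actual cluster.f code: trap a garbage
c     index (mimicking the value in the log above) before it is used
c     to address the array, instead of hitting a runtime bound error.
      implicit none
      integer nmax
      parameter (nmax=10)
      integer imap(nmax), icl, i
      do i = 1, nmax
         imap(i) = i
      enddo
      icl = -1562495544
      if (icl .lt. 1 .or. icl .gt. nmax) then
         write (*,*) 'invalid cluster index: ', icl
         stop 1
      endif
      write (*,*) imap(icl)
      end
```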
At this stage, it is not clear if:
> So it sounds like you have identified a new bug here.

Ouch, that does not sound good :-(
Again, I might be doing something silly, but I repeated the test and I seem to see this again. Maybe @choij1589 you can also try this in your setup, please? Run once with vector_size=16384 and once with vector_size=32 in the run_card.
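For concreteness, following the script format above, the two runs should differ only in one `set` line at launch time (vector_size being the run_card parameter referred to in this thread):

```
launch
set vector_size 16384  # first run
launch
set vector_size 32     # second run
```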
@oliviermattelaer note a few points:
@valassi Sorry I missed this issue; I will come back after testing with different vector_size configurations.
Hi @valassi, I have checked DY+3j and the cross sections are different: vector_size=32 gives 1357 ± 1.473 pb, while vector_size=16384 gives 1369 ± 2.333 pb.
But the difference is within 5 sigma, ~ (1369-1357)/(1.473+2.333), so I am not sure these are actually different cross sections as in DY+4j (as @oliviermattelaer quoted?).
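Spelling out that back-of-the-envelope estimate (note it uses the linear sum of the two errors as denominator, which is conservative compared to adding them in quadrature):

$$\frac{1369 - 1357}{1.473 + 2.333} = \frac{12}{3.806} \approx 3.2$$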
Those two indeed sound compatible with each other.
I am investigating why CMS does not see a SIMD speedup in DY+3jets, i.e. #943.
Specifically, I am investigating why the 'Fortran overhead' is still so large and why it varies with SIMD flags in C++, i.e. #958.
One of the points here, as discussed in #546, is trying to understand whether vector_size has an impact on speed, and particularly on the speed of the 'Fortran overhead'.
On itgold91 (Intel Gold, nproc=32, no GPU) I had initially done some tests with vector_size=16384. Now I am doing the same tests with vector_size=32. I recreated the gridpacks (which was faster because the C++ builds were in ccache).
However, the first, very surprising, effect is that the cross section has changed by one order of magnitude!
@oliviermattelaer is this something you would expect because of problems covering the phase space with large vector sizes? Or does this sound like a bug?
Or is it that this process diverges and one has to apply some physics cuts? @choij1589 do you have some physics cuts in your DY+3jets?
Thanks, Andrea