madgraph5 / madgraph4gpu

GPU development for the Madgraph5_aMC@NLO event generator software package

master_june24 add runTest (ME comparison) tests for two warps with different channel #896

Closed valassi closed 3 months ago

valassi commented 3 months ago

This is related to PR #830 and issue #765 and the ongoing work in #882.

In particular it is related to https://github.com/madgraph5/madgraph4gpu/pull/882#issue-2390983670. I assume that at some point the input.txt will allow specifying a range of iconfigs, rather than a single one, but I also assume we are not there yet in the Fortran code in #765.

Nevertheless, on the cudacpp side, I think (unless I did not see them) that we are missing simple tests of the cudacpp machinery where different channelid arrays are sent to the ME kernels. One simple idea is to extend runTest, which presently uses only no-multichannel tests, to also include two different channelids like 1 and 2 (the values are process dependent probably, but can be code generated).

Some sanity checks should also be included in the code, e.g. about Fortran warp sizes vs SIMD sizes (or actual CUDA warp sizes), and about the fact that the channelids in a warp are all the same. Self-assigning this.
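As a minimal sketch of the two sanity checks mentioned above (this is not the actual cudacpp code: the helper names `warpSizeIsCompatible` and `channelIdsUniformPerWarp` are hypothetical, and a plain `std::vector` stands in for whatever buffer really holds the channelids), one could imagine something like:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not taken from the madgraph4gpu code base):
// returns true if the warp size is a whole multiple of the SIMD vector size,
// so that a SIMD vector never straddles two warps.
inline bool warpSizeIsCompatible( std::size_t warpSize, std::size_t simdSize )
{
  return simdSize > 0 && warpSize > 0 && warpSize % simdSize == 0;
}

// Hypothetical helper: returns true if every warp of 'warpSize' consecutive
// events (including a possibly incomplete last warp) carries one channel id.
inline bool channelIdsUniformPerWarp( const std::vector<unsigned int>& channelIds,
                                      std::size_t warpSize )
{
  assert( warpSize > 0 );
  for( std::size_t ievt = 0; ievt < channelIds.size(); ++ievt )
  {
    const std::size_t first = ( ievt / warpSize ) * warpSize; // first event of this warp
    if( channelIds[ievt] != channelIds[first] ) return false;
  }
  return true;
}
```

A test could call both helpers once before launching the ME kernels and fail early if either returns false.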

oliviermattelaer commented 3 months ago

I assume that at some point the input.txt will allow specifying a range of iconfigs, rather than a single one, but I also assume we are not there yet in the Fortran code in #765.

Just to confirm what we discussed last week, and to add more details: yes, the Fortran code is ready to handle more than one iconfig, even if the input.txt is only designed to receive a single one. The point is that we can only handle configurations that are connected by symmetry, since we only have a single grid.

The Fortran code actually reads the file ../symfact.dat, where the information about symmetric channels is provided. The Fortran code then knows the list of symmetric channels that can be considered (and the associated symmetry to apply for the phase space). If ../symfact.dat does not exist, then no symmetry is taken into account.

One side effect of this: since symfact.dat is in the PXXXXXX directory, running madevent from the PXXXXX directory or from the PXXXXX/G1 directory with the same input.txt gives different event files, cross-sections, ...

Let me stress that this handling of symfact.dat has been present in the official version of MG5aMC for more than 10 years, and is therefore also present in the current master branch. This is why this branch is so important (at least when going to GPU): for the moment the events can be biased in the current master, since only one config was used in practice.

oliviermattelaer commented 3 months ago

And to state the obvious: yes, dedicated tests are important here. I was thinking of adding some with "my" new CI, but this is still waiting to be accepted... (anyway, more tests are good whatever the framework used)

valassi commented 3 months ago

Thanks Olivier :-)

These comments are very useful. I added a link to #927, where at some point we can tune the madevent tests to try and test this. Unfortunately, for the moment, what I see from my tmad tests as-is is the following (ef16b66dd08c206b95248ce6a90acf1575648380):

    eemumu MEK processed 16384 events across 2 channels { no-multichannel : 8192, 1 : 8192 }
    eemumu MEK processed 98304 events across 2 channels { no-multichannel : 8192, 1 : 90112 }
    ggttggg MEK processed 16384 events across 1240 channels { no-multichannel : 8192, 1 : 8192 }
    ggttggg MEK processed 98304 events across 1240 channels { no-multichannel : 8192, 1 : 90112 }
    ggttgg MEK processed 16384 events across 123 channels { no-multichannel : 8192, 112 : 8192 }
    ggttgg MEK processed 98304 events across 123 channels { no-multichannel : 8192, 112 : 90112 }
    ggttg MEK processed 16384 events across 16 channels { no-multichannel : 8192, 1 : 8192 }
    ggttg MEK processed 98304 events across 16 channels { no-multichannel : 8192, 1 : 90112 }
    ggtt MEK processed 16384 events across 3 channels { no-multichannel : 8192, 1 : 8192 }
    ggtt MEK processed 98304 events across 3 channels { no-multichannel : 8192, 1 : 90112 }
    gqttq MEK processed 16384 events across 5 channels { no-multichannel : 8192, 1 : 8192 }
    gqttq MEK processed 98304 events across 5 channels { no-multichannel : 8192, 1 : 90112 }
    heftggbb MEK processed 16384 events across 4 channels { no-multichannel : 8192, 1 : 8192 }
    heftggbb MEK processed 98304 events across 4 channels { no-multichannel : 8192, 1 : 90112 }
    smeftggtttt MEK processed 16384 events across 72 channels { no-multichannel : 8192, 1 : 8192 }
    smeftggtttt MEK processed 98304 events across 72 channels { no-multichannel : 8192, 1 : 90112 }
    susyggtt MEK processed 16384 events across 3 channels { no-multichannel : 8192, 1 : 8192 }
    susyggtt MEK processed 98304 events across 3 channels { no-multichannel : 8192, 1 : 90112 }

This means that I am not testing different channelIds. Hence the whole channelId infrastructure would not be tested if I relied only on that. Maybe by running other processes in user mode you do at some point run multiple channels in madevent, but I think we should be sure of that (again, to be discussed in #927, not here).

My point HERE was that, in addition to madevent or instead of madevent, I think it is necessary (and I would have hoped this to be done in #830) that specific tests are designed, for instance in runTest. This is what I have done, essentially here:

 git log --oneline | grep \#896
[not the full list]
7cbdf41aa [june24] in CODEGEN and gg_tt.mad runTest.cc, modify the multichannel test #896 to use channels 1,2,3,1,2,3... for different WARPS of 32 events
0f82718e9 [june24] in CODEGEN and gg_tt.mad, create new txt2 ref #896 and recreate txt ref for runTest (use cuda/double as the reference platform)
5d2c26cef [june24] in CODEGEN (backport gg_tt.mad) runTest.cc, modify the multichannel test #896 to use channels 1,2,3,1,2,3... for different events (previously it was 1 for all events)
d501f457a [june24] in CODEGEN (backport gg_tt.mad) runTest.cc, add two tests with/without multichannel #896; use <file.txt> as ref without multichannel and <file.txt2> as ref with multichannel
b66195085 [gtest2/june24] in CODEGEN (backport gg_tt.mad) MadgraphTest.h, runTest.cc, testxxx.cc: simplify gtest templates, remove cudaDeviceReset to fix #907, complete preparation of two-test infrastructure #896

So essentially here 7cbdf41aa 0f82718e9 5d2c26cef d501f457a b66195085
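To illustrate the idea behind 7cbdf41aa (channels 1,2,3,1,2,3... for different warps of 32 events), here is a minimal sketch. It is not the code in runTest.cc: the function name `fillChannelIdsPerWarp` and its parameters are hypothetical, and it assumes that the valid channel ids are simply 1..nChannels, which is process dependent in reality.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (names are illustrative, not from runTest.cc):
// assign channel ids 1,2,...,nChannels,1,2,... to successive warps,
// so that all events within one warp share the same channel id.
inline std::vector<unsigned int> fillChannelIdsPerWarp( std::size_t nEvents,
                                                        std::size_t warpSize,
                                                        unsigned int nChannels )
{
  assert( warpSize > 0 && nChannels > 0 );
  std::vector<unsigned int> channelIds( nEvents );
  for( std::size_t ievt = 0; ievt < nEvents; ++ievt )
  {
    const std::size_t iwarp = ievt / warpSize; // index of the warp holding this event
    channelIds[ievt] = static_cast<unsigned int>( iwarp % nChannels ) + 1; // ids start at 1
  }
  return channelIds;
}
```

With 512 events, warps of 32 and 3 channels there are 16 warps, so channel 1 gets 6 warps (192 events) and channels 2 and 3 get 5 warps each (160 events), which is the split expected for a 3-channel process like ggtt. Processes where some channel ids are not valid would need a generated list of ids instead of the plain 1..nChannels assumed here.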

Ah, somewhere in those commits, or maybe elsewhere, is the instrumentation of MEK to produce some debug printouts. This is a development feature only, with a hidden flag. Ah, it is here: 969350b10

969350b10 [june24] in CODEGEN (from gg_tt.mad) MEK/cudacpp.mk/mgOnGpuConfig.h, add channelid debug printouts if the code is compiled with 'make MG5AMC_CHANNELID_DEBUG=1'

This prints the following in the runTest 'tput' suite b4e9cab79bc4bae09c33dba7cb48f697a9a45166

    eemumu MEK (channelid array) processed 512 events across 2 channels { 1 : 256, 2 : 256 }
    eemumu MEK (no multichannel) processed 512 events across 2 channels { no-multichannel : 512 }
    ggttggg MEK (channelid array) processed 512 events across 1240 channels { 1 : 32, 2 : 32, 4 : 32, 5 : 32, 7 : 32, 8 : 32, 14 : 32, 15 : 32, 16 : 32, 18 : 32, 19 : 32, 20 : 32, 22 : 32, 23 : 32, 24 : 32, 26 : 32 }
    ggttggg MEK (no multichannel) processed 512 events across 1240 channels { no-multichannel : 512 }
    ggttgg MEK (channelid array) processed 512 events across 123 channels { 2 : 32, 3 : 32, 4 : 32, 5 : 32, 6 : 32, 7 : 32, 8 : 32, 9 : 32, 10 : 32, 11 : 32, 12 : 32, 13 : 32, 14 : 32, 15 : 32, 16 : 32, 17 : 32 }
    ggttgg MEK (no multichannel) processed 512 events across 123 channels { no-multichannel : 512 }
    ggttg MEK (channelid array) processed 512 events across 16 channels { 1 : 64, 2 : 32, 3 : 32, 4 : 32, 5 : 32, 6 : 32, 7 : 32, 8 : 32, 9 : 32, 10 : 32, 11 : 32, 12 : 32, 13 : 32, 14 : 32, 15 : 32 }
    ggttg MEK (no multichannel) processed 512 events across 16 channels { no-multichannel : 512 }
    ggtt MEK (channelid array) processed 512 events across 3 channels { 1 : 192, 2 : 160, 3 : 160 }
    ggtt MEK (no multichannel) processed 512 events across 3 channels { no-multichannel : 512 }
    gqttq MEK (channelid array) processed 512 events across 5 channels { 1 : 128, 2 : 96, 3 : 96, 4 : 96, 5 : 96 }
    gqttq MEK (no multichannel) processed 512 events across 5 channels { no-multichannel : 512 }
    heftggbb MEK (channelid array) processed 512 events across 4 channels { 1 : 128, 2 : 128, 3 : 128, 4 : 128 }
    heftggbb MEK (no multichannel) processed 512 events across 4 channels { no-multichannel : 512 }
    smeftggtttt MEK (channelid array) processed 512 events across 72 channels { 1 : 32, 2 : 32, 3 : 32, 4 : 32, 5 : 32, 6 : 32, 7 : 32, 8 : 32, 9 : 32, 10 : 32, 11 : 32, 12 : 32, 13 : 32, 14 : 32, 15 : 32, 16 : 32 }
    smeftggtttt MEK (no multichannel) processed 512 events across 72 channels { no-multichannel : 512 }
    susyggt1t1 MEK (channelid array) processed 512 events across 6 channels { 2 : 128, 3 : 96, 4 : 96, 5 : 96, 6 : 96 }
    susyggt1t1 MEK (no multichannel) processed 512 events across 6 channels { no-multichannel : 512 }
    susyggtt MEK (channelid array) processed 512 events across 3 channels { 1 : 192, 2 : 160, 3 : 160 }
    susyggtt MEK (no multichannel) processed 512 events across 3 channels { no-multichannel : 512 }

This shows that #896 is complete.
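For reference, the kind of per-channel tally shown in the `{ 1 : 192, 2 : 160, 3 : 160 }` printouts can be sketched as follows. This is not the MEK instrumentation from 969350b10: `countEventsPerChannel` and `formatChannelCounts` are hypothetical names, and only the output format is modelled on the printouts above.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: count how many events were assigned to each channel id.
inline std::map<unsigned int, std::size_t>
countEventsPerChannel( const std::vector<unsigned int>& channelIds )
{
  std::map<unsigned int, std::size_t> counts;
  for( unsigned int id : channelIds ) ++counts[id];
  return counts;
}

// Hypothetical helper: format the tally as "{ id : n, id : n, ... }",
// with channel ids in increasing order (std::map keeps its keys sorted).
inline std::string formatChannelCounts( const std::map<unsigned int, std::size_t>& counts )
{
  std::ostringstream out;
  out << "{ ";
  bool first = true;
  for( const auto& kv : counts )
  {
    if( !first ) out << ", ";
    out << kv.first << " : " << kv.second;
    first = false;
  }
  out << " }";
  return out.str();
}
```

A check that the tally contains at least two distinct channel ids would make the point of this issue explicit in a test: it would fail for the tmad runs above, where a single channel is used.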

(And IN ADDITION, by the way, I have added selected color/helicity comparison in runTest in #925, heavily debugged for mixed mode in #924.)

Fixed in #882. Linking it there, closing as complete.

valassi commented 3 months ago

Reopened for linking, closing again.