Further fixes/improvements for iconfig-channel mappings in coloramps.h

valassi commented 1 week ago

Hi @oliviermattelaer this is a modified version of #873, to be compared to your fix_826. It is built on top of your fix_826 PR #852, so it has by construction no conflicts with that. (So it can be compared to fix_826 as you suggested in https://github.com/madgraph5/madgraph4gpu/pull/873#issuecomment-2198147232)

As discussed offline I would suggest to agree upfront on this one and on https://github.com/mg5amcnlo/mg5amcnlo/pull/116, and then:

merge https://github.com/mg5amcnlo/mg5amcnlo/pull/116 (this is needed here in the MG5amC submodule!)
merge your fix_826
merge this PR

Thanks! Andrea

valassi commented 1 week ago

Specifically on what is different from your fix_826, see https://github.com/madgraph5/madgraph4gpu/pull/873#issuecomment-2198193970

I think it's best to see at generated code rather than code generator

-  // Map channelId to iconfigC
-  // This array has N_diagrams+1 elements, but only N_config <= N_diagrams valid values
-  // unvalid values are set to -1
-  // The 0 entry is a fall back to still write events even if no multi-channel is setup (wrong color selected in that mode) 
-    __device__ constexpr int channelId_to_iconfigC[7] = {
-     0, // channelId=0: This value means not multi-channel, color will be wrong anyway -> pick the first
-     -1, // channelId=1 (diagram=1): Not consider as a channel of integration (presence of 4 point interaction?)
-     0, // channelId=2 (diagram=2) --> iconfig=1 (f77 conv) and iconfigC=0 (c conv)
-     1, // channelId=3 (diagram=3) --> iconfig=2 (f77 conv) and iconfigC=1 (c conv)
-     2, // channelId=4 (diagram=4) --> iconfig=3 (f77 conv) and iconfigC=2 (c conv)
-     3, // channelId=5 (diagram=5) --> iconfig=4 (f77 conv) and iconfigC=3 (c conv)
-     4, // channelId=6 (diagram=6) --> iconfig=5 (f77 conv) and iconfigC=4 (c conv)
...
+  // Map channelIdC (in C indexing, i.e. channelId-1) to iconfig (in F indexing)
+  // Note: iconfig=0 indicates invalid values, i.e. channels/diagrams with no single-diagram enhancement in the MadEvent sampling algorithm (presence of 4-point interaction?)
+  // This array has N_diagrams elements, but only N_config <= N_diagrams valid (non-zero) values
+  __device__ constexpr int channelIdC_to_iconfig[6] = { // note: a trailing comma in the initializer list is allowed
+    0, // CHANNEL_ID=1 i.e. DIAGRAM=1 --> ICONFIG=None
+    1, // CHANNEL_ID=2 i.e. DIAGRAM=2 --> ICONFIG=1
+    2, // CHANNEL_ID=3 i.e. DIAGRAM=3 --> ICONFIG=2
+    3, // CHANNEL_ID=4 i.e. DIAGRAM=4 --> ICONFIG=3
+    4, // CHANNEL_ID=5 i.e. DIAGRAM=5 --> ICONFIG=4
+    5, // CHANNEL_ID=6 i.e. DIAGRAM=6 --> ICONFIG=5

There are two (related) main differences

As shown above, the mapping array has Ndiag in my case and Ndiag+1 in yours. I find it clearer, because there are are actually only Ndiag mappings: adding one additional one here would only be because C starts with indexes at 0, and I think it is better to handle that in the code (one more line of code) rather than in the array (one more line in the array).
The output of the mapping is the Fortran iconfig in my case and is the C iconfigC in yours. I find it clearer to have an array which only manipulates the fortran numbers, and then again have one extra line of code to handle the iconfigF to iconfigC mapping.

That said, I will not make a fuss on this. If you prefer your way, lets do that. (I would just change some comments accordingly). Let me know please.

Thanks Andrea

oliviermattelaer commented 1 week ago

Thanks for this PR. This is really great to move forward.

I have try to not remember here what was "my" proposal from yours. But this issue is for sure subjective and correspond to "internal" way of thinking and therefore this exercise was too bias. I think that they are not objectively good/bad choice and that we have to agree to disagree on which one is the best.

By looking only at the above diff, I do have a strong preference to have that channelId has "key" (fortran style). Two reasons:

First, it has a line for the "no multi-channel" which clarify the handle of the case if someone runs that mode --not default but nice for debugging the cross-section, I do worry that it has undefinied behaviour
Second, I do not think that we use "CHANNEL_IDC" anywhere in the code (at least so far, except for this call) and avoiding that variable will avoid confusion about the meaning. (when checking that I found a file "mypatch" which might destroy this argument, but I think that we can remove that file)

Independently, if we pick your convention, for that part one should change the comment from "CHANNEL_ID" to CHANNELIDC. Second point, using "C" here might not be the best idea since one can think this is for color ... maybe use _Cord or _Cpp will be better.

For the output of the dictionary, I do not have any strong feeling for returning iconfig_cpp or iconfig_f77. But in any case, I would avoid value "0" for invalid mapping (preferring "-1" or "-99") to make clear this is invalid and not an index.

So would you agree on this:

-  // Map channelId to iconfig
-  // This array has N_diagrams+1 elements, but only N_config <= N_diagrams valid values
-  // unvalid values are set to -1
-  // The 0 entry is a fall back to still write events even if no multi-channel is setup (wrong color selected in that mode) 
-    __device__ constexpr int channelId_to_iconfigC[7] = {
-     1, // channelId=0 (i.e. no multi-channel) color will be wrong anyway -> pick the first config
-     -1, // channelId=1 (i.e. diagram=1): Not consider as a channel of integration (presence of 4 point interaction?)
-     1, // channelId=2 (i.e. diagram=2) --> iconfig=1 (f77 conv) 
-     2, // channelId=3 (i.e. diagram=3) --> iconfig=2 (f77 conv) 
-     3, // channelId=4 (i.e. diagram=4) --> iconfig=3 (f77 conv) 
-     4, // channelId=5 (i.e. diagram=5) --> iconfig=4 (f77 conv) 
-     5, // channelId=6 (i.e. diagram=6) --> iconfig=5 (f77 conv)

I obviously volunteer to implement the change in the code to arrive to this situation if you agree (but you can also do it if you prefer obviously). Thanks a lot for this PR that clarify this issue.

valassi commented 1 week ago

Hi Olivier, thanks a lot :-)

For the output of the dictionary, I do not have any strong feeling for returning iconfig_cpp or iconfig_f77. But in any case, I would avoid value "0" for invalid mapping (preferring "-1" or "-99") to make clear this is invalid and not an index.

This I have no problem with, or acually I even prefer it, it sounds like a good idea to me! So this part is fine

     -1, // channelId=1 (i.e. diagram=1): Not consider as a channel of integration (presence of 4 point interaction?)
      1, // channelId=2 (i.e. diagram=2) --> iconfig=1 (f77 conv) 
      2, // channelId=3 (i.e. diagram=3) --> iconfig=2 (f77 conv) 
      3, // channelId=4 (i.e. diagram=4) --> iconfig=3 (f77 conv) 
      4, // channelId=5 (i.e. diagram=5) --> iconfig=4 (f77 conv) 
      5, // channelId=6 (i.e. diagram=6) --> iconfig=5 (f77 conv)

First, it has a line for the "no multi-channel" which clarify the handle of the case if someone runs that mode --not default but nice for debugging the cross-section, I do worry that it has undefinied behaviour

This instead I really do not understand?

First, if ever it was possible that someone runs these lines of code with channel_id=0, then actually the code would compute channel_idC=-1 and the lookup would crash. But I see your point that we should ensure that such a crash does not happen...

...Second, however, I am pretty sure that this will not happen. Indeed I checked the ode and I think that such a crash cannot happen. The relevant code is this (now in color2 as-is):

#ifdef MGONGPU_SUPPORTS_MULTICHANNEL
    // Event-by-event random choice of color #402
    if( channelId != 0 ) // no event-by-event choice of color if channelId == 0 (fix FPE #783)
    {
      const unsigned int channelIdC = channelId - 1;                           // channelIdC_to_iconfig in coloramps.h uses the C array indexing starting at 0
      const unsigned int iconfig = mgOnGpu::channelIdC_to_iconfig[channelIdC]; // map N_diagrams to N_config <= N_diagrams configs (fix LHE color mismatch #856: see also #826, #852, #853)
      const unsigned int iconfigC = iconfig - 1;                               // icolamp in coloramps.h uses the C array indexing starting at 0
      fptype targetamp[ncolor] = { 0 };
      for( int icolC = 0; icolC < ncolor; icolC++ )
      {
        if( icolC == 0 )
          targetamp[icolC] = 0;
        else
          targetamp[icolC] = targetamp[icolC - 1];
        if( mgOnGpu::icolamp[iconfigC][icolC] ) targetamp[icolC] += jamp2_sv[icolC];
      }

The relevant line here is

    if( channelId != 0 ) // no event-by-event choice of color if channelId == 0 (fix FPE #783)

My point is that there is already a protection bypassing random choice of color if channelid=0! I had even forgotten why this is there, but as mentioned in the code I added this to fix a crash from SIGFPE in #783.

So I do not agree with your first motivation about adding this first line

     1, // channelId=0 (i.e. no multi-channel) color will be wrong anyway -> pick the first config

Second, I do not think that we use "CHANNEL_IDC" anywhere in the code (at least so far, except for this call) and avoiding that variable will avoid confusion about the meaning. (when checking that I found a file "mypatch" which might destroy this argument, but I think that we can remove that file)

Well my point is that using C arrays forces us to use C indexing, starting at 0!

If you really prefer to avoid introducing a "channelIdC" variable channelIdC = channelId - 1 then I am ok to replace the two lines

      const unsigned int channelIdC = channelId - 1;                           // channelIdC_to_iconfig in coloramps.h uses the C array indexing starting at 0
      const unsigned int iconfig = mgOnGpu::channelIdC_to_iconfig[channelIdC]; // map N_diagrams to N_config <= N_diagrams configs (fix LHE color mismatch #856: see also #826, #852, #853)

by a single line

      const unsigned int iconfig = mgOnGpu::channelIdC_to_iconfig[channelId - 1]; // map N_diagrams (with C indexing channelId - 1) to N_config <= N_diagrams configs (fix LHE color mismatch #856: see also #826, #852, #853)

even if honestly I find it clearer in two lines.

Conversely I find it really confusing/misleading/wrong to add one element in an array for a channel that does not exist. (Why are we not doing that for icolamp too then? because it would be really ugly...).

Another option I can propose (which sounds like an overkill but may be clearer) is to replace the mapping array channelIdc_to_iconfig by a constexpr mapping function channelId_to_iconfig which uses fortran indexing as inputs and outputs. Does that sound better?

"no multi-channel" which clarify the handle of the case if someone runs that mode

One last point about 'no multi channel', also for myself to remember: note that there are two cases,

code with no multichannel mode, using #undef MGONGPU_SUPPORTS_MULTICHANNEL: in this case (e.g. all .sa directories test this) the coloramps.h is not even needed... I will add a protection for this in the code
code with multichannel mode, using #define MGONGPU_SUPPORTS_MULTICHANNEL 1: in this case channelId=0 can still exist, in particular in sigmaKin_getGoodHel which only computes good helicities but does not handle individual event colors, and it is precisely in this case that the protection above if( channelId != 0 ) // no event-by-event choice of color if channelId == 0 (fix FPE #783) is needed

I obviously volunteer to implement the change in the code to arrive to this situation if you agree (but you can also do it if you prefer obviously).

Thanks for the offer, but I prefer to do it :-) I will make a few changes and we can then rediscuss

Thanks again

valassi commented 1 week ago

Hi @oliviermattelaer , voila I made a few changes, I hope this makes progress.

In summary

I checked that coloramps.h is already not included unless MGONGPU_SUPPORTS_MULTICHANNEL is defined, but I added a protection inside the code anyway with '#ifdef MGONGPU_SUPPORTS_MULTICHANNEL', which makes it clearer and does not harm
I moved as you suggested from iconfig=0 to iconfig=-1 to indicate 'invalid' values, or rather channels/diagrams which have no associated iconfig for single-diagram enhancement
I kept my preference of not having an extra element in the array for "channelId=0'
however, I completely removed 'channelIdC' and 'iconfigC' from the code, and I went back to a 1-line algorithm from my previous 3-line algorithm: what I changed is that "channelId-1" and "iconfig-1" are used ONLY at the time of calling a C array index, e.g. channel2iconfig[channelId-1] (ah yes I also renamed the array slightly)
(finally for consistency I explained why I use "icolC" instead of "icol", because also here there is a difference between C and Fortran indexing)

I hope that sounds like a good compromise? In any case I think that it looks better than both my and you roriginal suggestions, which is positive.

To check the differences in generated code bwteen your original suggestion and the current version, you can check the diffs between

git diff --no-ext-diff 22328317 5b52790f susy_gg_t1t1.mad

This gives for instance

     // Event-by-event random choice of color #402
     if( channelId != 0 ) // no event-by-event choice of color if channelId == 0 (fix FPE #783)
     {
-      const unsigned int iconfigC = mgOnGpu::channelId_to_iconfigC[channelId]; // coloramps.h uses a channel ordering not the diagram id
+      // NB (see #877): in the array channel2iconfig, the input index uses C indexing (channelId -1), the output index uses F indexing (iconfig)
+      const unsigned int iconfig = mgOnGpu::channel2iconfig[channelId - 1]; // map N_diagrams to N_config <= N_diagrams configs (fix LHE color mismatch #856: see also #826, #852, #853)
       fptype targetamp[ncolor] = { 0 };
+      // NB (see #877): explicitly use 'icolC' rather than 'icol' to indicate that icolC uses C indexing in [0, N_colors-1]
       for( int icolC = 0; icolC < ncolor; icolC++ )
       {
         if( icolC == 0 )
           targetamp[icolC] = 0;
         else
           targetamp[icolC] = targetamp[icolC - 1];
-        if( mgOnGpu::icolamp[iconfigC][icolC] ) targetamp[icolC] += jamp2_sv[icolC];
+        // NB (see #877): in the array icolamp, the input index uses C indexing (iconfig -1)
+        if( mgOnGpu::icolamp[iconfig - 1][icolC] ) targetamp[icolC] += jamp2_sv[icolC];
       }

and

-  // Map channelId to iconfigC
-  // This array has N_diagrams+1 elements, but only N_config <= N_diagrams valid values
-  // unvalid values are set to -1
-  // The 0 entry is a fall back to still write events even if no multi-channel is setup (wrong color selected in that mode) 
-    __device__ constexpr int channelId_to_iconfigC[7] = {
-     0, // channelId=0: This value means not multi-channel, color will be wrong anyway -> pick the first
-     -1, // channelId=1 (diagram=1): Not consider as a channel of integration (presence of 4 point interaction?)
-     0, // channelId=2 (diagram=2) --> iconfig=1 (f77 conv) and iconfigC=0 (c conv)
-     1, // channelId=3 (diagram=3) --> iconfig=2 (f77 conv) and iconfigC=1 (c conv)
-     2, // channelId=4 (diagram=4) --> iconfig=3 (f77 conv) and iconfigC=2 (c conv)
-     3, // channelId=5 (diagram=5) --> iconfig=4 (f77 conv) and iconfigC=3 (c conv)
-     4, // channelId=6 (diagram=6) --> iconfig=5 (f77 conv) and iconfigC=4 (c conv)
+  // Map channel to iconfig (e.g. "iconfig = channel2iconfig[channelId - 1]": input index uses C indexing, output index uses F indexing)
+  // Note: iconfig=-1 indicates channels/diagrams with no associated iconfig for single-diagram enhancement in the MadEvent sampling algorithm (presence of 4-point interaction?)
+  // This array has N_diagrams elements, but only N_config <= N_diagrams valid values (iconfig>0)
+  __device__ constexpr int channel2iconfig[6] = { // note: a trailing comma in the initializer list is allowed
+    -1, // CHANNEL_ID=1 i.e. DIAGRAM=1 --> ICONFIG=-1 (diagram with no associated iconfig for single-diagram enhancement)
+     1, // CHANNEL_ID=2 i.e. DIAGRAM=2 --> ICONFIG=1
+     2, // CHANNEL_ID=3 i.e. DIAGRAM=3 --> ICONFIG=2
+     3, // CHANNEL_ID=4 i.e. DIAGRAM=4 --> ICONFIG=3
+     4, // CHANNEL_ID=5 i.e. DIAGRAM=5 --> ICONFIG=4
+     5, // CHANNEL_ID=6 i.e. DIAGRAM=6 --> ICONFIG=5
   };

Do you think this is ok now? Thanks Andrea

(By the way, please do not forget to review also https://github.com/mg5amcnlo/mg5amcnlo/pull/116... this #877 contains a reference to a commit in that PR, so we should merge that first if possible)

valassi commented 1 week ago

Hi @oliviermattelaer thanks a lot! Ok I think we finally converged :-)

I accepted your suggestions, and then added a few more fixes/tweaks in the same direction

Ok for removing NNNN and using '{0}': nice trick, I did not think of that. Indeed better than NNNN. I also similarly removed another NNNN in another line for the formatting of the second array (I think you would approve).
Of for going back to 'is_LC': I do not understand the naming, but I do not care at all. I had to also apply a small fix to do the same elsewhere (otherwise code generation was failing).

I have regenerated all processes. I will not run manual tests as I already did. (I would hope the CI to do tests but it did not start, I am not sure why, I will check).

There is still one point pending: can you please review https://github.com/mg5amcnlo/mg5amcnlo/pull/116? This is used in the code in this PR #877, so I would like it to be validated.

Afterwards when all is set I would suggest:

I merge https://github.com/mg5amcnlo/mg5amcnlo/pull/116
I merge your fix_826 #852
I change the target of this #877 from fix_826 to master and I merge it there

Is this ok? Thanks Andrea

valassi commented 1 week ago

This will fix #856 when merged (together with https://github.com/mg5amcnlo/mg5amcnlo/pull/116)

I will close the older color branch #873

valassi commented 1 week ago

I changed the target branch to upstream/master. Hopefully this should trigger the CI?

In any case this must be merged into master, once fix_826 PR #852 has been also merged.

valassi commented 1 week ago

I had to force push the last-1 an dthen the last version again, but the CI tests finally executed and are ok

I will in any case add a small tweak in the CI config, to make it clearer which commit is being tested (not clear now)

oliviermattelaer commented 1 week ago

Concerning https://github.com/mg5amcnlo/mg5amcnlo/pull/116? I do not think that this PR should include the change of that PR.

The two fixes different issues and we should not merge simultaneously with the github history. So I guess that while https://github.com/mg5amcnlo/mg5amcnlo/pull/116 is fine (up to some cleaning), it is not fine to refer to it within this PR. Could you change the commit associated to this PR to a commit that does not contain it (gpucpp should be fine).

Cheers,

Olivier

valassi commented 1 week ago

Could you change the commit associated to this PR to a commit that does not contain it (gpucpp should be fine).

Thanks Olivier, but the answer is no.

The generated code in this PR was generated using https://github.com/mg5amcnlo/mg5amcnlo/pull/116, so that it can fix #856. I would really like to not have to go back.

Let me rephrase: do you see a problem with https://github.com/mg5amcnlo/mg5amcnlo/pull/116 or is it just that you consider them different issues somehow?

If you see a problem in https://github.com/mg5amcnlo/mg5amcnlo/pull/116, then we have a (very big) problem...
If you do not see a problem, can we just merge these three PRs https://github.com/mg5amcnlo/mg5amcnlo/pull/116 #852 #877 and move on? Indipendently of what PR fixes what issue (we still have different ideas on that), if we both agree that all three are needed, can we just put them in and move on?

Thanks Andrea

valassi commented 1 week ago

The generated code in this PR was generated using mg5amcnlo/mg5amcnlo#116, so that it can fix #856. I would really like to not have to go back.

Specifically

git diff --no-ext-diff upstream/master color2 susy_gg_t1t1.mad/SubProcesses/P1_gg_t1t1x/coloramps.h
...
-  __device__ constexpr bool icolamp[5][2] = {
-    { true, true },
-    { true, true },
-    { true, false },
-    { true, false },
-    { false, true }
...
+  __device__ constexpr bool icolamp[5][2] = { // note: a trailing comma in the initializer list is allowed
+    {  true,  true }, // ICONFIG=1  <-- CHANNEL_ID=2
+    {  true, false }, // ICONFIG=2  <-- CHANNEL_ID=3
+    {  true, false }, // ICONFIG=3  <-- CHANNEL_ID=4
+    { false,  true }, // ICONFIG=4  <-- CHANNEL_ID=5
+    { false,  true }, // ICONFIG=5  <-- CHANNEL_ID=6
   };

The upstream/master is wrong, instead the version here in color2 is right because of mg5amcnlo/mg5amcnlo#116

I really do not see the point of generating the code with the wrong version "true,true;true,true;..." when we have the right version "true,true;true,false;..."

valassi commented 1 week ago

Olivier, I propose the following:

tomorrow I will move the fix from https://github.com/mg5amcnlo/mg5amcnlo/pull/116 into cudacpp: this is just overloading a method from export_cpp, I do this all the time (but in this case I thought it better to modify upstream mg5amcnlo directly)
if I do that, then I wil lmove back MG5AMC to another version without the 116 fix
but I will still get the same generated code fixing #856

I hope this is another acceptable compromise? I will do this tomorrow

Thanks Andrea

PS And then, if as you suspect the fix in 116 is not the whole story, we will make another bug fix in cudacpp, until this is proven to be ok to be moved upstream in mg5amcnlo (I have no hurry in any case as long as the fix is in cudacpp and in the code it generates)

valassi commented 1 week ago

Moving back to draft until I move the 116 patch to cudacpp and downgrade MG5AMC

valassi commented 1 week ago

Could you change the commit associated to this PR to a commit that does not contain it (gpucpp should be fine).

I have now done this, so that we can finally merge this and PR #852.

For the moment I moved back to https://github.com/mg5amcnlo/mg5amcnlo/commit/09c96dd171d6f61736cb51cc767087ab73b9d579. I will later move to the latest gpucpp commit https://github.com/mg5amcnlo/mg5amcnlo/commit/1e2aa4bc3b19b53de2249b82b06f834299c4a23e as suggested by Olivier.

I have removed the association to #856 because THIS PR WILL NOT FIX #856. It contains a fix (moved to cudacpp), but then also its reverting.

The CI test shows 9 errors in https://github.com/madgraph5/madgraph4gpu/actions/runs/9776063937

826 no cross section in susy
872 cross section mismatch in pp_tt012j
856 LHE mismatch in ggttgg (new/again, because the fix has been removed)

(I also suspect more errors with LHE files similar to 856 would show up if running on different iconfig).

As suggested by Olivier I will include my suggested fix for 856 in my own master_valassi fork, which I will then need to create.

I will add back the bypass for 856, so that the CI checks for new issues only.

valassi commented 1 week ago

I will later move to the latest gpucpp commit mg5amcnlo/mg5amcnlo@1e2aa4b as suggested by Olivier.

I have now moved to the latest gpucpp commit as suggested by Olivier. (Again, this does NOT include the fix in 116 instead: it does include the full fix for #856).

Let me approve this already and then check the associated PR for the other repo.

I have merged PR #852 "fix_826". Since Olivier approved this (an earlier version - I believe this has the further improvements he required) I will now merge this one as well, with further fixes/improvements.

So I guess that while mg5amcnlo/mg5amcnlo#116 is fine (up to some cleaning), it is not fine to refer to it within this PR. Could you change the commit associated to this PR to a commit that does not contain it (gpucpp should be fine).

Specifically, this is now done.

Thanks a lot again Olivier for all the patient discussion on this Merging. Andrea

valassi commented 1 week ago

(PS The CI for the version I finally merged is green because it contains bypasses for three known issues #826 #872 and #856)

madgraph5 / madgraph4gpu

Further fixes/improvements for iconfig-channel mappings in coloramps.h #877

826 no cross section in susy

872 cross section mismatch in pp_tt012j

856 LHE mismatch in ggttgg (new/again, because the fix has been removed)