madgraph5 / madgraph4gpu

GPU development for the Madgraph5_aMC@NLO event generator software package

Include Olivier's latest upstream changes, regenerate and run tests #793

Closed valassi closed 11 months ago

valassi commented 11 months ago

This is mainly about adding Olivier's fix for #781

I am running some tests and will merge this tomorrow if ok

(@oliviermattelaer please note: I also had to add a run_card_class=None... please review, just in case)

valassi commented 11 months ago

Olivier is this ok? https://github.com/madgraph5/madgraph4gpu/pull/793/commits/f8d7b1035c9a01b9d145a84b4aa6801857768957

oliviermattelaer commented 11 months ago

I would say that this is not really needed since this is part of the PR: https://github.com/madgraph5/madgraph4gpu/pull/788

But if you want to wait for the approval of that PR, then yes we can put this as a temporary solution

valassi commented 11 months ago

Hi @oliviermattelaer thanks. Sorry, I had not looked at #788 yet; I have now. But there are still a few things I need to check and change in there.

Is it ok to merge this one in the meantime (today) and then work on #788 next week when you are at the workshop?

Thanks Andrea

oliviermattelaer commented 11 months ago

Then maybe it makes more sense not to include run_card_class at all here, and to change the gpucpp to not complain if this does not exist (let me do that)

oliviermattelaer commented 11 months ago

The best would be to apply this patch:

--- a/madgraph/iolibs/export_v4.py
+++ b/madgraph/iolibs/export_v4.py
@@ -480,7 +480,7 @@ class ProcessExporterFortran(VirtualExporter):
         if second_exporter:
             self.has_second_exporter = second_exporter

-        if self.has_second_exporter:
+        if self.has_second_exporter and hasattr(self.has_second_exporter, 'run_card_class'):
             with misc.TMP_variable(self, 'run_card_class', self.has_second_exporter.run_card_class):
                 self.create_run_card(matrix_elements, history)

But I'm confused about what you did to the MG5aMC branch. I would say that the operation would be

1. git switch gpucpp

2. git merge 3.x

3. apply the above patch

4. git push to gpucpp (or to another branch and do a PR)

5. update the submodule

(and apply the patch after 4 and/or after 5). I was planning to do the first four today, but I thought you had done that yesterday; it looks like that is not the case. Could you tell me what you did?
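A rough sketch of these steps (the submodule path below is only a placeholder, and the patch file name is illustrative):

# in the mg5amcnlo repository
git switch gpucpp                        # 1) move to the gpucpp branch
git merge 3.x                            # 2) merge the 3.x branch into it
git apply run_card_class.patch           # 3) apply the patch above
git push origin gpucpp                   # 4) push to gpucpp (or to another branch and open a PR)

# in the madgraph4gpu repository: 5) point the submodule at the new commit
cd <path/to/mg5amcnlo-submodule> && git fetch && git checkout origin/gpucpp
cd - && git add <path/to/mg5amcnlo-submodule>
git commit -m "update mg5amcnlo submodule to latest gpucpp"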

Cheers,

Olivier

valassi commented 11 months ago

But I'm confused about what you did to the MG5aMC branch.

Hi @oliviermattelaer what I did was not a merge but a cherry-pick of a single commit

git cherry-pick ac7de8d

This means I picked up your https://github.com/mg5amcnlo/mg5amcnlo/commit/ac7de8d96c15aa8638068400d53bab33b6305254

And now this appears in the other branch as https://github.com/mg5amcnlo/mg5amcnlo/commit/d7a466dd54bb2f57564f5cc674f129ebf095c969

If you do a merge, in principle it will not be picked up again...

Sorry for the confusion, I hope this clarifies.

Apart from that, I have then updated madgraph4gpu to use that commit as the submodule.
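Schematically, what this amounts to (the submodule path below is only a placeholder):

# in the mg5amcnlo checkout used by madgraph4gpu
git cherry-pick ac7de8d          # pick up only the ggttggg fix; it gets a new hash (d7a466d) on this branch
git push

# in madgraph4gpu, point the submodule at the cherry-picked commit
cd <path/to/mg5amcnlo-submodule> && git fetch && git checkout d7a466d
cd - && git add <path/to/mg5amcnlo-submodule>
git commit -m "update mg5amcnlo submodule to the cherry-picked ggttggg fix"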

valassi commented 11 months ago

The best would be to apply this patch:

--- a/madgraph/iolibs/export_v4.py
+++ b/madgraph/iolibs/export_v4.py
@@ -480,7 +480,7 @@ class ProcessExporterFortran(VirtualExporter):
         if second_exporter:
             self.has_second_exporter = second_exporter

-        if self.has_second_exporter:
+        if self.has_second_exporter and hasattr(self.has_second_exporter, 'run_card_class'):
             with misc.TMP_variable(self, 'run_card_class', self.has_second_exporter.run_card_class):
                 self.create_run_card(matrix_elements, history)

But I'm confused about what you did to the MG5aMC branch. I would say that the operation would be

1. git switch gpucpp

2. git merge 3.x

3. apply the above patch

4. git push to gpucpp (or to another branch and do a PR)

5. update the submodule

(and apply the patch after 4 and/or after 5). I was planning to do the first four today, but I thought you had done that yesterday; it looks like that is not the case. Could you tell me what you did?

Cheers,

Olivier

By the way, I cherry-picked the one relevant to ggttggg and #781. The patch you mention above is the one for the run card, which I was not interested in testing or picking up yet.

valassi commented 11 months ago

PS The reason I did a cherry-pick and not a merge is that I would not know which branch to merge... I prefer to let you do that. Here I just wanted to test the ggttggg patch, so I picked only that one.

oliviermattelaer commented 11 months ago

Won't cherry-picking like that create unnecessary conflicts? In any case, I have merged 3.x into the gpucpp branch and no conflict was generated. And I have applied the above patch on top of it.

Andrea, where are the instructions to update all the generated code? (Or does the CI/CD do that automatically?)

Olivier

valassi commented 11 months ago

Then maybe it makes more sense not to include run_card_class at all here, and to change the gpucpp to not complain if this does not exist (let me do that)

Thanks @oliviermattelaer - as discussed offline, I have now removed run_card_class by reverting an earlier change: https://github.com/madgraph5/madgraph4gpu/pull/793/commits/aae8ef15559e794283328c341b33e4703afc7642

Now this works fine, thanks to your changes in mg5amcnlo.

I have now moved to your latest mg5amcnlo, which merges 3.5.2. I have regenerated all processes and rerun all tests, and everything is fine, so I will merge this.

Andrea, where are the instructions to update all the generated code? (Or does the CI/CD do that automatically?)

I have created a script, epochX/cudacpp/CODEGEN/allGenerateAndCompare.sh, which regenerates all 15 processes in the repo. I have been using it routinely for a few weeks.
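Roughly, the workflow is something like this (the exact invocation and working directory of the script are an assumption here):

# from the top of the madgraph4gpu repository
./epochX/cudacpp/CODEGEN/allGenerateAndCompare.sh
git status          # any stale generated code shows up here as modified files
git diff --stat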

Anyway, the CI now does a codegen from scratch and does NOT check that the code is the same as that in the repo. So there is no need for you to regenerate the code and recommit it; I can do that. As you prefer.

valassi commented 11 months ago

OK, the new CI tests succeed - I am merging this.

(As discussed, let's keep #788 for next week instead.)