ikabadzhov commented 2 years ago

Previously rootbench/root/tree/dataframe/RDataFrameOpenDataBenchmarks.cxx was outdated.

There are several imporvements in https://github.com/root-project/opendata-benchmarks, which are now applied (in compiled, jitted and helping functions). Main optimizations come from removing redundant variables/operations/etc.

eguiraud commented 2 years ago

@oshadura any idea what's wrong with the CI?

oshadura commented 2 years ago

@eguiraud there is a huge timeout on actual step of execution of benchmarks (> 5h)...I will check asap

oshadura commented 2 years ago

@eguiraud I checked and it seems like the problem could be with rootbench-RoofitBinnedBenchmark which takes one hour on my laptop:

 9/20 Test  #9: rootbench-VecOpsRVec ...................   Passed   50.02 sec
      Start 10: rootbench-PyROOTBenchmarks
10/20 Test #10: rootbench-PyROOTBenchmarks .............   Passed   17.31 sec
      Start 11: rootbench-RoofitBinnedBenchmark
11/20 Test #11: rootbench-RoofitBinnedBenchmark ........   Passed  3608.97 sec
      Start 12: rootbench-RoofitUnBinnedBenchmark
12/20 Test #12: rootbench-RoofitUnBinnedBenchmark ......   Passed   12.22 sec
      Start 13: rootbench-benchAddPdf
13/20 Test #13: rootbench-benchAddPdf ..................   Passed   12.27 sec
      Start 14: rootbench-benchGauss
14/20 Test #14: rootbench-benchGauss ...................   Passed   17.01 sec
      Start 15: rootbench-benchJohnson
15/20 Test #15: rootbench-benchJohnson .................   Passed   17.04 sec
      Start 16: rootbench-RDFBenchmarks

eguiraud commented 2 years ago

Thanks @oshadura . @guitargeek @lmoneta can we reduce the runtime of that benchmark?

oshadura commented 2 years ago

@eguiraud looking logs I also noticed that there is some issue with clad (seg faults) available(?) through conda root-nightly package:

6: Test command: /home/runner/work/rootbench/rootbench/build/root/hist/hist/FitGradBenchmarks "--benchmark_out_format=csv" "--benchmark_out=rootbench-gbenchmark-FitGradBenchmarks.csv" "--benchmark_color=false"
6: Environment variables: 
6:  LD_LIBRARY_PATH=/usr/share/miniconda/envs/root-nightly/lib:
6: Test timeout computed to be: 0
6: 2022-01-06T14:36:39+00:00
6: Running /home/runner/work/rootbench/rootbench/build/root/hist/hist/FitGradBenchmarks
6: Run on (2 X 2294.69 MHz CPU s)
6: CPU Caches:
6:   L1 Data 32 KiB (x2)
6:   L1 Instruction 32 KiB (x2)
6:   L2 Unified 256 KiB (x2)
6:   L3 Unified 51200 KiB (x1)
6: Load Average: 1.01, 0.90, 0.50
6: 
6:  *** Break *** segmentation violation
6:  Generating stack trace...
 6/22 Test  #6: rootbench-FitGradBenchmarks ............***Failed    0.62 sec

eguiraud commented 2 years ago

To be reported to conda-forge/root-feedstock or to clad I guess

vgvassilev commented 2 years ago

Can you paste the stack trace here?

eguiraud commented 2 years ago

Regarding the roofit stuff, @guitargeek will take a look.

lmoneta commented 2 years ago

I have checked the roofit tests on my Ubuntu machine and I see with current master:

Test project /home/moneta/rootgit/rootbench-build/root/roofit/roofit
    Start 1: rootbench-RoofitBinnedBenchmark
1/2 Test #1: rootbench-RoofitBinnedBenchmark .....   Passed  190.48 sec
    Start 2: rootbench-RoofitUnBinnedBenchmark
2/2 Test #2: rootbench-RoofitUnBinnedBenchmark ...   Passed   13.34 sec

with 6.24:

Test project /home/moneta/rootgit/rootbench_dev-build/root/roofit/roofit
    Start 1: rootbench-RoofitBinnedBenchmark
1/2 Test #1: rootbench-RoofitBinnedBenchmark .....   Passed  178.79 sec
    Start 2: rootbench-RoofitUnBinnedBenchmark
2/2 Test #2: rootbench-RoofitUnBinnedBenchmark ...   Passed   11.78 sec

maybe an issue with the CI machine, it could be it is using an older master version that has some issue

eguiraud commented 2 years ago

it could be it is using an older master version that has some issue

Ah yes, if runtimes of the benchmark changed dramatically in the last couple of months, that can definitely be it. The CI is definitely using an older master version, the last conda nightly build is from some time ago.

guitargeek commented 2 years ago

Thanks @lmoneta for checking, that is good news! Apparently the situation in master was already fixed by this commit that was merged a few days ago:

https://github.com/root-project/root/pull/9458/commits/d6204cd14dc1c92a3aa208371e0bfe508812acf2

So my assumption that the huge increase in time was dominated by the absence of the binned fit optimization was wrong. It was this unnecessarily slow computation of the output buffer sizes that got fixed in the commit above.

I'll still have to take care of the binned fit optimization though, because the ATLAS fits still get slowed down a lot by its absence (that's why I assumed that was also the issue here).

eguiraud commented 2 years ago

@oshadura the conda nightlies are back, but it looks like I don't have sufficient permissions to re-trigger the CI job. Could you please do it (or give me permission to)?

oshadura commented 2 years ago

@eguiraud you have admin permissions :( so I am not sure what else is missing

oshadura commented 2 years ago

I can't retrigger as well, maybe because the build is very old?

oshadura commented 2 years ago

@ikabadzhov can you try to make push something in your branch please?

ikabadzhov commented 2 years ago

Reading the log, almost 6 hours were spent only waiting to run root/math/genvector/GenVectorBenchmarks, which I found here - so it seems that poblem comes from this file?

eguiraud commented 2 years ago

I think GenVectorBenchmarks is just what was being executed when the CI hit the six hour timeout, but we accumulated 6 hours of execution time with the various benchmarks above.

I couldn't find which tests are running long though, @oshadura can you tell?

P.S. I started a new build

eguiraud commented 2 years ago

244 should fix the CI timeouts

eguiraud commented 2 years ago

@ikabadzhov can you please rebase your PR on latest master?

oshadura commented 2 years ago

@eguiraud I was trying to add rebase feature in rootbench https://github.com/root-project/rootbench/pull/246 should we maybe merge it and try on this PR?

ikabadzhov commented 2 years ago

Hi, @oshadura, I saw your message. I recall that I also received this error locally. It is my mistake that I have not "pushed" these 2 lines inside the 6th benchmark:

   using ROOT::Math::PtEtaPhiMVector;
   using ROOT::VecOps::Construct;

I will now revert this merge and add the fixes.

eguiraud commented 2 years ago

thanks ivan! just add the fixes in a new PR, no need to revert the merge :smile:

oshadura commented 2 years ago

Yes please no need to revert :) also, I would like to check if CI actually testing rdf benchmarks or it is long benchmarks?

eguiraud commented 2 years ago

Btw this was still waiting for a review from me before being marked mergeable :sweat_smile: I'll do a post-merge review when I get to it...

oshadura commented 2 years ago

@eguiraud ah sorry I thought you reviewed it (I didn't notice)

root-project / rootbench

Update to the latest RDataFrameOpenDataBenchmarks #242

244 should fix the CI timeouts