Test different multiply kernels in testsuite

OrderN / CONQUEST-release

Full public release of large scale and linear scaling DFT code CONQUEST

http://www.order-n.org/

MIT License

96 stars 25 forks source link

Test different multiply kernels in testsuite #270

Closed tkoskela closed 1 month ago

tkoskela commented 7 months ago

While working on the threading implementations of the matrix multiply kernel, I realised it would be very useful to test more of them than just the default. Unless we can converge on a single optimal implementation I think we should test at least one threaded and one non-threaded implementation, perhaps also a gemm and a non-gemm one.

It can be fairly easily implemented by adding the multiply kernel into the test matrix. If we use spack to automate the build it's just a matter of changing the variant in the spack spec. I'm a bit concious of blowing up the number of jobs in the test matrix, let's discuss this in the next meeting.

tkoskela commented 7 months ago

We decided to test all the kernels in CI and see how it does

tkoskela commented 5 months ago

I'm testing this in #292 and I've got three failures. https://github.com/OrderN/CONQUEST-release/actions/runs/7034701802

ompGemm_m looks like it's running out of memory on the runner with 2 ranks and 2 threads. ompDoji is less clear. When running multi-threaded, It's aborting in

https://github.com/OrderN/CONQUEST-release/blob/463fd9c24818fd13e343c8fedc518708716dd6b4/src/McWeeny.f90#L495-L505

We had some discussion over whether we need to maintain all these kernels. I've left the PR as draft because I wouldn't necessarily want to run all these jobs on every commit, although they do run in parallel and don't add much to the wall clock time of the CI. It feels wasteful to test and maintain code that is not going to be used and trimming down the unused kernels would be better.

davidbowler commented 5 months ago

There's an open issue for the ompGemm_m problem: #280

I agree that we should leave it as draft for now; I'll try to find time to look at ompDoji sometime...

tkoskela commented 5 months ago

I fixed the issue in ompDoji, there was an extra parallel region left over. ompGemm_m also passed in https://github.com/OrderN/CONQUEST-release/actions/runs/7035086832 :shrug: maybe I got a machine with more memory. I'm trying to investigate #280

tkoskela commented 5 months ago

I've come up with a potential solution in #292:

By default the CI workflow only runs the default multiply kernel and excludes the rest, but you can dispatch the workflow manually and set multiply_kernel_test = true to run all the multiply kernels in the matrix. Unfortunately this is difficult to test until the change is in the default branch because of how the GitHub GUI works. You used to be able to manually dispatch from branches using the CLI, but the information seems to be gone from the page so it might have been removed.

tkoskela commented 1 month ago

Completed by #292