Closed mducle closed 5 months ago
4 files 106 suites 16m 7s :stopwatch: 680 tests 662 :white_check_mark: 18 :zzz: 0 :x: 1 890 runs 1 854 :white_check_mark: 36 :zzz: 0 :x:
Results for commit 46e6a864.
:recycle: This comment has been updated with latest results.
Attention: Patch coverage is 53.84615%
with 60 lines
in your changes are missing coverage. Please review.
Project coverage is 40.61%. Comparing base (
735d30a
) to head (bc69619
). Report is 2 commits behind head on master.
Files | Patch % | Lines |
---|---|---|
swfiles/sw_mex.m | 0.00% | 58 Missing :warning: |
swfiles/@spinw/spinwave.m | 96.66% | 2 Missing :warning: |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@RichardWaiteSTFC this PR is ready for review now. I've decided to implement the mex Sperp
calculation in another PR because it requires extensive changes to spinwave.m
and not just changing the swloop.cpp
file.
The new mex file uses Eigen for linear algebra and standard C++11 threads. Eigen operations are numerically slightly different from the BLAS/LAPACK/MKL routines used by Matlab and other mex files so there will be discrepancies. As such the unit tests are set to use the old routines at the moment.
The C++ code in
swloop.cpp
should be a reproduction of the Matlab code inswloop.m
(commented out) but there are some differences... In the Matlab code, the correlation functions are calculated with:But in C++ I had to use the equivalent of:
(e.g. the
zed
vector conjugate order is swapped and a plain transpose instead of an adjoint (conjugate-transpose) used for the eigenvectorsV
). This produces aSab
matrix which is the transposed or complex-conjugate of the Matlab one and so fails the unit test but would otherwise produce the wrong intensities... (and so fail the system tests).Finally, Eigen
eig
/chol
operations are serial unlike MKL so will be much slower for large system. So, the new code has a check and will use the old mex files for these systems (so ifnMagExt
is larger than 250 then it uses the old mex). This can be seen if running theBiFeO3
performance test withnExt=0.01
which givesnMagExt=768
. In this case using the newmex
files takes about 10x longer (about 90min vs 10min) than the oldmex
files. However, for theFMchain
test the newmex
file is about 10x faster than the old ones because the old files spent much more time in the non-linear algebra-parallelised part of the loop which is now parallelised.Running the tests in the
sw_mex
file, I'm also getting speed ups compared to the old mex files but not as large. For the old mex files on my i9-10885H 2.4Gz 8-core laptop I get:(note the Hermitian Mex medium model calculation is flaky and about 50% of the time takes over 2min instead of 20s).
Whereas with the new mex file, I get:
Running on a large DAaaS workspace with
nthreads=30
with the new mex I get:There are some speed-ups implemented in a related PR on powder fitting which should be ported back to this PR and then this PR should be merged and the powder fitting PR refactored.
Whilst profiling a powder fitting of an incommensurate model, the following bottlenecks were identified:
dQ
convolution in sw_instrument.m (16%)Sperp
fromSab
in sw_neutron.m (4%)Where percent in brackets is the fractional run time. The fractional runtime for
swloop
is 42%.The following needs to implemented:
sw_loop.cpp
to compute the incommensurate rotatedSab
in C++- [ ] Change(Move to new PR)sw_loop.cpp
to (optionally) computeSperp
in C++ (so avoid usingsw_neutron
for some applications, e.g. Horace andpowspec
.- [ ] For Horace, also change(Move to new PR)sw_loop.cpp
to only compute Sab for the positive eigenvalues (this is what the current spinwavefast.m does).mex
file to compute the dQ convolution. (Done in powder fitting PR. gives a 3x speed up of the dQ convolution, so it now takes 7% of total time).- [ ] (Possibly) add a new mex file to perform the(Move to new PR)sw_egrid
rebinning.