Closed gitpeterwind closed 3 years ago
@stigrj , What was missing in the exchangeD1 for full functionality of the not-on-fly setup?
I don't remember. Is it not working?
You said a month ago that the non-diagonal terms were not computed
It is working for normal HF, but in a commit this summer PR you switched it off for some reason. For example what is "dagger"?
Check this discussion: https://github.com/gitpeterwind/mrchem/pull/2
Seems the problem was mixing several OrbitalVectors in the bank at the same time, but this should be seamless now with the new BankAccounts. dagger()
is the adjoint operator. For response (D2) this means swapping x_i
and y_i
terms:
K_1.apply(phi_p) = sum_i x_i * P[phi_i.dagger() * phi_p] +
sum_i phi_i * P[ y_i.dagger() * phi_p]
K_1.dagger(phi_p) = sum_i phi_i * P[ x_i.dagger() * phi_p] +
sum_i y_i * P[phi_i.dagger() * phi_p]
Ah, thanks maybe I start to understand: K.dagger is actually more "K.dagger.apply"?
The test fails, but the results difference are within world_prec. The new code will compute the exchange terms in a different order, so it is not so surprising. Does this mean I should update the test results?
I can have a look at the tests. Is this otherwise ready?
Will the exchange terms be computed in arbitrary (not reproducible) order?
When run on one MPI, it will be reproducible. (it is very hard to make a code that is not!). What is there is ready, but I thought having a further look at this dagger.
Is it possible to force it reproducible with numerically_exact = true
?
You mean that it will be the same, even with many mpi? (If it is useful I could try, but it means a very different approach: for example forcing one mpi to do all the work, and things like that)
It has proved useful in the past, but it has to make sense also in the code. The changes in the HF energies seems to have been to the better, so I will update the test references.
Merging #364 (3b67498) into master (4dd979d) will increase coverage by
0.35%
. The diff coverage is65.56%
.
@@ Coverage Diff @@
## master #364 +/- ##
==========================================
+ Coverage 69.18% 69.54% +0.35%
==========================================
Files 171 172 +1
Lines 13307 13457 +150
==========================================
+ Hits 9207 9358 +151
+ Misses 4100 4099 -1
Impacted Files | Coverage Δ | |
---|---|---|
src/parallel.cpp | 80.43% <ø> (-2.18%) |
:arrow_down: |
src/utils/Bank.cpp | 32.35% <41.93%> (+4.69%) |
:arrow_up: |
src/mrenv.cpp | 97.54% <50.00%> (ø) |
|
...c/qmoperators/two_electron/ExchangePotentialD1.cpp | 70.56% <64.00%> (+42.64%) |
:arrow_up: |
src/qmfunctions/orbital_utils.cpp | 67.06% <72.32%> (-0.58%) |
:arrow_down: |
src/driver.cpp | 70.35% <100.00%> (+0.04%) |
:arrow_up: |
src/qmoperators/two_electron/ExchangePotential.cpp | 97.05% <100.00%> (+8.65%) |
:arrow_up: |
src/utils/Bank.h | 100.00% <100.00%> (ø) |
|
tests/qmfunctions/orbital_vector.cpp | 100.00% <100.00%> (ø) |
|
src/qmfunctions/qmfunction_utils.cpp | 87.91% <0.00%> (-6.05%) |
:arrow_down: |
... and 7 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 4dd979d...3b67498. Read the comment docs.
I was certainly wrong about how complicated it was to include the "numerically exact" flag, one extra line :)
I get this error if I run with the "old" setup --nodes=1 --ntasks-per-node=8 --cpus-per-task=16
and mpirun --map-by numa --bind-to numa
on Betzy:
Error: /cluster/home/stig/src/mrchem/src/utils/Bank.cpp: openAccount(), line 607: Account id mismatch!
Could you give me your .inp file?
/cluster/projects/nn4654k/stig/benzene.inp
The id bug was when bank_size=2. In that case the task manager would be run by one of (2) the bank in order not to occupy half of the banks, but that was not tested properly.
For H2O there are no real differences between the precomputed and non-precomputed results. but for Benzene there is a difference is due to different approximation in the integral in the non-symmetric case. I have now forced the two to give the same results.
I think it is ready now
Exchange (saved, i.e. not on the fly) is twice as fast for large systems. Improvement in the parallelization setup:
More upgrade of Bank:
An example of setup for Betzy, where the bank processes do not "occupy" a full worker process, but only one core instead. This allows to better take advantage of all the cores on a node.