Closed fishjojo closed 4 months ago
@fishjojo bas_ij
and bas_kl
in CUDA kernels are the global indices in each contraction pattern, not the local indices in each bin. So offsets.log_q_ij
and offsets.log_q_kl
are also should be zero-based correspondingly.
I see, I didn't notice bas_kl
already has the offset included. Thanks for the explanation.
Actually log_q_kl
uses the local index.
https://github.com/pyscf/gpu4pyscf/blob/761f85ff22cf04d62167451dbf82df48caefb2e8/gpu4pyscf/lib/gvhf/g2e.cu#L64
This is what my original question about.
Actually
log_q_kl
uses the local index.This is what my original question about.
This could be a bug.
@fishjojo The bug has been fixed.
Thanks. I think screening at the shell level is worth more investigation. The density matrix screening may also need more tests.
Thanks. I think screening at the shell level is worth more investigation. The density matrix screening may also need more tests.
I agree with the first point. The density matrix screening is quite classical, right?
I feel this https://github.com/pyscf/gpu4pyscf/blob/c58c7d57e61aec029d2f07a433a8be823656c14e/gpu4pyscf/scf/hf.py#L135 is a bit unconventional.
I feel this
is a bit unconventional.
Well, it is certainly not the most elegant implementation. But the strategy is not something new. https://manual.q-chem.com/5.4/Ch5.S5.SS6.html
Maybe it's useful to add a check whether the input density is the density difference and 1e3
could be customizable.
This line https://github.com/pyscf/gpu4pyscf/blob/761f85ff22cf04d62167451dbf82df48caefb2e8/gpu4pyscf/lib/gvhf/nr_jk_driver.cu#L184 It seems to me should be inside the following loop https://github.com/pyscf/gpu4pyscf/blob/761f85ff22cf04d62167451dbf82df48caefb2e8/gpu4pyscf/lib/gvhf/nr_jk_driver.cu#L187-L188 and be
Maybe I missed something?