Open khuddzu opened 1 year ago
I saw the example in denspart/examples/horton3 and ran it with a single fchk file, but got an error that results.npz didn't exist. Does the program not make this file for you?
There may be several causes. Did you encounter this issue with the file included in the repository?
It seems the first command in the following script must have failed: https://github.com/theochem/denspart/blob/main/examples/horton3/runall.sh
What happens exactly when you run denspart-from-horton3 h2o_sto3g.fchk density.npz? Does this produce an error message on the terminal?
I get the same error if I run straight from the example:
(horton) khuddzu@moria:~/denspart/examples/horton3$ bash runall.sh
Loading file h2o_sto3g.fchk
Setting up grid
Computing stuff: 0 ... 10000 / 87300
basis
/home/khuddzu/mambaforge/envs/horton/lib/python3.11/site-packages/gbasis/contractions.py:415: RuntimeWarning: divide by zero encountered in divide
(2 * exponents / np.pi) ** (3 / 4)
density
Computing stuff: 10000 ... 20000 / 87300
basis
/home/khuddzu/mambaforge/envs/horton/lib/python3.11/site-packages/gbasis/evals/_deriv.py:142: RuntimeWarning: invalid value encountered in multiply
return np.tensordot(prim_coeffs, norm * zeroth_part * deriv_part, (0, 0))
density
Computing stuff: 20000 ... 30000 / 87300
basis
density
Computing stuff: 30000 ... 40000 / 87300
basis
density
Computing stuff: 40000 ... 50000 / 87300
basis
density
Computing stuff: 50000 ... 60000 / 87300
basis
density
Computing stuff: 60000 ... 70000 / 87300
basis
density
Computing stuff: 70000 ... 80000 / 87300
basis
density
Computing stuff: 80000 ... 87300 / 87300
basis
density
Traceback (most recent call last):
File "/home/khuddzu/mambaforge/envs/horton/bin/denspart-from-horton3", line 8, in <module>
sys.exit(main())
^^^^^^
File "/home/khuddzu/mambaforge/envs/horton/lib/python3.11/site-packages/denspart/adapters/horton3.py", line 219, in main
grid, data = prepare_input(
^^^^^^^^^^^^^^
File "/home/khuddzu/mambaforge/envs/horton/lib/python3.11/site-packages/denspart/adapters/horton3.py", line 85, in prepare_input
data = _compute_stuff(iodata, grid.points, gradient, orbitals, chunk_size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/khuddzu/mambaforge/envs/horton/lib/python3.11/site-packages/denspart/adapters/horton3.py", line 210, in _compute_stuff
assert (result["density"] >= 0).all()
AssertionError
Traceback (most recent call last):
File "/home/khuddzu/mambaforge/envs/horton/bin/denspart", line 8, in <module>
sys.exit(main())
^^^^^^
File "/home/khuddzu/mambaforge/envs/horton/lib/python3.11/site-packages/denspart/__main__.py", line 40, in main
data = np.load(args.in_npz)
^^^^^^^^^^^^^^^^^^^^
File "/home/khuddzu/mambaforge/envs/horton/lib/python3.11/site-packages/numpy/lib/npyio.py", line 427, in load
fid = stack.enter_context(open(os_fspath(file), "rb"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'density.npz'
Traceback (most recent call last):
File "/home/khuddzu/mambaforge/envs/horton/bin/denspart-write-extxyz", line 8, in <module>
sys.exit(main())
^^^^^^
File "/home/khuddzu/mambaforge/envs/horton/lib/python3.11/site-packages/denspart/utils/write_extxyz.py", line 73, in main
results = np.load(args.out_npz)
^^^^^^^^^^^^^^^^^^^^^
File "/home/khuddzu/mambaforge/envs/horton/lib/python3.11/site-packages/numpy/lib/npyio.py", line 427, in load
fid = stack.enter_context(open(os_fspath(file), "rb"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'results.npz'
Any ideas?
Not immediately. The creation of density.npz fails because negative densities are detected on some grid points, which should not happen. There could be numerical artifacts when using large basis sets that lead to tiny negative values, but the included example uses a minimal basis set, which should be well behaved.
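As a side note on why the RuntimeWarnings above and this assertion are connected: a NaN in the density array is enough to trip a non-negativity check, because NaN compares False against everything. A minimal illustration (hedged sketch with made-up density values, not denspart code):

```python
import math

# Hypothetical grid densities; one NaN, as the divide-by-zero warnings
# upstream could produce.
density = [0.32, 1.7e-4, math.nan, 0.05]

# A guard like `assert (density >= 0).all()` fails on NaN, because
# NaN >= 0 evaluates to False.
all_nonnegative = all(d >= 0 for d in density)
print(all_nonnegative)  # False: the NaN fails the >= 0 comparison
```

So an AssertionError here can indicate NaNs just as well as genuinely negative densities.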
What is your operating system and how did you install numpy and scipy?
Yeah, it is doing it on the example file as well. I am using Linux, and I installed numpy and scipy through mamba. numpy version: 1.25.0, scipy version: 1.11.1.
Am I using the wrong versions?
I'm not sure what the issue is, but I just posted a Jupyter notebook that I confirmed works perfectly using CoLab. So it may be an issue with the command-line tool.
https://github.com/theochem/horton3/blob/master/notebooks/demo_DensPart.ipynb
Hey guys. I copied this notebook onto the system I would run these calculations on. I am posting images of the error messages I am seeing. Everything goes smoothly up until the final cell. @tovrstra @PaulWAyers I am emailing the actual ipynb; I cannot share it here because GitHub says it does not support that file type in an issue. Thank you for your help. Best, Kate
I realized I didn't provide the most important "errors" that I am getting.
1. After running

# Make Becke-Lebedev molecular grid (using preset grid)
from grid.becke import BeckeWeights
from grid.molgrid import MolGrid
from grid.onedgrid import GaussChebyshev
from grid.rtransform import BeckeRTransform

oned = GaussChebyshev(100)
rgrid = BeckeRTransform(1e-4, 1.5).transform_1d_grid(oned)
grid = MolGrid.from_preset(mol.atnums, mol.atcoords, rgrid, "ultrafine", BeckeWeights())

the cell runs successfully, but I get this warning message:

/home/khuddzu/mambaforge/envs/horton/lib/python3.11/site-packages/grid/atomgrid.py:889: UserWarning: Lebedev weights are negative which can introduce round-off errors.
sphere_grid = AngularGrid(degree=deg_i, use_spherical=use_spherical)
2. After running

from gbasis.wrappers import from_iodata
from gbasis.evals.density import evaluate_density
import numpy as np  # We need this to evaluate the 1DM from the MOs

one_rdm = np.dot(mol.mo.coeffs * mol.mo.occs, mol.mo.coeffs.T)
basis = from_iodata(mol)
density = evaluate_density(one_rdm, basis[0], grid.points, coord_type=basis[1])

the cell also runs successfully, but I get this message:

/home/khuddzu/mambaforge/envs/horton/lib/python3.11/site-packages/gbasis/contractions.py:415: RuntimeWarning: divide by zero encountered in divide
(2 * exponents / np.pi) ** (3 / 4)
/home/khuddzu/mambaforge/envs/horton/lib/python3.11/site-packages/gbasis/evals/_deriv.py:142: RuntimeWarning: invalid value encountered in multiply
return np.tensordot(prim_coeffs, norm * zeroth_part * deriv_part, (0, 0))
When I run this demo code on our server, I get the error that I sent in the last comment. However, when I run it in Colab, I do not get the same error: warning 1 appears, but warning 2 does not.
I switched my numpy version from 1.25.0 (the version on our server) to 1.23.5 (the one on Colab) and nothing changed.
On the server, I printed this variable from gbasis contractions.py, where I am getting the error on line 415. I printed self.angmom_components_cart:

self.angmom_components_cart [[0 0 0]]
self.angmom_components_cart [[0 0 0]]
self.angmom_components_cart [[1 0 0]
 [0 1 0]
 [0 0 1]]
self.angmom_components_cart [[0 0 0]]
self.angmom_components_cart [[0 0 0]]

Does this look right to you guys? I can't seem to find a way to print the same thing in Colab.
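For context, these printed values are consistent with water in a minimal basis: four s shells (oxygen 1s and 2s, two hydrogen 1s) plus one p shell (oxygen 2p). A rough sketch of how such Cartesian components can be enumerated (hypothetical helper for illustration, not the gbasis implementation):

```python
def cartesian_components(angmom):
    """Enumerate Cartesian angular momentum components (i, j, k) with
    i + j + k == angmom, in x-before-y-before-z order. Illustrative only."""
    return [
        (i, j, angmom - i - j)
        for i in range(angmom, -1, -1)
        for j in range(angmom - i, -1, -1)
    ]

print(cartesian_components(0))  # [(0, 0, 0)] -> an s shell
print(cartesian_components(1))  # [(1, 0, 0), (0, 1, 0), (0, 0, 1)] -> a p shell
```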
Any ideas why I am getting these values? @tovrstra
I've tried to reproduce the problem on my machine, with a setup as close as possible to yours. Mine is:
When I run the command denspart-from-horton3 h2o_sto3g.fchk density.npz with that setup, I do not get the RuntimeWarning messages that you receive. These are the source of the trouble, I assume: they probably introduce NaNs or infs in your density data, which breaks the remaining steps as well.
Can you verify that you are using the versions of the packages at the same commits as I do?
Another potential difference could be the BLAS or LAPACK version that your NumPy (or SciPy) is linked to. Normally, that should be consistent with my setup, as I'm also using mambaforge. Still, the linkage can sometimes be counterintuitive. Can you compare the output of np.show_config() with mine?
In [1]: import numpy as np
In [2]: np.show_config()
openblas64__info:
libraries = ['openblas64_', 'openblas64_']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None)]
runtime_library_dirs = ['/usr/local/lib']
blas_ilp64_opt_info:
libraries = ['openblas64_', 'openblas64_']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None)]
runtime_library_dirs = ['/usr/local/lib']
openblas64__lapack_info:
libraries = ['openblas64_', 'openblas64_']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None), ('HAVE_LAPACKE', None)]
runtime_library_dirs = ['/usr/local/lib']
lapack_ilp64_opt_info:
libraries = ['openblas64_', 'openblas64_']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None), ('HAVE_LAPACKE', None)]
runtime_library_dirs = ['/usr/local/lib']
Supported SIMD extensions in this NumPy install:
baseline = SSE,SSE2,SSE3
found = SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,FMA3,AVX2
not found = AVX512F,AVX512CD,AVX512_KNL,AVX512_KNM,AVX512_SKX,AVX512_CLX,AVX512_CNL,AVX512_ICL
I have none of these libraries in /usr/local/lib, which means that NumPy will fall back to whatever was used to build the package on MambaForge.
np.show_config()
blas_armpl_info:
NOT AVAILABLE
blas_mkl_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/khuddzu/mambaforge/envs/horton/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/khuddzu/mambaforge/envs/horton/include']
blas_opt_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/khuddzu/mambaforge/envs/horton/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/khuddzu/mambaforge/envs/horton/include']
lapack_armpl_info:
NOT AVAILABLE
lapack_mkl_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/khuddzu/mambaforge/envs/horton/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/khuddzu/mambaforge/envs/horton/include']
lapack_opt_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/khuddzu/mambaforge/envs/horton/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/khuddzu/mambaforge/envs/horton/include']
Supported SIMD extensions in this NumPy install:
baseline = SSE,SSE2,SSE3
found = SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,FMA3,AVX2
not found = AVX512F,AVX512CD,AVX512_KNL,AVX512_KNM,AVX512_SKX,AVX512_CLX,AVX512_CNL,AVX512_ICL
Okay, so our np configs are definitely different, but I am unfamiliar with this configuration, so I do not know whether any of the differences matter.
mamba: mambaforge/condabin/mamba
numpy: 1.23.5
python: 3.11.4
scipy: 1.11.1
These are the commands I used to install. I also reinstalled after developing this issue, since you were not seeing the same thing.
# ! pip install git+https://github.com/theochem/iodata.git
# ! pip install git+https://github.com/theochem/grid.git
# ! pip install git+https://github.com/theochem/gbasis.git
# ! pip install git+https://github.com/theochem/denspart.git
I checked again on my side; it turns out I made a mistake. (I was running it in IPython, which pointed back to my Python packages outside the mamba env.) Second try:
>>> import numpy as np
>>> np.show_config()
blas_info:
libraries = ['cblas', 'blas', 'cblas', 'blas']
library_dirs = ['/home/toon/mambaforge/envs/denspart/lib']
include_dirs = ['/home/toon/mambaforge/envs/denspart/include']
language = c
define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
define_macros = [('NO_ATLAS_INFO', 1), ('HAVE_CBLAS', None)]
libraries = ['cblas', 'blas', 'cblas', 'blas']
library_dirs = ['/home/toon/mambaforge/envs/denspart/lib']
include_dirs = ['/home/toon/mambaforge/envs/denspart/include']
language = c
lapack_info:
libraries = ['lapack', 'blas', 'lapack', 'blas']
library_dirs = ['/home/toon/mambaforge/envs/denspart/lib']
language = f77
lapack_opt_info:
libraries = ['lapack', 'blas', 'lapack', 'blas', 'cblas', 'blas', 'cblas', 'blas']
library_dirs = ['/home/toon/mambaforge/envs/denspart/lib']
language = c
define_macros = [('NO_ATLAS_INFO', 1), ('HAVE_CBLAS', None)]
include_dirs = ['/home/toon/mambaforge/envs/denspart/include']
Supported SIMD extensions in this NumPy install:
baseline = SSE,SSE2,SSE3
found = SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,FMA3,AVX2
not found = AVX512F,AVX512CD,AVX512_SKX,AVX512_CLX,AVX512_CNL,AVX512_ICL,AVX512_SPR
Your version of NumPy seems to be linked to MKL, whereas mine is not. This means you must have a slightly different NumPy build. (It is not so much the version number that matters, but rather how it was compiled and packaged.)
Can you show the following outputs? (mine included for comparison)
$ mamba list numpy
# packages in environment at /home/toon/mambaforge/envs/denspart:
#
# Name Version Build Channel
numpy 1.25.2 py311h64a7726_0 conda-forge
$ mamba info
mamba version : 1.5.0
active environment : denspart
active env location : /home/toon/mambaforge/envs/denspart
shell level : 2
user config file : /home/toon/.condarc
populated config files : /home/toon/mambaforge/.condarc
conda version : 23.7.3
conda-build version : not installed
python version : 3.10.12.final.0
virtual packages : __archspec=1=x86_64
__glibc=2.37=0
__linux=6.4.13=0
__unix=0=0
base environment : /home/toon/mambaforge (writable)
conda av data dir : /home/toon/mambaforge/etc/conda
conda av metadata url : None
channel URLs : https://conda.anaconda.org/conda-forge/linux-64
https://conda.anaconda.org/conda-forge/noarch
package cache : /home/toon/mambaforge/pkgs
/home/toon/.conda/pkgs
envs directories : /home/toon/mambaforge/envs
/home/toon/.conda/envs
platform : linux-64
user-agent : conda/23.7.3 requests/2.31.0 CPython/3.10.12 Linux/6.4.13-200.fc38.x86_64 fedora/38 glibc/2.37
UID:GID : 1000:1000
netrc file : None
offline mode : False
(horton) khuddzu@moria:~$ mamba list numpy
# packages in environment at /home/khuddzu/mambaforge/envs/horton:
#
# Name Version Build Channel
numpy 1.25.0 pypi_0 pypi
numpy-base 1.25.2 py311hf175353_0
mamba (1.0.0) supported by @QuantStack
GitHub: https://github.com/mamba-org/mamba
Twitter: https://twitter.com/QuantStack
active environment : horton
active env location : /home/khuddzu/mambaforge/envs/horton
shell level : 3
user config file : /home/khuddzu/.condarc
populated config files : /home/khuddzu/mambaforge/.condarc
/home/khuddzu/.condarc
conda version : 22.9.0
conda-build version : not installed
python version : 3.10.6.final.0
virtual packages : __linux=4.15.0=0
__glibc=2.27=0
__unix=0=0
__archspec=1=x86_64
base environment : /home/khuddzu/mambaforge (writable)
conda av data dir : /home/khuddzu/mambaforge/etc/conda
conda av metadata url : None
channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
https://conda.anaconda.org/rdkit/linux-64
https://conda.anaconda.org/rdkit/noarch
https://conda.anaconda.org/conda-forge/linux-64
https://conda.anaconda.org/conda-forge/noarch
package cache : /home/khuddzu/mambaforge/pkgs
/home/khuddzu/.conda/pkgs
envs directories : /home/khuddzu/mambaforge/envs
/home/khuddzu/.conda/envs
platform : linux-64
user-agent : conda/22.9.0 requests/2.28.1 CPython/3.10.6 Linux/4.15.0-136-generic ubuntu/18.04.1 glibc/2.27
UID:GID : 1047:1001
netrc file : /home/khuddzu/.netrc
offline mode : False
If the way numpy is compiled can cause the program to produce NaN values, rendering it useless, that seems like a bigger issue to me. I can use it on Google Colab, but on both of the servers we have, HiPerGator (the supercomputer for UF) and moria (the Roitberg group's personal server), I get the same NaN values. I need to use it on one of these, because I am working with a lot of data. If there is a specific way you can guide me to set up the mamba env, then that is an easy fix. But if not, is there a way to change the source code to ensure NaN values are not computed? It seems like guarding against division by zero could be hard-coded. I am happy to help with this project, but I would need a bit of guidance to make sure I am not screwing up your mathematics.
My best guess is not to use Intel MKL. It is notorious for being an efficient way of getting wrong BLAS and LAPACK results. It may also be related to certain compiler optimizations being too aggressive. Hard to tell at this stage. It can be hard to predict when such problems will happen. Hence, guarding against them in higher level code is practically impossible. Even when you filter out the NaNs, you might not want to trust the finite values either.
To give some more solid advice or fix something, I should be able to reproduce your problem. I believe the outputs of the commands above could help me with that.
Okay, great! Thank you so much! Unfortunately, I do not believe I have much control over the use of Intel MKL on the servers.
Normally, this just depends on how you set up conda or mamba, which you can always do in your home directory; no admin permissions needed. Anyway, even if conda or mamba are installed centrally, it would be useful to reproduce your problem here. If you can share the output of conda info and conda list numpy, that should be possible.
conda info
(horton) khuddzu@moria:~$ conda info
active environment : horton
active env location : /home/khuddzu/mambaforge/envs/horton
shell level : 3
user config file : /home/khuddzu/.condarc
populated config files : /home/khuddzu/mambaforge/.condarc
/home/khuddzu/.condarc
conda version : 22.9.0
conda-build version : not installed
python version : 3.10.6.final.0
virtual packages : __linux=4.15.0=0
__glibc=2.27=0
__unix=0=0
__archspec=1=x86_64
base environment : /home/khuddzu/mambaforge (writable)
conda av data dir : /home/khuddzu/mambaforge/etc/conda
conda av metadata url : None
channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
https://conda.anaconda.org/rdkit/linux-64
https://conda.anaconda.org/rdkit/noarch
https://conda.anaconda.org/conda-forge/linux-64
https://conda.anaconda.org/conda-forge/noarch
package cache : /home/khuddzu/mambaforge/pkgs
/home/khuddzu/.conda/pkgs
envs directories : /home/khuddzu/mambaforge/envs
/home/khuddzu/.conda/envs
platform : linux-64
user-agent : conda/22.9.0 requests/2.28.1 CPython/3.10.6 Linux/4.15.0-136-generic ubuntu/18.04.1 glibc/2.27
UID:GID : 1047:1001
netrc file : /home/khuddzu/.netrc
offline mode : False
conda list numpy
(horton) khuddzu@moria:~$ conda list numpy
# packages in environment at /home/khuddzu/mambaforge/envs/horton:
#
# Name Version Build Channel
numpy 1.25.0 pypi_0 pypi
numpy-base 1.25.2 py311hf175353_0
So same as above. Let me know if there is anything else you need.
Found it... (and sorry for having missed your earlier comment with the conda outputs). The reason is that in SciPy 1.11.*, the factorial2 function was rewritten (in May this year).
With version 1.10.1, I get the following results:
>>> from scipy.special import factorial2
>>> factorial2(-1)
array(1.)
>>> factorial2(-2)
array(0.)
>>> factorial2(-3)
array(0.)
>>> factorial2(-4)
array(0.)
>>> factorial2(-5)
array(0.)
With version 1.11.1, I get different results, in line with the documentation:
>>> from scipy.special import factorial2
>>> factorial2(-1)
0
>>> factorial2(-2)
0
>>> factorial2(-3)
0
>>> factorial2(-4)
0
>>> factorial2(-5)
0
The GBasis code has several expressions with factorial2(2 * angmom_components_cart - 1) in the denominator. With SciPy 1.11.1, this results in a division by zero whenever any element of angmom_components_cart is zero, which happens in practically every case, irrespective of the type of basis set.
I believe the implementation in the latest SciPy deviates from the standard handling of negative arguments of double factorials. The way it is explained on Wikipedia is more common (in my experience). Ideally, the code above should produce the following:
>>> from scipy.special import factorial2
>>> factorial2(-1)
1
>>> factorial2(-2)
nan
>>> factorial2(-3)
-1
>>> factorial2(-4)
nan
>>> factorial2(-5)
0.3333333333333333
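For concreteness, those expected values follow from extending the double factorial to negative odd integers via the recurrence n!! = (n + 2)!! / (n + 2). An illustrative sketch (not code from gbasis or SciPy):

```python
import math

def factorial2_extended(n):
    """Double factorial extended to negative odd integers via the
    recurrence n!! = (n + 2)!! / (n + 2), as on Wikipedia. Negative
    even arguments are undefined (NaN). Illustrative sketch only."""
    if n >= 0:
        result = 1
        for k in range(n, 0, -2):
            result *= k
        return float(result)
    if n % 2 == 0:
        return math.nan
    return factorial2_extended(n + 2) / (n + 2)

print(factorial2_extended(-1))  # 1.0
print(factorial2_extended(-3))  # -1.0
print(factorial2_extended(-5))  # 0.3333333333333333
```

In particular, (-1)!! = 1 keeps the GBasis denominators finite for s-type components.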
@PaulWAyers Any thoughts? I'm inclined to open an issue on SciPy. Even if it gets fixed in a future version of SciPy, some workaround in GBasis will be needed.
There is an issue on SciPy with a request for comments on future changes to the factorial* functions in scipy.special. I've added a comment explaining our use case. See https://github.com/scipy/scipy/issues/18409
Hey! I changed my scipy version and it seems to be working well now! Thank you for all your help! I do think this is an error on their end, though. Glad you opened an issue with them; I'm sure horton isn't the only software running into this.
@PaulWAyers @tovrstra after yesterday's update of DensPart, this script is not working: https://github.com/theochem/horton3/blob/master/notebooks/demo_DensPart.ipynb
@bhavukRohilla this is a gbasis issue. For now, you need to use an older version of scipy, as described above.
@tovrstra I saw the issue you're discussing with scipy; that would be the easiest fix. I had been planning to look at gbasis and "wrap" the double factorial in a new function that returns the expected values, but I didn't get around to it. I think that would be the only sensible alternative if changing the behavior of scipy is too slow/hard.
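A minimal sketch of such a wrapper, assuming the goal is to reproduce the pre-1.11 SciPy convention that gbasis relied on, i.e. (-1)!! = 1 and all other negative arguments give 0 (the function name is hypothetical, not an actual gbasis API):

```python
def factorial2_compat(n):
    """Double factorial with the SciPy < 1.11 convention for negative
    integers: (-1)!! = 1 and other negative arguments give 0.
    Hypothetical helper name, not part of gbasis."""
    if n == -1:
        return 1.0
    if n < 0:
        return 0.0
    result = 1
    for k in range(n, 0, -2):
        result *= k
    return float(result)

print(factorial2_compat(-1))  # 1.0
print(factorial2_compat(-3))  # 0.0
print(factorial2_compat(5))   # 15.0
```

With a wrapper like this, denominators such as factorial2_compat(2 * l - 1) stay nonzero for l = 0, avoiding the divide-by-zero warnings regardless of the installed SciPy version.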
I just tested this again, having made some updates to denspart for small API changes in grid and gbasis. When you try again, please recreate your development environment with the latest versions of the dependencies.
Hi, I was wondering if I could get some guidance on how to compute MBIS charges for a directory of fchk files. I am running into some issues that are definitely user error and could use some help. Best, Kate