graphcore-research / pyscf-ipu

PySCF on IPU
https://github.com/graphcore-research/pyscf-ipu#readme
Apache License 2.0

nanoDFT fails with OOM with IPU backend #16

Open hatemhelal opened 1 year ago

hatemhelal commented 1 year ago

The default choice of 6-31G for the basis set results in nanoDFT failing with OOM when targeting the IPU backend.

repro:

python nanoDFT.py --backend ipu --float32 True

possible workaround:

python nanoDFT.py --backend ipu --float32 True --basis sto-3g

but it would be a better overall experience if the defaults worked out of the box for the IPU.
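For a rough sense of why the basis choice matters for memory, here is a minimal sketch that counts atomic orbitals and estimates the size of a dense four-index integral tensor. The per-atom function counts are the standard ones for STO-3G and 6-31G. Benzene is only an illustrative molecule, and the dense `N^4` float32 layout is an assumption that may not match what nanoDFT actually allocates.

```python
# Contracted basis functions per atom for the two basis sets discussed here.
BASIS_FUNCTIONS = {
    "sto-3g": {"H": 1, "C": 5, "N": 5, "O": 5, "F": 5},
    "6-31g": {"H": 2, "C": 9, "N": 9, "O": 9, "F": 9},
}


def num_atomic_orbitals(atoms, basis):
    """Total number of atomic orbitals for a list of element symbols."""
    table = BASIS_FUNCTIONS[basis.lower()]
    return sum(table[atom] for atom in atoms)


def dense_eri_bytes(n_ao, bytes_per_element=4):
    """Size of a dense N^4 tensor (assumption: float32, no symmetry used)."""
    return n_ao ** 4 * bytes_per_element


# Illustrative molecule only; not necessarily the nanoDFT default.
benzene = ["C"] * 6 + ["H"] * 6

for basis in ("sto-3g", "6-31g"):
    n_ao = num_atomic_orbitals(benzene, basis)
    megabytes = dense_eri_bytes(n_ao) / 1e6
    print(f"{basis}: {n_ao} AOs, dense N^4 tensor ~ {megabytes:.1f} MB")
```

Under these assumptions, going from sto-3g to 6-31G grows the tensor by roughly 11x for this molecule, which is the kind of jump that can push an IPU program over its on-chip memory budget.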

cc: @ajwilkinson

hatemhelal commented 1 year ago

@AlexanderMath is there any reason to prefer using the 6-31G basis by default? I'd like to make sto-3g the default.

hatemhelal commented 1 year ago

The nanoDFT notebook already implements the suggested workaround:

opts, _ = nanoDFT_options(backend="ipu", float32=True, basis="sto-3g")

AlexanderMath commented 1 year ago

No reason. A sto-3g default sounds reasonable for initial experimentation with the code. We probably removed a few memory optimizations from nanoDFT in favor of simplicity, hence the OOM. Did anyone reproduce the bug for density_functional_theory.py?

hatemhelal commented 1 year ago

Fixed the default in nanoDFT in 90dc61c

Did anyone reproduce bug for density_functional_theory.py ?

Haven't tried this yet.

ajwilkinson commented 1 year ago

OK, so it now works on a classic with sto-3g. One observation though: it compiles every time. Am I missing some configuration in the Poplar environment that would cache a previous compilation?

hatemhelal commented 1 year ago

The documentation for this is spread over quite a few resources. Both notebooks enable the executable cache through an environment variable:

import os

# JAX/XLA IPU compilation cache.
os.environ['TF_POPLAR_FLAGS'] = """
  --executable_cache_path=/tmp/ipu-ef-cache
"""

Can you try this out if you aren't using it already?

That said, I would expect to see recompilation if you are changing the molecular structure (e.g. adding/removing atoms) or the basis set.
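If you want this outside the notebooks, a small helper along these lines might do. It is only a sketch: `enable_ipu_executable_cache` is a hypothetical name, not something in pyscf-ipu, and it assumes the flag has to be in the environment before the IPU backend is initialised, so it should be called before importing jax.

```python
import os
from pathlib import Path


def enable_ipu_executable_cache(cache_dir="/tmp/ipu-ef-cache"):
    """Point TF_POPLAR_FLAGS at an executable cache directory.

    Hypothetical helper: leaves an existing --executable_cache_path
    untouched and preserves any other flags that are already set.
    """
    # Make sure the cache directory exists before anything writes to it.
    Path(cache_dir).mkdir(parents=True, exist_ok=True)

    existing = os.environ.get("TF_POPLAR_FLAGS", "")
    if "--executable_cache_path" not in existing:
        flag = f"--executable_cache_path={cache_dir}"
        os.environ["TF_POPLAR_FLAGS"] = f"{existing} {flag}".strip()
    return os.environ["TF_POPLAR_FLAGS"]
```

Calling `enable_ipu_executable_cache()` at the top of a script, before the jax import, should give the same behaviour as the notebook cell above.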

AlexanderMath commented 1 year ago

Added the following to both density_functional_theory.py and nanoDFT.py so it caches by default.

os.environ['TF_POPLAR_FLAGS'] = """--executable_cache_path=/tmp/ipu-pyscf-cache/"""

Note: The computational graph shouldn't change between molecules with the same number of atomic orbitals, so changing from [C, O] to [N, F] should not cause a recompilation.
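To illustrate the note, here is a toy sketch of a cache keyed on array shapes rather than on the atoms themselves. The real cache key is computed by the compiler from the graph. `shape_key` and `get_executable` are hypothetical stand-ins, and the sto-3g function counts are the standard ones.

```python
# sto-3g contracted basis functions per atom.
STO3G_FUNCTIONS = {"H": 1, "C": 5, "N": 5, "O": 5, "F": 5}

# Stand-in for the on-disk executable cache.
compiled = {}


def shape_key(atoms):
    """Shapes that determine the graph: atom count and atomic orbital count."""
    n_ao = sum(STO3G_FUNCTIONS[atom] for atom in atoms)
    return (len(atoms), n_ao)


def get_executable(atoms):
    """Return a cached 'executable', compiling only for unseen shapes."""
    key = shape_key(atoms)
    if key not in compiled:
        # Placeholder for the expensive compilation step.
        compiled[key] = f"executable for {key}"
    return compiled[key]


get_executable(["C", "O"])       # first call: compiles
get_executable(["N", "F"])       # same shapes: cache hit
get_executable(["O", "H", "H"])  # different shapes: compiles again
print(len(compiled))
```

In this sketch [C, O] and [N, F] both map to 2 atoms and 10 atomic orbitals, so only the water-like [O, H, H] case triggers a second compilation.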