ACEsuit / mace

MACE - Fast and accurate machine learning interatomic potentials with higher order equivariant message passing.
Other
495 stars 181 forks source link

matscipy neighbour list as default? #52

Closed gelzinyte closed 10 months ago

gelzinyte commented 1 year ago

Matscipy neighbourlist (from @davkovacs matscipy branch) is much faster: 5 seconds vs 5 HOURS on the graph construction step for 10 000 small molecules.

davkovacs commented 1 year ago

Since the installation of matscipy is not stable on all architectures we can't make it the default, but I am for making it an optional dependency and adding a keyword for using it to run_train. @ilyes319 if you would be happy with that I can do it quickly.

gabor1 commented 1 year ago

Can you look into why it's "not stable on all architectures" ? can you make it the default, but try-catch if it doesn't install properly and fall back on something else (and WARN) ? It is dangerous to make a really slow thing the default, lots of people will just not notice.

ilyes319 commented 1 year ago

It just does not compile on my windows machine with my current compiler. It is also dangerous to put as default something hard to install. ASE has the advantage of being a robust dependency. But I am keen on having it as optional dependency. Btw @gelzinyte what is the average size of your molecules?

gabor1 commented 1 year ago

is windows support is worth compromising the speed on other architecture ? can we create two installation sequences, one for windows and one for other things ?

— Gábor

Gabor Csanyi Professor of Molecular Modelling Engineering Laboratory University of Cambridge

------- Original Message ------- On Tuesday, November 22nd, 2022 at 11:46, Ilyes Batatia @.***> wrote:

It just not compile on my windows machine with my current compiler. It is also dangerous to put as default something hard to install. ASE has the advantage of being a robust dependency. But I am keen on having it as optional dependency. Btw @.***(https://github.com/gelzinyte) what is the average size of your molecules?

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: @.***>

gelzinyte commented 1 year ago

I think the problem is that matscipy are in the process of making wheels for windows (and debugging for other platforms) - see https://github.com/libAtoms/matscipy/issues/106 . So it sounds that in theory matscipy should be stable once that's completed?

Btw gelzinyte what is the average size of your molecules?

40 atoms

ilyes319 commented 1 year ago

is windows support is worth compromising the speed on other architecture ? can we create two installation sequences, one for windows and one for other things ?

That's a good question. If one makes it an optional dependency, it does not compromise the speed. It is convenient to have windows compatibility, mainly because other packages are compatible, and I have often chosen alternatives based on that fact. Maybe we could ask @jameskermode about the state of windows wheels?

jameskermode commented 1 year ago

Sure we can get windows wheels to work. There’s a matscipy issue already, can you add a comment to raise priority of this?

jameskermode commented 1 year ago

I tried a bit more with the Windows wheel builder, but am hampered by not having physical access to a Windows machine to test on. I could try with a VM I suppose.

@ilyes319 what C/C++ compiler were you trying to use? It should be sufficient to ensure compiler is on the path and then set the CC and CXX environment variables to point to the compiler executable before installing, e.g. with bash and Visual C++

CC=cl.exe CXX=cl.exe pip install matscipy
gelzinyte commented 1 year ago

To point out - with matscipy all the configs must have a cell, even if non-periodic. Otherwise I am getting unreasonably large energy/force values (as high as 1e10), without any errors (didn't dig any deeper).

bernstei commented 1 year ago

To point out - with matscipy all the configs must have a cell, even if non-periodic. Otherwise I am getting unreasonably large energy/force values (as high as 1e10), without any errors (didn't dig any deeper).

I think I'd call that a matscipy bug. Especially if that implementation is derived from libAtoms, which I think it is, via @jameskermode, it should be able to support non-periodic conditions.

jameskermode commented 1 year ago

I think matscipy neighbourlist does work with non-periodic boundary conditions - if not it's indeed a bug, but should be reported in the matscipy repo, not here.

jameskermode commented 1 year ago

Please can you try these wheels? Need to download the relevant .zip, extract, then install with pip install WHEELFILE.whl

https://github.com/libAtoms/matscipy/actions/runs/3633584615

jameskermode commented 1 year ago

(Once we're happy they work we can release to PyPI so a simple pip install matscipy will defaiult to using the binary wheels, falling back on the source tarball only if a suitable wheel is not available.)

ilyes319 commented 1 year ago

Hey @jameskermode,

I am not sure if I understand the process. On which branch the WHEELFILE.whl file is present? This runs points to 22-meson, but I can not find it there.

jameskermode commented 1 year ago

Wheels are not committed to the branch but built in CI and stored as artefacts. If you follow this link then scroll down you can see the zip files for each platform: https://github.com/libAtoms/matscipy/actions/runs/3686909556

e.g. for Python 3.8 on Windows AMD64 this is the zip file containing the .whl

https://github.com/libAtoms/matscipy/suites/9842091659/artifacts/474285379

ilyes319 commented 1 year ago

It works great !! I could install the package on my machine. Not sure if I need to report it here, but when I import matscipy.neighbours I get the following error:

  File "<stdin>", line 1, in <module>
  File "C:\Users\Lilyes\anaconda3\envs\torch_nightly\lib\site-packages\matscipy\neighbours.py", line 35, in <module>
    from . import ffi
  File "C:\Users\Lilyes\anaconda3\envs\torch_nightly\lib\site-packages\matscipy\ffi.py", line 38, in <module>
    from _matscipy import *  # noqa
ModuleNotFoundError: No module named '_matscipy'

( I reported it there https://github.com/libAtoms/matscipy/issues/106)

stenczelt commented 1 year ago

Are there any compiled dependencies of the neighbourlist module in matscipy? we could just copy that module and it's Python dependencies.

Alternatively we can implement the same logic in PyTorch as well, so that the neighbour list generation is done on the GPU as well.

jameskermode commented 1 year ago

Neighbour list is in C which is why it’s fast. Writing a new one and getting all the bugs out is quite an undertaking (took several years to get this to be robust). Sure we’ll be able to fix the minor packaging problem, see the discussion on the linked issue.

gabor1 commented 1 year ago

I think these are both needed eventually. In the short term, the matscipy neighbour list is the best independent code there is and we should maintain it and package it. But I also agree that in the long run there will be a need for a fully featured neighbour list entirely on the GPU.

bernstei commented 1 year ago

Surely something like HOOMD blue (or some other GPU-centric MD code) has a fully GPU neighbor list. Obviously there could be licensing issues, but we should look at something like that before rolling our own, I suspect.

gabor1 commented 1 year ago

Yeah, good idea

stenczelt commented 1 year ago

Perhaps an initial step would be to do all of the ASE neighborlist code with torch rather than numpy?

davkovacs commented 1 year ago

Simply translating ase from numpy to torch has already been done. I had it at some point on my fork but then switched to this one: https://github.com/felixmusil/torch_nl

ilyes319 commented 1 year ago

@jameskermode The windows wheels works great! I think we can try to make it the defaults now! @davkovacs Could you open a PR for that?

jameskermode commented 1 year ago

great, glad we got this sorted for you

jameskermode commented 1 year ago

We still need to do a new release to PyPI but can do that soon

ilyes319 commented 1 year ago

That would be great, thanks!

ilyes319 commented 1 year ago

For merging #64:

@davkovacs Do we need anything more to merge #64 ?

davkovacs commented 1 year ago

more testing on periodic data will be necessary.

jameskermode commented 1 year ago

There is now a v0.8.0-rc8 pre-release of matscipy you can install from PyPI via

pip install matscipy==0.8.0rc8

A slight regression is that in adding Windows support and upgrading the build system we've temporarily dropped macOS arm64 wheels, but they'll be back before the v0.8.0 final release, and manual compiling on macOS isn't as hard as on Windows.

jameskermode commented 1 year ago

https://pypi.org/project/matscipy/0.8.0/ is now out, including Windows wheels, so you should be fine to make this change now

ilyes319 commented 1 year ago

Amazing thank you!

wcwitt commented 10 months ago

Am I right that we could close this now?

ilyes319 commented 10 months ago

yes closing.