Closed gelzinyte closed 10 months ago
Since the installation of matscipy
is not stable on all architectures we can't make it the default, but I am for making it an optional dependency and adding a keyword for using it to run_train
.
@ilyes319 if you would be happy with that I can do it quickly.
Can you look into why it's "not stable on all architectures" ? can you make it the default, but try-catch if it doesn't install properly and fall back on something else (and WARN) ? It is dangerous to make a really slow thing the default, lots of people will just not notice.
It just does not compile on my windows machine with my current compiler. It is also dangerous to put as default something hard to install. ASE has the advantage of being a robust dependency. But I am keen on having it as optional dependency. Btw @gelzinyte what is the average size of your molecules?
is windows support is worth compromising the speed on other architecture ? can we create two installation sequences, one for windows and one for other things ?
— Gábor
Gabor Csanyi Professor of Molecular Modelling Engineering Laboratory University of Cambridge
------- Original Message ------- On Tuesday, November 22nd, 2022 at 11:46, Ilyes Batatia @.***> wrote:
It just not compile on my windows machine with my current compiler. It is also dangerous to put as default something hard to install. ASE has the advantage of being a robust dependency. But I am keen on having it as optional dependency. Btw @.***(https://github.com/gelzinyte) what is the average size of your molecules?
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: @.***>
I think the problem is that matscipy are in the process of making wheels for windows (and debugging for other platforms) - see https://github.com/libAtoms/matscipy/issues/106 . So it sounds that in theory matscipy should be stable once that's completed?
Btw gelzinyte what is the average size of your molecules?
40 atoms
is windows support is worth compromising the speed on other architecture ? can we create two installation sequences, one for windows and one for other things ?
That's a good question. If one makes it an optional dependency, it does not compromise the speed. It is convenient to have windows compatibility, mainly because other packages are compatible, and I have often chosen alternatives based on that fact. Maybe we could ask @jameskermode about the state of windows wheels?
Sure we can get windows wheels to work. There’s a matscipy issue already, can you add a comment to raise priority of this?
I tried a bit more with the Windows wheel builder, but am hampered by not having physical access to a Windows machine to test on. I could try with a VM I suppose.
@ilyes319 what C/C++ compiler were you trying to use? It should be sufficient to ensure compiler is on the path and then set the CC
and CXX
environment variables to point to the compiler executable before installing, e.g. with bash and Visual C++
CC=cl.exe CXX=cl.exe pip install matscipy
To point out - with matscipy all the configs must have a cell, even if non-periodic. Otherwise I am getting unreasonably large energy/force values (as high as 1e10), without any errors (didn't dig any deeper).
To point out - with matscipy all the configs must have a cell, even if non-periodic. Otherwise I am getting unreasonably large energy/force values (as high as 1e10), without any errors (didn't dig any deeper).
I think I'd call that a matscipy bug. Especially if that implementation is derived from libAtoms, which I think it is, via @jameskermode, it should be able to support non-periodic conditions.
I think matscipy neighbourlist does work with non-periodic boundary conditions - if not it's indeed a bug, but should be reported in the matscipy repo, not here.
Please can you try these wheels? Need to download the relevant .zip, extract, then install with pip install WHEELFILE.whl
https://github.com/libAtoms/matscipy/actions/runs/3633584615
(Once we're happy they work we can release to PyPI so a simple pip install matscipy
will defaiult to using the binary wheels, falling back on the source tarball only if a suitable wheel is not available.)
Hey @jameskermode,
I am not sure if I understand the process. On which branch the WHEELFILE.whl file is present? This runs points to 22-meson, but I can not find it there.
Wheels are not committed to the branch but built in CI and stored as artefacts. If you follow this link then scroll down you can see the zip files for each platform: https://github.com/libAtoms/matscipy/actions/runs/3686909556
e.g. for Python 3.8 on Windows AMD64 this is the zip file containing the .whl
https://github.com/libAtoms/matscipy/suites/9842091659/artifacts/474285379
It works great !! I could install the package on my machine. Not sure if I need to report it here, but when I import matscipy.neighbours
I get the following error:
File "<stdin>", line 1, in <module>
File "C:\Users\Lilyes\anaconda3\envs\torch_nightly\lib\site-packages\matscipy\neighbours.py", line 35, in <module>
from . import ffi
File "C:\Users\Lilyes\anaconda3\envs\torch_nightly\lib\site-packages\matscipy\ffi.py", line 38, in <module>
from _matscipy import * # noqa
ModuleNotFoundError: No module named '_matscipy'
( I reported it there https://github.com/libAtoms/matscipy/issues/106)
Are there any compiled dependencies of the neighbourlist module in matscipy? we could just copy that module and it's Python dependencies.
Alternatively we can implement the same logic in PyTorch as well, so that the neighbour list generation is done on the GPU as well.
Neighbour list is in C which is why it’s fast. Writing a new one and getting all the bugs out is quite an undertaking (took several years to get this to be robust). Sure we’ll be able to fix the minor packaging problem, see the discussion on the linked issue.
I think these are both needed eventually. In the short term, the matscipy neighbour list is the best independent code there is and we should maintain it and package it. But I also agree that in the long run there will be a need for a fully featured neighbour list entirely on the GPU.
Surely something like HOOMD blue (or some other GPU-centric MD code) has a fully GPU neighbor list. Obviously there could be licensing issues, but we should look at something like that before rolling our own, I suspect.
Yeah, good idea
Perhaps an initial step would be to do all of the ASE neighborlist code with torch rather than numpy?
Simply translating ase from numpy to torch has already been done. I had it at some point on my fork but then switched to this one: https://github.com/felixmusil/torch_nl
@jameskermode The windows wheels works great! I think we can try to make it the defaults now! @davkovacs Could you open a PR for that?
great, glad we got this sorted for you
We still need to do a new release to PyPI but can do that soon
That would be great, thanks!
For merging #64:
@davkovacs Do we need anything more to merge #64 ?
more testing on periodic data will be necessary.
There is now a v0.8.0-rc8 pre-release of matscipy you can install from PyPI via
pip install matscipy==0.8.0rc8
A slight regression is that in adding Windows support and upgrading the build system we've temporarily dropped macOS arm64 wheels, but they'll be back before the v0.8.0 final release, and manual compiling on macOS isn't as hard as on Windows.
https://pypi.org/project/matscipy/0.8.0/ is now out, including Windows wheels, so you should be fine to make this change now
Amazing thank you!
Am I right that we could close this now?
yes closing.
Matscipy neighbourlist (from @davkovacs
matscipy
branch) is much faster: 5 seconds vs 5 HOURS on the graph construction step for 10 000 small molecules.