CederGroupHub / chgnet

Pretrained universal neural network potential for charge-informed atomistic modeling https://chgnet.lbl.gov
https://doi.org/10.1038/s42256-023-00716-3
Other
226 stars 61 forks source link

`ruff` fixes #184

Closed DanielYang59 closed 20 hours ago

DanielYang59 commented 1 month ago

Summary

Relocate to another PR

DanielYang59 commented 1 month ago

@janosh Can you please approve the workflow? Thanks!

DanielYang59 commented 1 month ago

Thanks! I believe the test failure is owing to the lack of support of NumPy 2.0 from pymatgen's side https://github.com/materialsproject/pymatgen/issues/3953. Not sure what's the best practice at this moment as pymatgen is waiting for chgnet to support NumPy 2.0 too https://github.com/materialsproject/pymatgen/pull/3894#issuecomment-2190098821 (perhaps use np2 for build system and np1 for runtime dependency?)

janosh commented 1 month ago

Pymatgen should go first and ignore failing chgnet tests

DanielYang59 commented 1 month ago

Yes I just realized the same, pymatgen should build against np2 first :) And it would be compatible with np1 at runtime

DanielYang59 commented 1 month ago

@janosh It looks like you have to approve the workflow again for some reason?

janosh commented 1 month ago

until your first PR to a repo is merged, workflows need to be re-approved on every commit. it's quite annoying

DanielYang59 commented 2 days ago

Hi @janosh if i remember correctly, this cannot be merged until a new release of pymatgen is issued.

janosh commented 2 days ago

@DanielYang59 thanks for the reminder. pymatgen v2024.9.10 should be on PyPI in a few minutes

DanielYang59 commented 2 days ago

@janosh Have to ping you to approve the workflow (again), hopefully no test fails this time

DanielYang59 commented 1 day ago

@janosh Should be good this time :)

DanielYang59 commented 1 day ago

Looks like something is still not right with int type for windows (with pymatgen 2024.9.10), changing to my Windows PC for debugging...

____________________ ERROR collecting tests/test_model.py _____________________
tests\test_model.py:14: in <module>
    graph = CrystalGraphConverter()(structure, graph_id="test-model")
C:\hostedtoolcache\windows\Python\3.12.5\x64\Lib\site-packages\torch\nn\modules\module.py:1511: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
C:\hostedtoolcache\windows\Python\3.12.5\x64\Lib\site-packages\torch\nn\modules\module.py:1520: in _call_impl
    return forward_call(*args, **kwargs)
chgnet\graph\converter.py:135: in forward
    graph = self.create_graph(
chgnet\graph\converter.py:255: in _create_graph_fast
    nodes, dir_edges_list, undir_edges_list, undirected_edges = make_graph(
chgnet\graph\cygraph.pyx:65: in chgnet.graph.cygraph.make_graph
    const long[::1] center_index,
E   ValueError: Buffer dtype mismatch, expected 'const long' but got 'long long'
janosh commented 1 day ago

based on the numpy docs for intp_t, maybe we should change the type signature to this?

def make_graph(
    const intp_t[::1] center_index,
    const intp_t n_e,
    const intp_t[::1] neighbor_index,
    const intp_t[:, ::1] image,
    const float64_t[::1] distance,
    const intp_t num_atoms
):

prev

def make_graph(
    const long[::1] center_index,
    const long n_e,
    const long[::1] neighbor_index,
    const long[:, ::1] image,
    const double[::1] distance,
    const long num_atoms
):
DanielYang59 commented 1 day ago

Yes! Changed to numpy/cython type, tests seem to pass locally on my Windows PC

Note though, intp_t/float64_t is not standard type but from numpy, also intp_t is platform dependent.

Test NP1 on windows passes:

$ pip freeze | grep numpy
numpy==1.26.0
(venv)
yang@Yang-NUC MINGW64 /d/chgnet (numpy2)
$ pytest tests/test_graph.py
============================================================================ test session starts ============================================================================ 
platform win32 -- Python 3.12.0, pytest-8.3.3, pluggy-1.5.0
rootdir: D:\chgnet
configfile: pyproject.toml
plugins: cov-5.0.0
collected 8 items

tests\test_graph.py ........                                                                                                                                           [100%] 

============================================================================= 8 passed in 2.55s =============================================================================
DanielYang59 commented 1 day ago

I would revert 45a1fd75dcbb622a4d472d5004371cd1a87b6d69, numpy is needed to cythonize and cannot be imported lazily. I believe the problem is:

https://github.com/CederGroupHub/chgnet/blob/73d6096790bbb1e20d5f2e1e23119cca34e54483/.github/workflows/test.yml#L35-L41

As build is using a isolate environment (PEP-517)

DanielYang59 commented 1 day ago

Looks like there is another error just on windows for some reason:

____________________ ERROR collecting tests/test_model.py _____________________
chgnet\graph\converter.py:146: in forward
    bond_graph, undirected2directed = graph.line_graph_adjacency_list(
chgnet\graph\graph.py:272: in line_graph_adjacency_list
    raise ValueError(
E   ValueError: Error: number of directed edges=672 != 2 * number of undirected edges=473!This indicates directed edges are not complete
The above exception was the direct cause of the following exception:
tests\test_model.py:14: in <module>
    graph = CrystalGraphConverter()(structure, graph_id="test-model")
C:\hostedtoolcache\windows\Python\3.10.11\x64\lib\site-packages\torch\nn\modules\module.py:1553: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
C:\hostedtoolcache\windows\Python\3.10.11\x64\lib\site-packages\torch\nn\modules\module.py:1562: in _call_impl
    return forward_call(*args, **kwargs)
chgnet\graph\converter.py:153: in forward
    raise RuntimeError(
E   RuntimeError: Failed creating bond graph for test-model, check bond_graph_error.cif

Would have a look tomorrow.

DanielYang59 commented 1 day ago

Also there is another seemingly error for Ubuntu/MacOS, not sure if this is related to this PR:

tests/test_encoders.py .................
!!!!!! the two directed edges are equal but this operation is not supposed to happen
!!!!!! the two directed edges are equal but this operation is not supposed to happen
tests/test_graph.py ........
janosh commented 1 day ago

maybe @BowenD-UCB can chime on this test failure which oddly only happens on Windows? perhaps this signals that one of the dtype changes in make_graph is still platform dependent or needs to be reverted?

chgnet\graph\converter.py:148: in forward
    bond_graph, undirected2directed = graph.line_graph_adjacency_list(
chgnet\graph\graph.py:272: in line_graph_adjacency_list
    raise ValueError(
E   ValueError: Error: number of directed edges=672 != 2 * number of undirected edges=473!This indicates directed edges are not complete
BowenD-UCB commented 1 day ago

maybe @BowenD-UCB can chime on this test failure which oddly only happens on Windows? perhaps this signals that one of the dtype changes in make_graph is still platform dependent or needs to be reverted?

chgnet\graph\converter.py:148: in forward
    bond_graph, undirected2directed = graph.line_graph_adjacency_list(
chgnet\graph\graph.py:272: in line_graph_adjacency_list
    raise ValueError(
E   ValueError: Error: number of directed edges=672 != 2 * number of undirected edges=473!This indicates directed edges are not complete

@DanielYang59 @janosh Yes I'll look into this later. Sorry about the delay on the PR, been a bit busy lately.

DanielYang59 commented 1 day ago

perhaps this signals that one of the dtype changes in make_graph is still platform dependent or needs to be reverted?

Concur, we should focus on inferred (implicit) types, i.e. np.array([0, 1]) whose type would still be platform dependent.

Instead of just changing the type in tests, we should also update the code where int is expected and automatically cast the type to int64. Otherwise users might still experience issues if they don't explicitly declare types.

However, I'm not familiar with the code base so if @BowenD-UCB could give a hand that would very helpful.

Sorry about the delay on the PR, been a bit busy lately.

It has not been delayed at all :) pymatgen just unblocked this.

DanielYang59 commented 1 day ago

@janosh I reverted all (I believe) numpy related change from this PR, and relocate NumPy v2 migration to another PR as it turns out there's more debugging needed than I expected.

Might need to replace the labels for this PR

DanielYang59 commented 20 hours ago

No problem, thanks for merging this.