ACEsuit / mace

MACE - Fast and accurate machine learning interatomic potentials with higher order equivariant message passing.

Machines other than A100 compiling LAMMPS #496

Closed stargolike closed 5 days ago

stargolike commented 6 days ago

I compiled LAMMPS on a V100 machine, following the tutorial:

cd lammps
mkdir build-ampere
cd build-ampere
cmake \
    -D CMAKE_BUILD_TYPE=Release \
    -D CMAKE_INSTALL_PREFIX=$(pwd) \
    -D CMAKE_CXX_STANDARD=17 \
    -D CMAKE_CXX_STANDARD_REQUIRED=ON \
    -D BUILD_MPI=ON \
    -D BUILD_SHARED_LIBS=ON \
    -D PKG_KOKKOS=ON \
    -D Kokkos_ENABLE_CUDA=ON \
    -D CMAKE_CXX_COMPILER=$(pwd)/../lib/kokkos/bin/nvcc_wrapper \
    -D Kokkos_ARCH_AMDAVX=ON \
    -D Kokkos_ARCH_AMPERE100=ON \
    -D CMAKE_PREFIX_PATH=$(pwd)/../../libtorch-gpu \
    -D PKG_ML-MACE=ON \
    ../cmake
make -j 10
make install

But when I ran LAMMPS, it failed with the error "Kokkos::Cuda::initialize ERROR: likely mismatch of architecture". I found a related issue, https://github.com/kokkos/kokkos/issues/6565, and rewrote the command as follows:

cd lammps
mkdir build-4090
cd build-4090/
cmake \
    -D CMAKE_BUILD_TYPE=Release \
    -D CMAKE_INSTALL_PREFIX=$(pwd) \
    -D CMAKE_CXX_STANDARD=17 \
    -D CMAKE_CXX_STANDARD_REQUIRED=ON \
    -D BUILD_MPI=ON \
    -D BUILD_SHARED_LIBS=ON \
    -D PKG_KOKKOS=ON \
    -D Kokkos_ENABLE_CUDA=ON \
    -D CMAKE_CXX_COMPILER=$(pwd)/../lib/kokkos/bin/nvcc_wrapper \
    -D CMAKE_PREFIX_PATH=$(pwd)/../../libtorch-gpu \
    -D PKG_ML-MACE=ON \
    ../cmake

make -j 20
make install

Then I renamed the lmp binary to lmp_4090 and added its directory to PATH:

export PATH=$PATH:/root/lammps/build-4090

After that I ran

lmp_4090 -k on g 1 -sf kk -in deform.in

and it succeeded. I think the tutorial should take this situation into account.
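When an "architecture mismatch" error like the one above appears, it can help to check which Kokkos architecture options a build directory was configured with. The following is a minimal sketch, not part of the MACE or LAMMPS tooling: the helper name kokkos_arch_enabled is hypothetical, and it only reads options recorded as ON in CMakeCache.txt, so an architecture that Kokkos auto-detected may not be listed.

```shell
# List the Kokkos_ARCH_* options recorded as ON in a CMake build directory.
# Usage: kokkos_arch_enabled <build-dir>
# Assumption: options appear in CMakeCache.txt as NAME:TYPE=ON.
kokkos_arch_enabled() {
    grep -E '^Kokkos_ARCH_[A-Za-z0-9_]+:[A-Z]+=ON$' "$1/CMakeCache.txt" | cut -d: -f1
}
```

For example, running kokkos_arch_enabled /root/lammps/build-4090 would print one option name per line, or nothing if no architecture was set explicitly.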

wcwitt commented 6 days ago

It looks like things worked after you removed these architecture-related lines.

    -D Kokkos_ARCH_AMDAVX=ON \
    -D Kokkos_ARCH_AMPERE100=ON \

That's normal and expected - see these LAMMPS Kokkos docs for more details: https://docs.lammps.org/Build_extras.html#kokkos. The only thing that's potentially surprising is that you didn't need to provide your architecture to Kokkos, but probably Kokkos was able to auto-detect.

Can you explain what you are suggesting in more detail?
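For readers adapting the build to another GPU, the architecture option can be chosen from the card's CUDA compute capability. The sketch below is illustrative only: the function name kokkos_arch_flag is hypothetical, the table is incomplete, and the option names should be checked against the LAMMPS Kokkos documentation for the Kokkos version in use.

```shell
# Map a CUDA compute capability (e.g. "7.0") to a Kokkos architecture option.
# Illustrative, incomplete table; verify against the LAMMPS Kokkos docs.
kokkos_arch_flag() {
    case "$1" in
        6.0) echo "-D Kokkos_ARCH_PASCAL60=ON" ;;
        7.0) echo "-D Kokkos_ARCH_VOLTA70=ON" ;;   # e.g. V100
        7.5) echo "-D Kokkos_ARCH_TURING75=ON" ;;
        8.0) echo "-D Kokkos_ARCH_AMPERE80=ON" ;;  # e.g. A100
        8.6) echo "-D Kokkos_ARCH_AMPERE86=ON" ;;
        8.9) echo "-D Kokkos_ARCH_ADA89=ON" ;;     # e.g. RTX 4090
        9.0) echo "-D Kokkos_ARCH_HOPPER90=ON" ;;
        *)   echo "unknown compute capability: $1" >&2; return 1 ;;
    esac
}

# Example (assumes a driver recent enough to support the compute_cap query):
#   cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
#   kokkos_arch_flag "$cap"
```

The printed option would replace the Kokkos_ARCH_AMPERE100 line in the cmake command; the host option (Kokkos_ARCH_AMDAVX) depends on the CPU and is a separate choice.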

stargolike commented 5 days ago

Sorry, I didn't express my suggestion very well. I think we should improve the online tutorial so that it takes into account the setups that public users actually have.

wcwitt commented 5 days ago

The installation docs now say this

These instructions are for Cambridge-relevant machines and should be adapted as needed. In particular, take note of the architecture settings listed in the LAMMPS-Kokkos documentation (https://docs.lammps.org/Build_extras.html#kokkos).

The first sentence was there already, but I added the second. Do you think that is good enough?

stargolike commented 5 days ago

Thanks, dear developer. I'm happy that I could be of help.