Closed SyntaxSmith closed 3 days ago
Thanks for your feedback. Would you like to open a new issue on deepmodeling/abacus (https://github.com/deepmodeling/abacus-develop)? We will discuss the issues there.
Describe the bug
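With the current CUDA version, I have observed the following:
1. Multiple GPUs do not appear to be used. For larger systems, the process count given to mpirun -n cannot be set high, because the GPU memory used by all processes is crowded onto the first card.
2. The Docker version does not seem to support OMP multithreading: the CPU usage of each process stays at 100%, and setting the environment variable manually has no effect. Running multiple processes instead exhausts GPU memory.
3. The conda version reports an OOM error: xxmr2d:out of memory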
Expected behavior
No response
To Reproduce
No response
Environment
No response
Additional Context
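All of the bugs above occur with the precompiled versions. I have asked our ops team to compile locally, and I will retest once the build is complete.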
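For the first symptom (all MPI processes landing on the first GPU), a generic check is to launch each rank through a small wrapper that pins it to one GPU by its node-local rank. The sketch below is only an illustration under stated assumptions: the script name `bind_gpu.py`, the helper names, and the `abacus` executable name are hypothetical, and whether the precompiled builds honor `CUDA_VISIBLE_DEVICES` or `OMP_NUM_THREADS` has not been confirmed in this issue. The local-rank variables are the ones set by Open MPI, Intel MPI, and Slurm.

```python
import os
import sys

# Variables that common launchers set to the node-local rank
# (Open MPI, Intel MPI, Slurm). Other launchers may use other names.
LOCAL_RANK_VARS = ("OMPI_COMM_WORLD_LOCAL_RANK", "MPI_LOCALRANKID", "SLURM_LOCALID")


def local_rank(env):
    """Return the node-local MPI rank found in env, or 0 if none is set."""
    for name in LOCAL_RANK_VARS:
        if name in env:
            return int(env[name])
    return 0


def bound_env(env, n_gpus, omp_threads):
    """Return a copy of env that pins this rank to one GPU, round-robin."""
    new_env = dict(env)
    new_env["CUDA_VISIBLE_DEVICES"] = str(local_rank(env) % n_gpus)
    new_env["OMP_NUM_THREADS"] = str(omp_threads)
    return new_env


if __name__ == "__main__" and len(sys.argv) > 3:
    # Hypothetical usage: mpirun -n 4 python bind_gpu.py 2 4 abacus
    # argv[1] = GPUs per node, argv[2] = OMP threads per rank, rest = command.
    n_gpus, omp_threads = int(sys.argv[1]), int(sys.argv[2])
    cmd = sys.argv[3:]
    # Replace this process with the real command, using the pinned environment.
    os.execvpe(cmd[0], cmd, bound_env(os.environ, n_gpus, omp_threads))
```

If memory use is still concentrated on the first card with such a wrapper, the problem is more likely in how the build selects devices than in the launch command.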
Task list for Issue attackers (only for developers)