@TheFloatingBrain Could you please provide your Open MPI version? Pasting the output of `ompi_info` could be a good start.
@wenduwan Thanks for the reply. I have to enter the following commands before `ompi_info` is recognized (the command is not found otherwise):
> source /etc/profile.d/modules.sh
> module load mpi/openmpi-x86_64

Here is the output:
> ompi_info
Package: Open MPI mockbuild@02dc1f9e2ab145fdb212b01bdd462369
Distribution
Open MPI: 5.0.2
Open MPI repo revision: v5.0.2
Open MPI release date: Feb 06, 2024
MPI API: 3.1.0
Ident string: 5.0.2
Prefix: /usr/lib64/openmpi
Configured architecture: x86_64-pc-linux-gnu
Configured by: mockbuild
Configured on: Mon Mar 4 00:00:00 UTC 2024
Configure host: 02dc1f9e2ab145fdb212b01bdd462369
Configure command line: '--prefix=/usr/lib64/openmpi'
'--mandir=/usr/share/man/openmpi-x86_64'
'--includedir=/usr/include/openmpi-x86_64'
'--sysconfdir=/etc/openmpi-x86_64'
'--disable-silent-rules' '--enable-builtin-atomics'
'--enable-ipv6' '--enable-mpi-java'
'--enable-mpi1-compatibility' '--enable-sphinx'
'--with-prrte=external' '--with-sge'
'--with-valgrind' '--enable-memchecker'
'--with-hwloc=/usr' '--with-libevent=external'
'--with-pmix=external'
Built by: mockbuild
Built on: Mon Mar 4 00:00:00 UTC 2024
Built host: 02dc1f9e2ab145fdb212b01bdd462369
C bindings: yes
Fort mpif.h: yes (all)
Fort use mpi: yes (full: ignore TKR)
Fort use mpi size: deprecated-ompi-info-value
Fort use mpi_f08: yes
Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
limitations in the gfortran compiler and/or Open
MPI, does not support the following: array
subsections, direct passthru (where possible) to
underlying Open MPI's C functionality
Fort mpi_f08 subarrays: no
Java bindings: yes
Wrapper compiler rpath: runpath
C compiler: gcc
C compiler absolute: /bin/gcc
C compiler family name: GNU
C compiler version: 14.0.1
C++ compiler: g++
C++ compiler absolute: /bin/g++
Fort compiler: gfortran
Fort compiler abs: /bin/gfortran
Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
Fort 08 assumed shape: yes
Fort optional args: yes
Fort INTERFACE: yes
Fort ISO_FORTRAN_ENV: yes
Fort STORAGE_SIZE: yes
Fort BIND(C) (all): yes
Fort ISO_C_BINDING: yes
Fort SUBROUTINE BIND(C): yes
Fort TYPE,BIND(C): yes
Fort T,BIND(C,name="a"): yes
Fort PRIVATE: yes
Fort ABSTRACT: yes
Fort ASYNCHRONOUS: yes
Fort PROCEDURE: yes
Fort USE...ONLY: yes
Fort C_FUNLOC: yes
Fort f08 using wrappers: yes
Fort MPI_SIZEOF: yes
C profiling: yes
Fort mpif.h profiling: yes
Fort use mpi profiling: yes
Fort use mpi_f08 prof: yes
Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
OMPI progress: no, Event lib: yes)
Sparse Groups: no
Internal debug support: no
MPI interface warnings: yes
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
dl support: yes
Heterogeneous support: no
MPI_WTIME support: native
Symbol vis. support: yes
Host topology support: yes
IPv6 support: yes
MPI extensions: affinity, cuda, ftmpi, rocm, shortfloat
Fault Tolerance support: yes
FT MPI support: yes
MPI_MAX_PROCESSOR_NAME: 256
MPI_MAX_ERROR_STRING: 256
MPI_MAX_OBJECT_NAME: 64
MPI_MAX_INFO_KEY: 36
MPI_MAX_INFO_VAL: 256
MPI_MAX_PORT_NAME: 1024
MPI_MAX_DATAREP_STRING: 128
MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.0.2)
MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA btl: self (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA btl: ofi (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA btl: sm (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA btl: tcp (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA btl: uct (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA btl: usnic (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v5.0.2)
MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component
v5.0.2)
MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component
v5.0.2)
MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA memchecker: valgrind (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA mpool: hugepage (MCA v2.1.0, API v3.1.0, Component v5.0.2)
MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component
v5.0.2)
MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA reachable: weighted (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA smsc: cma (MCA v2.1.0, API v1.0.0, Component v5.0.2)
MCA threads: pthreads (MCA v2.1.0, API v1.0.0, Component v5.0.2)
MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA bml: r2 (MCA v2.1.0, API v2.1.0, Component v5.0.2)
MCA coll: adapt (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: basic (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: han (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: inter (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: libnbc (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: self (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: sync (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: tuned (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: ftagree (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: monitoring (MCA v2.1.0, API v2.4.0, Component
v5.0.2)
MCA coll: sm (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA fbtl: pvfs2 (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component
v5.0.2)
MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component
v5.0.2)
MCA fcoll: vulcan (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA fs: pvfs2 (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA hook: comm_method (MCA v2.1.0, API v1.0.0, Component
v5.0.2)
MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA io: romio341 (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA mtl: ofi (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA mtl: psm2 (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA op: avx (MCA v2.1.0, API v1.0.0, Component v5.0.2)
MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v5.0.2)
MCA osc: monitoring (MCA v2.1.0, API v3.0.0, Component
v5.0.2)
MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v5.0.2)
MCA osc: ucx (MCA v2.1.0, API v3.0.0, Component v5.0.2)
MCA part: persist (MCA v2.1.0, API v4.0.0, Component v5.0.2)
MCA pml: cm (MCA v2.1.0, API v2.1.0, Component v5.0.2)
MCA pml: monitoring (MCA v2.1.0, API v2.1.0, Component
v5.0.2)
MCA pml: ob1 (MCA v2.1.0, API v2.1.0, Component v5.0.2)
MCA pml: ucx (MCA v2.1.0, API v2.1.0, Component v5.0.2)
MCA pml: v (MCA v2.1.0, API v2.1.0, Component v5.0.2)
MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component
v5.0.2)
MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component
v5.0.2)
MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v5.0.2)
MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component
v5.0.2)
MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component
v5.0.2)
Looking through my links, some of those resources are old; has OpenCL support been removed? It seems hwloc supports an OpenCL plugin; does Open MPI?
https://github.com/open-mpi/hwloc/issues/641
https://www-lb.open-mpi.org/projects/hwloc/doc/v2.11.1/a00356.php
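For reference, something like the sketch below should list the OS devices hwloc sees on this machine; it is a minimal sketch assuming hwloc 2.x with its OpenCL plugin enabled (OpenCL devices would show up as coprocessor OS devices with a Backend=OpenCL info key), so the specifics may well be off.

```c
/* Minimal sketch: list hwloc OS devices to check whether the OpenCL
 * plugin is picking up the GPUs. Assumes hwloc 2.x; link with -lhwloc. */
#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_obj_t dev = NULL;

    hwloc_topology_init(&topo);
    /* OS devices (GPUs, NICs, ...) are filtered out of the topology
     * by default, so keep them. */
    hwloc_topology_set_io_types_filter(topo, HWLOC_TYPE_FILTER_KEEP_ALL);
    hwloc_topology_load(topo);

    while ((dev = hwloc_get_next_osdev(topo, dev)) != NULL) {
        const char *backend = hwloc_obj_get_info_by_name(dev, "Backend");
        printf("osdev %-12s backend=%s\n",
               dev->name ? dev->name : "?",
               backend ? backend : "unknown");
    }

    hwloc_topology_destroy(topo);
    return 0;
}
```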
I feel I should also explain what I am looking for. AFAIK, Open MPI uses CUDA as a sort of "implementation" and calls out to the CUDA driver (or the ROCm driver) to actually execute code for generalized tasks. What I am wondering is: can the same thing be done with an OpenCL driver, or is there an extension to do so? Please correct me if my understanding is mistaken.
I imagine something similar happens on CPUs, where the underlying implementation might be pthreads or something.
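To make my mental model concrete, my (possibly naive) understanding of what "CUDA support" means at the API level is roughly the sketch below, which just asks the library whether it is CUDA-aware. This is a minimal sketch assuming the `cuda` entry under "MPI extensions" in the `ompi_info` output above means `MPIX_Query_cuda_support()` is exposed via `mpi-ext.h`.

```c
/* Minimal sketch: ask this Open MPI build whether it is CUDA-aware.
 * Assumes Open MPI's "cuda" MPI extension is installed, which provides
 * MPIX_CUDA_AWARE_SUPPORT and MPIX_Query_cuda_support() in <mpi-ext.h>. */
#include <stdio.h>
#include <mpi.h>
#if defined(OPEN_MPI) && OPEN_MPI
#include <mpi-ext.h>
#endif

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

#if defined(MPIX_CUDA_AWARE_SUPPORT)
    printf("compile-time CUDA awareness: %s\n",
           MPIX_CUDA_AWARE_SUPPORT ? "yes" : "no");
    printf("run-time CUDA awareness:     %s\n",
           MPIX_Query_cuda_support() ? "yes" : "no");
#else
    printf("this MPI does not expose the CUDA-awareness extension\n");
#endif

    MPI_Finalize();
    return 0;
}
```

Compiled with `mpicc` and run under `mpirun`, that should report whether the `dnf` build above was built with CUDA awareness.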
Please see Issue #12831
I would be curious to know: can one send/recv from OpenCL device buffers?
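For concreteness, with a CUDA-aware build my understanding is that something like the following is possible, passing a raw device pointer straight to MPI_Send/MPI_Recv; what I am asking is whether there is any equivalent for an OpenCL buffer. This is only a sketch of the CUDA case, not a claim that an OpenCL version exists.

```c
/* Sketch of CUDA-aware MPI: the send/receive buffers live in GPU
 * device memory and are handed to MPI directly. The OpenCL analogue
 * (e.g. handing MPI a cl_mem or SVM pointer) is what I am asking about. */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;
    double *d_buf = NULL;
    cudaMalloc((void **)&d_buf, n * sizeof(double));  /* device memory */

    if (rank == 0) {
        /* ... fill d_buf with a kernel or cudaMemcpy ... */
        MPI_Send(d_buf, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* The device pointer goes straight into MPI_Recv; the library
         * does whatever staging or GPUDirect transfer it supports. */
        MPI_Recv(d_buf, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
```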
Background information
I am trying to run MEEP on my GPU; I have both an Nvidia card and an integrated Radeon card. Meep has a feature called Parallel Meep, where scripts written using Meep are launched through `mpirun`. I have successfully done this on the CPU. However, I would really like to speed things up using the GPU, and for now I would like to avoid the proprietary Nvidia driver on Linux; it is high maintenance and taints my kernel. I would like to try to run Meep on my Nvidia GPU using the open source Mesa OpenCL driver instead. I saw Open MPI does seem to support OpenCL [0], [1], [2], [3]. I would like to avoid edits to the actual Meep code; Parallel Meep seems to be chunked, which from what I read is a necessity to run the code on the GPU? It does seem possible to interface with the GPU through `mpirun`; conda gave me this message.
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Using `dnf` on Fedora... conda? (See above)
Please describe the system on which you are running
Details of the problem
I'm sorry, this is a bit of a noob question. I described the background information above; I am simply having difficulty figuring out how to use `mpirun` with OpenCL. Does it have such an interface? Do I need to recompile Meep? If so, what should I do specifically? Is what I am trying to do possible?
P.S. Getting both GPUs and the CPU in the game would be great as well, but that might be a separate question.