Closed TheFloatingBrain closed 1 month ago
@TheFloatingBrain Could you please provide your Open MPI version? Pasting the output of ompi_info
could be a good start.
@wenduwan Thanks for the reply. I have to enter the following commands before I can run this command:
> source /etc/profile.d/
> module load mpi/openmpi-x86_64
According to this (the command is not recognized otherwise)
> ompi_info
Package: Open MPI mockbuild@02dc1f9e2ab145fdb212b01bdd462369
Open MPI: 5.0.2
Open MPI repo revision: v5.0.2
Open MPI release date: Feb 06, 2024
MPI API: 3.1.0
Ident string: 5.0.2
Prefix: /usr/lib64/openmpi
Configured architecture: x86_64-pc-linux-gnu
Configured by: mockbuild
Configured on: Mon Mar 4 00:00:00 UTC 2024
Configure host: 02dc1f9e2ab145fdb212b01bdd462369
Configure command line: '--prefix=/usr/lib64/openmpi'
'--disable-silent-rules' '--enable-builtin-atomics'
'--enable-ipv6' '--enable-mpi-java'
'--enable-mpi1-compatibility' '--enable-sphinx'
'--with-prrte=external' '--with-sge'
'--with-valgrind' '--enable-memchecker'
'--with-hwloc=/usr' '--with-libevent=external'
Built by: mockbuild
Built on: Mon Mar 4 00:00:00 UTC 2024
Built host: 02dc1f9e2ab145fdb212b01bdd462369
C bindings: yes
Fort mpif.h: yes (all)
Fort use mpi: yes (full: ignore TKR)
Fort use mpi size: deprecated-ompi-info-value
Fort use mpi_f08: yes
Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
limitations in the gfortran compiler and/or Open
MPI, does not support the following: array
subsections, direct passthru (where possible) to
underlying Open MPI's C functionality
Fort mpi_f08 subarrays: no
Java bindings: yes
Wrapper compiler rpath: runpath
C compiler: gcc
C compiler absolute: /bin/gcc
C compiler family name: GNU
C compiler version: 14.0.1
C++ compiler: g++
C++ compiler absolute: /bin/g++
Fort compiler: gfortran
Fort compiler abs: /bin/gfortran
Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
Fort 08 assumed shape: yes
Fort optional args: yes
Fort BIND(C) (all): yes
Fort TYPE,BIND(C): yes
Fort T,BIND(C,name="a"): yes
Fort PRIVATE: yes
Fort ABSTRACT: yes
Fort USE...ONLY: yes
Fort C_FUNLOC: yes
Fort f08 using wrappers: yes
Fort MPI_SIZEOF: yes
C profiling: yes
Fort mpif.h profiling: yes
Fort use mpi profiling: yes
Fort use mpi_f08 prof: yes
Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
OMPI progress: no, Event lib: yes)
Sparse Groups: no
Internal debug support: no
MPI interface warnings: yes
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
dl support: yes
Heterogeneous support: no
MPI_WTIME support: native
Symbol vis. support: yes
Host topology support: yes
IPv6 support: yes
MPI extensions: affinity, cuda, ftmpi, rocm, shortfloat
Fault Tolerance support: yes
FT MPI support: yes
MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.0.2)
MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA btl: self (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA btl: ofi (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA btl: sm (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA btl: tcp (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA btl: uct (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA btl: usnic (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v5.0.2)
MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component
MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component
MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA memchecker: valgrind (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA mpool: hugepage (MCA v2.1.0, API v3.1.0, Component v5.0.2)
MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component
MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA reachable: weighted (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA smsc: cma (MCA v2.1.0, API v1.0.0, Component v5.0.2)
MCA threads: pthreads (MCA v2.1.0, API v1.0.0, Component v5.0.2)
MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA bml: r2 (MCA v2.1.0, API v2.1.0, Component v5.0.2)
MCA coll: adapt (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: basic (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: han (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: inter (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: libnbc (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: self (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: sync (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: tuned (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: ftagree (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA coll: monitoring (MCA v2.1.0, API v2.4.0, Component
MCA coll: sm (MCA v2.1.0, API v2.4.0, Component v5.0.2)
MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA fbtl: pvfs2 (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component
MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component
MCA fcoll: vulcan (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA fs: pvfs2 (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA hook: comm_method (MCA v2.1.0, API v1.0.0, Component
MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA io: romio341 (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA mtl: ofi (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA mtl: psm2 (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA op: avx (MCA v2.1.0, API v1.0.0, Component v5.0.2)
MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v5.0.2)
MCA osc: monitoring (MCA v2.1.0, API v3.0.0, Component
MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v5.0.2)
MCA osc: ucx (MCA v2.1.0, API v3.0.0, Component v5.0.2)
MCA part: persist (MCA v2.1.0, API v4.0.0, Component v5.0.2)
MCA pml: cm (MCA v2.1.0, API v2.1.0, Component v5.0.2)
MCA pml: monitoring (MCA v2.1.0, API v2.1.0, Component
MCA pml: ob1 (MCA v2.1.0, API v2.1.0, Component v5.0.2)
MCA pml: ucx (MCA v2.1.0, API v2.1.0, Component v5.0.2)
MCA pml: v (MCA v2.1.0, API v2.1.0, Component v5.0.2)
MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component
MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component
MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v5.0.2)
MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v5.0.2)
MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component
MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component
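For readers skimming the listing above, the accelerator-related entries can be pulled out with a short script. This is a minimal sketch, not an Open MPI tool: the `summarize` and `mentions` helpers are hypothetical names, and `SAMPLE` holds only a few lines copied from the output above. It reports what this particular build lists, nothing more.

```python
import re

# A few lines copied from the ompi_info output above (not the full listing).
SAMPLE = """\
MPI extensions: affinity, cuda, ftmpi, rocm, shortfloat
MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.0.2)
MCA btl: self (MCA v2.1.0, API v3.3.0, Component v5.0.2)
MCA pml: ucx (MCA v2.1.0, API v2.1.0, Component v5.0.2)
"""


def summarize(text):
    """Return (extensions, components) parsed from ompi_info-style text.

    components maps a framework name (e.g. "btl") to its component names.
    """
    extensions = []
    components = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("MPI extensions:"):
            raw = line.split(":", 1)[1]
            extensions = [e.strip() for e in raw.split(",") if e.strip()]
            continue
        # Lines look like "MCA <framework>: <component> (MCA v..., ...)".
        m = re.match(r"MCA\s+([\w-]+):\s+(\S+)", line)
        if m:
            components.setdefault(m.group(1), []).append(m.group(2))
    return extensions, components


def mentions(name, extensions, components):
    """True if any extension or component name contains `name`."""
    name = name.lower()
    names = list(extensions)
    for comps in components.values():
        names.extend(comps)
    return any(name in n.lower() for n in names)


extensions, components = summarize(SAMPLE)
print("extensions:", extensions)
print("accelerator components:", components.get("accelerator", []))
for gpu_api in ("cuda", "rocm", "opencl"):
    print(gpu_api, "listed:", mentions(gpu_api, extensions, components))
```

To check a live installation instead of the pasted sample, the text could be obtained with `subprocess.run(["ompi_info"], capture_output=True, text=True).stdout`, assuming `ompi_info` is on the `PATH` (on this system, only after the `module load` step above).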
Looking through my links, some of those resources are old. Has OpenCL support been removed?
It seems hwloc supports an OpenCL plugin; does Open MPI?
I feel I should also explain what I am looking for. AFAIK, Open MPI uses CUDA as a sort of "implementation" and calls out to the CUDA driver (or ROCm driver) to actually execute code for generalized tasks. What I am wondering is: can the same thing be done with an OpenCL driver, or is there an extension to do so? Please correct me if my understanding is mistaken.
I imagine something similar happens on CPUs, where the underlying implementation might be pthreads or something.
Please see Issue #12831
Would be curious to know, can one send/recv from OpenCL device buffers?
Background information
I am trying to run MEEP on my GPU. I have both an Nvidia card and an integrated Radeon card. Meep has a feature, Parallel Meep, where scripts written using Meep are launched through
. I have successfully done this on the CPU. However, I would really like to speed things up using the GPU, and I would like to avoid using the proprietary Nvidia driver from now on on Linux; it is high maintenance and taints my kernel. I would like to try to run Meep on my Nvidia GPU using the open source Mesa OpenCL driver instead. I saw that Open MPI does seem to support OpenCL [0], [1], [2], [3]. I would like to avoid edits to the actual Meep code. Parallel Meep seems to be chunked, which from what I read is a necessity to run the code on the GPU? It does seem possible to interface with the GPU through
, conda gave me this message
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
On Fedora... conda? (See above)
Please describe the system on which you are running
Details of the problem
I'm sorry, this is a bit of a noob question. I described the background information above; I am simply having difficulty figuring out how to use
with OpenCL. Does it have such an interface? Do I need to recompile Meep? If so, what should I do specifically? Is what I am trying to do possible?
P.S. Getting both GPUs and the CPU in the game would be great as well, but that might be a separate question.