yurivict opened 2 months ago
Please provide all the information from the debug issue template; thanks!
https://github.com/open-mpi/ompi/blob/main/.github/ISSUE_TEMPLATE/bug_report.md
I added missing bits of information.
The root cause could be not enough available space in /tmp (unlikely per your description), or something went wrong when checking the size.
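A quick generic check to rule out the free-space theory (not specific to Open MPI):

```shell
# Show free space on the filesystem backing /tmp,
# the default location for Open MPI's session directory.
df -h /tmp
```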
Try running
env OMPI_MCA_shmem_base_verbose=100 ./hello-world-1
and check the output (a useful message might have been compiled out, though).
If there is nothing useful, you can
strace -o hw.strace -s 512 ./hello-world-1
then compress hw.strace and upload it.
env OMPI_MCA_shmem_base_verbose=100 ./hello-world-1
This didn't produce anything relevant.
strace -o hw.strace -s 512 ./hello-world-1
BSDs have ktrace instead. Here is the ktrace dump: https://freebsd.org/~yuri/openmpi-kernel-dump.txt
51253 hello-world-1 CALL fstatat(AT_FDCWD,0x1b0135402080,0x4c316d20,0)
51253 hello-world-1 NAMI "/tmp/ompi.yv.0/jf.0/2909405184"
51253 hello-world-1 RET fstatat -1 errno 2 No such file or directory
51253 hello-world-1 CALL open(0x1b0135402080,0x120004<O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC>)
51253 hello-world-1 NAMI "/tmp/ompi.yv.0/jf.0/2909405184"
51253 hello-world-1 RET open -1 errno 2 No such file or directory
It looks like some directories were not created.
What if you run
mpirun -np 1 ./hello-world-1
instead?
sudo mpirun -np 1 ./hello-world-1
prints the same error message:
It appears as if there is not enough space for /dev/shm/sm_segment.yv.0.9f060000.0 (the shared-memory backing
file). It is likely that your MPI job will now either abort or experience
performance degradation.
The log doesn't have any mkdir operations, so "/tmp/ompi.yv.0" was never created.
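One way to spot-check whether a session directory was ever created (a hypothetical check; the ompi.* naming pattern is taken from the ktrace output above):

```shell
# List any Open MPI session directories left in /tmp, if they exist.
ls -ld /tmp/ompi.* 2>/dev/null || echo "no ompi session directories found"
```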
Well, this is a different message than the one used when opening this issue, and this one is self-explanatory.
Anyway, what if you run
env OMPI_MCA_shmem_mmap_backing_file_base_dir=/tmp ./hello-world-1
or you can simply increase the size of /dev/shm
sudo OMPI_MCA_shmem_mmap_backing_file_base_dir=/tmp ./hello-world-1
produces the same error messages.
This message is for a regular user:
$ OMPI_MCA_shmem_mmap_backing_file_base_dir=/tmp ./hello-world-1
[yv.noip.me:88431] shmem: mmap: an error occurred while determining whether or not /tmp/ompi.yv.1001/jf.0/1653407744/sm_segment.yv.1001.628d0000.0 could be created.
> Hello world from processor yv.noip.me, rank 0 out of 1 processors (pid=88431)
< Hello world from processor yv.noip.me, rank 0 out of 1 processors (pid=88431)
This message is for root:
# OMPI_MCA_shmem_mmap_backing_file_base_dir=/tmp ./hello-world-1
--------------------------------------------------------------------------
It appears as if there is not enough space for /dev/shm/sm_segment.yv.0.ee540000.0 (the shared-memory backing
file). It is likely that your MPI job will now either abort or experience
performance degradation.
Local host: yv
Space Requested: 16777216 B
Space Available: 1024 B
--------------------------------------------------------------------------
> Hello world from processor yv.noip.me, rank 0 out of 1 processors (pid=88929)
< Hello world from processor yv.noip.me, rank 0 out of 1 processors (pid=88929)
I see.
Try adding OMPI_MCA_btl_sm_backing_directory=/tmp
and see how it works.
The error messages disappear when OMPI_MCA_btl_sm_backing_directory=/tmp is used.
We have seen and responded to this problem many times - I believe it is covered in the docs somewhere. The problem is that BSD (mostly as seen on macOS) has created a default TMPDIR that is incredibly long. So when we add our tmpdir prefix (to avoid stepping on other people's tmp), the result is longer than the path-length limit.
Solution: set TMPDIR in your environment to point to some shorter path, typically something like $HOME/tmp.
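A minimal sketch of that workaround (assuming a POSIX shell; the directory name is just a convention):

```shell
# Create a short per-user temp directory and point TMPDIR at it,
# so Open MPI's session-directory prefix stays under the path-length limit.
mkdir -p "$HOME/tmp"
export TMPDIR="$HOME/tmp"
# Then run the job as usual, e.g.: mpirun -np 1 ./hello-world-1
```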
[...] a default TMPDIR that is incredibly long [...]
What do you mean by TMPDIR? In our case TMPDIR is just /tmp.
Indeed, it seems the root cause is something fishy related to /dev/shm.
What if you run
df -h /dev/shm
both as a user and as root?
$ df -h /dev/shm
Filesystem Size Used Avail Capacity Mounted on
devfs 1.0K 0B 1.0K 0% /dev
# df -h /dev/shm
Filesystem Size Used Avail Capacity Mounted on
devfs 1.0K 0B 1.0K 0% /dev
That's indeed a small /dev/shm.
I still do not understand why running as a user does not get you the user-friendly message you get when running as root. Can you run ktrace as a non-root user so we can figure out where the failure occurs?
It seems regular users do not have write access to the (small) /dev/shm, and we do not display a friendly error message about it.
45163 hello-world-1 CALL access(0x4e3d8d33,0x2<W_OK>)
45163 hello-world-1 NAMI "/dev/shm"
45163 hello-world-1 RET access -1 errno 13 Permission denied
Unless you change that, your best bet is probably to add
btl_sm_backing_directory=/tmp
to your $PREFIX/etc/openmpi-mca-params.conf
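For reference, the line to append (shown as a config fragment; $PREFIX is your Open MPI install prefix, e.g. /usr/local for the FreeBSD package):

```
btl_sm_backing_directory = /tmp
```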
Is direct access to /dev/shm new in OpenMPI? It used to work fine on FreeBSD.
How does this work on Linux? Is everybody allowed write access to /dev/shm there?
Access to /dev/shm has a fallback in ompi, like here.
Why doesn't this fallback work then? Is it accidentally missing in some cases?
I believe I've tried everything suggested (and then some) as evidenced by the following interactions:
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ printenv |grep BulkData|grep tmp
OMPI_MCA_shmem_mmap_backing_file_base_dir=/mnt/BulkData/home/jabowery/tmp
btl_sm_backing_directory=/mnt/BulkData/home/jabowery/tmp
TMPDIR=/mnt/BulkData/home/jabowery/tmp
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ tail /home/jabowery/mambaforge/envs/ioniser/etc/openmpi-mca-params.conf
# See "ompi_info --param all all --level 9" for a full listing of Open
# MPI MCA parameters available and their default values.
pml = ^ucx
osc = ^ucx
coll_ucc_enable = 0
mca_base_component_show_load_errors = 0
opal_warn_on_missing_libcuda = 0
opal_cuda_support = 0
btl_sm_backing_directory=/mnt/BulkData/home/jabowery/tmp
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ tail /etc/openmpi/openmpi-mca-params.conf
btl_base_warn_component_unused=0
# Avoid openib in case applications use fork: see https://github.com/ofiwg/libfabric/issues/6332
# If you wish to use openib and know your application is safe, remove the following:
# Similarly for UCX: https://github.com/open-mpi/ompi/issues/8367
mtl = ^ofi
btl = ^uct,openib,ofi
pml = ^ucx
osc = ^ucx,pt2pt
btl_sm_backing_directory=/mnt/BulkData/home/jabowery/tmp
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ !p
p ioniser.py
[jaboweryML:34571] shmem: mmap: an error occurred while determining whether or not /mnt/BulkData/home/jabowery/tmp/ompi.jaboweryML.1000/jf.0/121765888/shared_mem_cuda_pool.jaboweryML could be created.
[jaboweryML:34571] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ whoami
jabowery
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ touch /mnt/BulkData/home/jabowery/tmp/accesstest.txt
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ ls -altr /mnt/BulkData/home/jabowery/tmp/accesstest.txt
-rw-rw-r-- 1 jabowery jabowery 0 Nov 1 10:51 /mnt/BulkData/home/jabowery/tmp/accesstest.txt
(ioniser) jabowery@jaboweryML:~/devel/ioniser$ df /mnt/BulkData/home/jabowery/tmp
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/nvme1n1 1921725720 692366840 1131666768 38% /mnt/BulkData
(ioniser) jabowery@jaboweryML:~/devel/ioniser$
See the program below.
---program---
Version: openmpi-5.0.5_1
Describe how Open MPI was installed: FreeBSD package
Computer hardware: Intel CPU
Network type: Ethernet/IP (irrelevant)
Available space in /tmp: 64GB
Operating system/version: FreeBSD 14.1