MatthAlex closed this issue 2 years ago.
Thanks for the report. This sounds like it might be an MPI issue. I'll forward this report to the Intel MPI team for discussion.
@MatthAlex You may try to disable RMA path with the counters on IMPI 2019 level: MPIR_CVAR_CH4_OFI_ENABLE_RMA=0
BTW, IMPI 2019 U6 is publicly available.
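For reference, the suggested workaround is applied through the environment before launch; a minimal sketch (the `mpirun` line is commented out since it assumes an Intel MPI install, and `./a.out` is a hypothetical binary):

```shell
# Disable the CH4/OFI native RMA path, per the suggestion above.
export MPIR_CVAR_CH4_OFI_ENABLE_RMA=0
# mpirun -n 2 ./a.out    # hypothetical launch; requires Intel MPI 2019
echo "MPIR_CVAR_CH4_OFI_ENABLE_RMA=$MPIR_CVAR_CH4_OFI_ENABLE_RMA"
```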
Tried exporting `MPIR_CVAR_CH4_OFI_ENABLE_RMA=0` at runtime, but unfortunately:
[0] [0] MPI startup(): libfabric provider: shm
[0] libfabric:24107:core:core:fi_fabric_():1152<info> Opened fabric: shm
[0] libfabric:24107:core:core:fi_param_get_():280<info> variable universe_size=<not set>
[0] libfabric:24107:shm:av:util_av_init():455<info> AV size 1024
[0] libfabric:24107:shm:av:smr_map_to_region():174<warn> shm_open error
[0] libfabric:24107:shm:av:smr_map_to_region():174<warn> shm_open error
[0] Abort(1091471) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
[0] MPIR_Init_thread(703)........:
[0] MPID_Init(923)...............:
[0] MPIDI_OFI_mpi_init_hook(1335): OFI get address vector map failed
[1] libfabric:26209:shm:av:smr_map_to_region():174<warn> shm_open error
[1] libfabric:26209:shm:av:smr_map_to_region():174<warn> shm_open error
shm-2ranks1node.txt
Additionally, running two ranks over two nodes with round-robin pinning failed in the same way: shm-2ranks2nodes-rr.txt
Do you think update 6 will make any difference? Is it a hard suggestion?
@MatthAlex Thanks! We are analyzing what kind of issue you have faced (the "shm_open error"). It looks like IMPI 2019 U6 won't help here. We will let you know as soon as we get something. Please note that Intel MPI relies on its own shm transport as the primary one.
@MatthAlex I'm still chasing down one bug to get this working properly but could you try running this with master and see if your issues are mostly fixed?
@aingerson I can confirm that master branch resolved the shm error, checked with 2 and 16 ranks.
On a tangent, I erroneously ran a test with two nodes; `shm` isn't supposed to run on more than one node. I contrast this to the Intel MPI 2018 behaviour, where the failure is explicit without the need of `I_MPI_DEBUG`, I believe.
[19] Abort(1615759) on node 19 (rank 19 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
[19] MPIR_Init_thread(703)........:
[19] MPID_Init(923)...............:
[19] MPIDI_OFI_mpi_init_hook(1287):
[19] MPIDU_bc_table_create(344)...: Missing hostname or invalid host/port description in business card
Additional testing has produced another issue. I'm logging it here for the time being, since I tested `verbs` and it worked, but it didn't work on `shm`.
During testing of Fortran coarrays, when the second rank happened to be pinned on the same package, the program was killed. If the two ranks are on separate packages there is no issue. That means it doesn't work for >2 ranks, since a 3rd rank will always be placed on some preoccupied package. Attached are the logs from both runs and the Fortran program. shm-coarray-test.txt coarray.txt (change .txt to .f90) Edit: If you think this warrants a new Issue I could move it.
@MatthAlex Thanks for the update. Could you try testing again with the changes in #5503? I finished chasing down a couple of bugs, and now the provider is working with the Intel MPI benchmarks up to 32 ranks.
@aingerson Testing has been successful. `shm` ran on 2 and 16 ranks (our higher-core nodes are still lacking IPoIB, so I can't test those). When `FI_PROVIDERS="shm,verbs"` is set, `verbs` is chosen instead. Performance seems margin-of-error close. Is that the intended or expected outcome? Should `shm` be forced on single nodes?
As for the coarray issue, there was no progress with #5503. I feel that it might be a separate issue.
Thanks for testing again.
I don't think `FI_PROVIDERS` will do anything. It should be `FI_PROVIDER`.
Right now there is no way to layer the shm provider with another provider (such as verbs). That is a work in progress.
What it sounds like is happening when you set `FI_PROVIDERS="shm,verbs"` is that it does not register that and instead just picks verbs for all communication, which is why you're seeing similar performance.
As for the coarray issue, yeah it seems separate. I can look into it. To help in isolating the issue, could you run it again with
I_MPI_DEBUG=1
MPIR_CVAR_CH4_OFI_ENABLE_RMA=0
MPIR_CVAR_CH4_OFI_ENABLE_ATOMICS=0
and attach the debug log?
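Put together, the requested debug run would look something like the sketch below (`./coarray_test` is a hypothetical stand-in for the actual program, and the launch line needs Intel MPI, so it is commented out):

```shell
# Enable Intel MPI debug output and disable the OFI RMA/atomics paths.
export I_MPI_DEBUG=1
export MPIR_CVAR_CH4_OFI_ENABLE_RMA=0
export MPIR_CVAR_CH4_OFI_ENABLE_ATOMICS=0
# mpirun -n 2 ./coarray_test 2>&1 | tee shm-coarray-debug.txt   # hypothetical launch
echo "I_MPI_DEBUG=$I_MPI_DEBUG RMA=$MPIR_CVAR_CH4_OFI_ENABLE_RMA ATOMICS=$MPIR_CVAR_CH4_OFI_ENABLE_ATOMICS"
```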
Of course you are right. I meant `FI_PROVIDER="shm,verbs"`, which is what I've used for testing. So, no layering between the two providers means that `verbs` will be picked first if at all available, then `shm`?
On the coarray tests, I'm attaching a log with the variables you mentioned, and one with `I_MPI_DEBUG=6` and `FI_LOG_LEVEL=debug` added, for the same exact run. It runs to completion successfully.
shm-test.txt
shm-test-lite.txt
> So, no layering between the two providers means that `verbs` will be picked first if at all available, then `shm`?
Right, if you include verbs in the `FI_PROVIDER` list, it will pick up verbs since it is the highest-ranking provider in that list. The order it will pick from should be the same as the order of the `fi_info` output.
Really the only way to run shm is to explicitly request it using `FI_PROVIDER=shm` or to set `hints->fabric_attr->prov_name = "shm"` in the hints. Once shm can be layered with core providers, this will change.
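A minimal sketch of forcing shm from the environment; the `fi_info` call is guarded so the snippet is harmless on machines without the libfabric utilities installed:

```shell
# Explicitly request the shm provider; without this, verbs outranks shm
# in the default provider ordering.
export FI_PROVIDER=shm
# Show what libfabric reports for shm, if the fi_info utility is present.
command -v fi_info >/dev/null 2>&1 && fi_info -p shm || echo "fi_info not available"
echo "FI_PROVIDER=$FI_PROVIDER"
```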
Thanks for the updated test runs. This looks like an issue with atomics. I'll look into it and keep you updated!
Hello, I'm resurrecting this issue with new details. Testing Intel MPI 2020 Update 1 with gcc 6.3, 7.1, and 8.2, and libfabric v1.9.1, I had the same exact error resurface.
libfabric:4443:core:core:fi_fabric_():1163<info> Opened fabric: shm
libfabric:4443:shm:cntr:smr_cntr_open():42<info> cntr wait not yet supported
Abort(1091215) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(136)........:
MPID_Init(904)...............:
MPIDI_OFI_mpi_init_hook(1210): OFI event counter create failed (ofi_init.c:1210:MPIDI_OFI_mpi_init_hook:Function not implemented)
Not only that, but passing `MPIR_CVAR_CH4_OFI_ENABLE_RMA=0` and `MPIR_CVAR_CH4_OFI_ENABLE_ATOMICS=0` fixes the issue.
@MatthAlex Thanks for resurrecting the issue!
The 1.9 branch does not have the new wait object for shm, which Intel MPI needs in order to enable RMA and atomics. 1.10 should have it and works for me. Could you give that a try?
@aingerson I can confirm that it does work!
I can also confirm that disabling ATOMICS and RMA is still needed for Coarray Fortran to work as expected. That hasn't changed with Intel Parallel Studio XE 2020 Update 1.
Great news that at least it can run! Can you provide a reproducer? I can't seem to recreate the atomics/RMA issue in order to debug it.
Reproducing the issue proved to be a bit more involved than I thought.
CAF_verbs_failure.txt CAF_shm_pass.txt
The issue could only be reproduced for `verbs`, not for `shm`. `shm` does produce some weird "info" output for atomics; however, the exit code is 0 nonetheless.
The shm warning I think is ok. I believe MPI queries/tests the different atomic op/datatype combinations to see what it can support, and OFI produces a warning whenever it doesn't support a certain combination. If the exit code is 0, then I think it's ok. Not sure about the verbs error. Could you point me to the error line in the CAF_verbs_failure.txt log? I don't see the problem there. If you are seeing a problem with verbs, I would suggest trying it with the MR cache turned off, since we have found some issues with it. To do this, set `FI_MR_CACHE_MAX_COUNT=0`.
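The MR-cache workaround is again just an environment variable; a sketch (the launch line is commented out and the binary name is hypothetical):

```shell
# Disable libfabric's memory-registration cache for the verbs run.
export FI_MR_CACHE_MAX_COUNT=0
# FI_PROVIDER=verbs mpirun -n 2 ./coarray_test   # hypothetical launch
echo "FI_MR_CACHE_MAX_COUNT=$FI_MR_CACHE_MAX_COUNT"
```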
Oh, my bad. I bundled the successful logs together with the failed ones in CAF_verbs_failure.txt. The failure took place on lines 389-393. Edit: As an update, with the MR cache off it still fails on verbs.
Forcing shm to be opened on an MPI single-node run returns the following error:
shm man pages do mention
However, Intel MPI 2018.3 will correctly work with `shm` if forced to. It defaults to using `verbs` instead, if enabled. Tested with v1.9.0 (and rc1, rc2, rc3), Intel MPI 2018.3 and 2019.5.
Configuration used:
Variables used: