Closed AiredaleDev closed 2 weeks ago
Please add performance results to the PR
Please clarify what this means
Tested without MPI, because I wanted to isolate the effects of unified memory and because I couldn't run with MPI: the MPI build on RG for the Grace Hopper chip complains about being built without PMIx and refuses to run.
Does regular MFC run without `--unified` on that node? I believe it does.
I was originally going to write this:
It runs fine with or without `--unified`; it's just MPI that acts strange. I built and tested with `--no-mpi`. When I build with `--mpi`, independent of `--unified`, I get this message:

```
[rg-uwing-1:552224] OPAL ERROR: Unreachable in file ext3x_client.c at line 112
[TEST] MPI: call mpi_init(ierr)
--------------------------------------------------------------------------
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM's PMI support and therefore cannot
execute. There are several options for building PMI support under
SLURM, depending upon the SLURM version you are using:

  version 16.05 or later: you can use SLURM's PMIx support. This
  requires that you configure and build SLURM --with-pmix.

  Versions earlier than 16.05: you must use either SLURM's PMI-1 or
  PMI-2 support. SLURM builds PMI-1 by default, or you can manually
  install PMI-2. You must then build Open MPI using --with-pmi pointing
  to the SLURM PMI library location.

Please configure as appropriate and try again.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
```
I believe this is just a quirk of Rogue's Gallery.
but I just tried using `mpirun` instead, and it works just fine!
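For anyone who hits the same PMIx error on this machine, a minimal sketch of the workaround (binary path and rank count are illustrative placeholders, not the exact commands used here):

```shell
# Direct launch through Slurm's srun fails, because this Open MPI build
# lacks Slurm PMI/PMIx support:
#   srun -n 2 ./simulation
#   # -> "OPAL ERROR: Unreachable in file ext3x_client.c at line 112"

# Letting Open MPI handle process launch itself works inside the allocation:
mpirun -np 2 ./simulation
```

The difference is only in who spawns the MPI processes: `srun` needs a PMI/PMIx-enabled Open MPI, while `mpirun` uses Open MPI's own launcher.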
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 57.91%. Comparing base (`596ef8b`) to head (`cddbbce`).
Please add performance results to the PR
Added!
Description
Adds a new flag to the toolchain, `--unified`, which passes `-gpu=unified` to NVIDIA compilers.

Fixes #412
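A sketch of how the new flag fits into a build, assuming the usual `./mfc.sh build` entry point (the extra flags shown alongside it are illustrative):

```shell
# Build with CUDA Unified Memory: the toolchain appends -gpu=unified
# to the NVHPC (nvfortran) compile flags.
./mfc.sh build --gpu --unified

# Default behavior, without unified memory (explicit device data management):
./mfc.sh build --gpu --no-unified
```

With `-gpu=unified`, host and device share a single address space, so allocations can exceed GPU memory and migrate on demand; this is why the out-of-memory case in the performance results below only runs with `--unified`.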
Type of change
Scope
If you cannot check the above box, please split your PR into multiple PRs that each have a common goal.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Tested without MPI, because I wanted to isolate the effects of unified memory and because I couldn't run with MPI: the MPI build on RG for the Grace Hopper chip complains about being built without PMIx, refusing to run. Update: it works fine with MPI! I was just using Slurm wrong.

Test Configuration:
Checklist
- Documentation (`docs/`)
- I ran `./mfc.sh format` before committing my code

Performance results
All timings are in ns/gp/eq/rhs (nanoseconds per grid point per equation per RHS evaluation):

- `--case-optimization`, `--unified`: 0.3247408701122950 ns/gp/eq/rhs
- `--case-optimization`, `--no-unified`: 0.3223139803168301 ns/gp/eq/rhs
- `--case-optimization`, `--unified`: 8.937425882157594 ns/gp/eq/rhs
- `--case-optimization`, `--no-unified`: N/A -- out of memory.