Closed anandrdbz closed 6 months ago
Conflicts need to be resolved
Aside from the test suite, the benchmarks are also failing on GPU:
Case                      Pre Process   Simulation   Post Process
──────────────────────────────────────────────────────────────────
viscous_weno5_sgb_mono    1.00x         N/A          N/A
5eq_rk3_weno3_hllc        0.50x         0.98x        1.33x
ibm                       1.00x         1.04x        1.00x
hypo_hll                  1.00x         N/A          N/A
Change the ./mfc.sh load compute name from Crusher to Frontier
Update: Did this myself in https://github.com/MFlowCode/MFC/pull/368/commits/110a290dc1d744dfe7ca7c7387c6d16b43c07d37
./mfc.sh test -a -- -c frontier does not work.
Specifically:
FileNotFoundError: [Errno 2] No such file or directory:
'/lustre/orion/cfd154/scratch/sbryngelson/MFC/build/install/dependencies/bin/h5dump'
and
sbryngelson/scratch $ ls MFC/build/install/dependencies/bin/
hipfc
It's looking like Frontier CI may fail for the 2-rank case. Tests were run with
./mfc.sh test -j 8 -- -c frontier
The test MFC.sh file in the 2-rank directory reads
(set -x; srun -N 1 -n 2 "/lustre/orion/cfd154/scratch/sbryngelson/runner/actions-runner/_work/MFC/MFC/build/install/0571538fd2/bin/simulation")
which appears to be the problem: it should be passing --ntasks-per-node (or whatever) since we are using -- -c frontier
Update: It passed on second try 🤷
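For reference, a minimal sketch of what an srun invocation with an explicit --ntasks-per-node could look like when assembled programmatically. `build_srun_command` is a hypothetical helper, not part of MFC, and the binary path is a placeholder. Since the run passed on the second try, the missing flag may not have been the cause.

```python
from typing import List


def build_srun_command(binary: str, nodes: int, ranks: int) -> List[str]:
    """Build an srun argv list that sets the per-node task count explicitly.

    Hypothetical helper for illustration; not MFC's actual launcher code.
    """
    if nodes < 1 or ranks < 1:
        raise ValueError("nodes and ranks must be positive")
    # Ceiling division so every rank has a slot when ranks do not divide evenly.
    tasks_per_node = -(-ranks // nodes)
    return [
        "srun",
        "-N", str(nodes),
        "-n", str(ranks),
        f"--ntasks-per-node={tasks_per_node}",
        binary,
    ]


# Placeholder path; the real binary lives under the build/install directory.
cmd = build_srun_command("/path/to/bin/simulation", nodes=1, ranks=2)
print(" ".join(cmd))
```

For the 2-rank, single-node case this adds --ntasks-per-node=2 to the srun line quoted above.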
@henryleberre, do you know why it doesn't build h5dump? (or at least it isn't found in the expected bin/ directory)
@sbryngelson We opted not to build HDF5 on CCE. I forget why, perhaps there were some incompatibilities. We use the cray-hdf5 module so h5dump should already be available.
@henryleberre you are correct, h5dump is already in the path. It looks like the problem is that using test -a forces it to look in dependencies/bin/h5dump for the binary (rather than the path broadly). Is there a fix for this?
Here:
./mfc/test/test.py: h5dump = f"{HDF5.get_install_dirpath()}/bin/h5dump"
It does look like we have this option:
if ARG("no_hdf5"):
    if not does_command_exist("h5dump"):
        raise MFCException("--no-hdf5 was specified and h5dump couldn't be found.")
    h5dump = shutil.which("h5dump")
though it doesn't seem to work when invoked like this:
./mfc.sh test -a j 1 -- -c frontier --no-hdf5
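One way to look "in the path broadly" would be to prefer a bundled h5dump and fall back to PATH when it is missing. The sketch below illustrates that idea only. It is not what MFC does: the snippet quoted above gates the PATH lookup behind --no-hdf5. `resolve_h5dump` and its `install_dir` parameter are hypothetical names.

```python
import os
import shutil
from typing import Optional


def resolve_h5dump(install_dir: Optional[str]) -> str:
    """Return a path to h5dump, preferring a bundled build over PATH.

    Hypothetical helper for illustration; not MFC's actual logic.
    """
    if install_dir is not None:
        bundled = os.path.join(install_dir, "bin", "h5dump")
        # Use the bundled binary only if it exists and is executable.
        if os.path.isfile(bundled) and os.access(bundled, os.X_OK):
            return bundled
    # Fall back to whatever the environment provides (e.g. a loaded module).
    found = shutil.which("h5dump")
    if found is None:
        raise FileNotFoundError(
            "h5dump was found neither in the install directory nor on PATH"
        )
    return found
```

With a layout like the one listed above, where dependencies/bin/ contains only hipfc, this would return the h5dump found on PATH, provided one is there.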
@sbryngelson I'm testing a fix. For your command, you would have to use this instead:
$ ./mfc.sh test -a --no-hdf5 -- -c frontier
@henryleberre this works!
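The difference between the two commands is where --no-hdf5 sits relative to the -- separator. The sketch below assumes the common convention that arguments before -- belong to the test command and everything after it is forwarded elsewhere. The thread does not spell this out, and `split_at_double_dash` is a hypothetical helper, not MFC's parser.

```python
from typing import List, Tuple


def split_at_double_dash(argv: List[str]) -> Tuple[List[str], List[str]]:
    """Split argv into (own arguments, forwarded arguments) at the first '--'."""
    if "--" not in argv:
        return list(argv), []
    i = argv.index("--")
    return argv[:i], argv[i + 1:]


# The two invocations from the thread, copied as written there.
failing = "test -a j 1 -- -c frontier --no-hdf5".split()
working = "test -a --no-hdf5 -- -c frontier".split()

for label, argv in (("failing", failing), ("working", working)):
    own, forwarded = split_at_double_dash(argv)
    # Under this assumption, the flag only takes effect on the left of '--'.
    print(label, "--no-hdf5" in own, forwarded)
```

Under that assumption, the first command hands --no-hdf5 to the forwarded side, so the test command never sees it.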
closes #352 #383 #384
Description
Adds support for MI200+ GPUs via CCE compilers and OpenACC.
Type of change
Scope
Closes #352 #383 #384
Test Configuration: