Closed: bozbez closed this 3 years ago
From initial (read: brief!) inspection, I am OK with this. A vast improvement over what we already have.
However, I think we might need to make this a pull request into a "develop" branch rather than merging directly to master, just so that we keep a working master branch for the ongoing Hydra work. This way we can approve PRs to develop, and once all the TODOs, test script runs, and tests with OP2-Hydra pass, we can merge to master. This also leads the way to a "release".
What do you think?
I agree that it makes sense not to try to merge this directly if there is stability required for current Hydra work, but I'm not so sure about a "develop" branch - I think the more common pattern here would be to instead create a release branch for the Hydra work, which can then have changes cherry-picked from the master branch. Doing it like this avoids having to switch the develop branch with the master or trying to merge all of the commits from the develop branch. I am not entirely sure what is the right answer though - still a bit out of the loop regarding what work is happening/planned.
Regarding the TODOs and initial tests, I think those should be done directly on the feature/unified-make branch and then the PR updated, rather than merging into a master or develop branch and leaving that in a potentially dysfunctional state.
Ok, fine. I am happy to go for a release/OP2-Hydra-V1.0 branch, which will basically be the current master that works with the RR Hydra code. However, once the currently planned changes to OP2 are merged and we have a proper OP2 release, we need to make sure that OP2-Hydra works with that release. Afterwards we need to make sure new releases do not break OP2-Hydra, as RR will be looking to work with it.
With regard to what work is planned, @bozbez, you are actually in the loop. The work discussed with the RR team at the recent meeting (moving OP2-Hydra to Hydra V20) will be the major piece of work. This will need to be done alongside the TODO list in #220.
Come to think of it, some of the tasks in #220 may be best handled separately from #219. For example, the update to the code generator and the SYCL/HIP parallelizations would be a bit involved. So perhaps we only do the relevant tasks from #220 in this PR?
Agreed, that was my intention at least - this PR is already a little larger than I would like.
Seems that NVFORTRAN really has a few too many issues with parallel building to bother supporting it... the source of the issue is that it produces `.o` files in the directory it was invoked in, even when it is doing a combined compile and link. The immediate solution of separating the `.o` compilation and linking doesn't work due to Fortran's module system. Attempts to fix that by generating the module files first with `-Msyntax-only` fail, as there is no way to inhibit module file generation during compilation to `.o`, leading to a race hazard where one compiler instance is re-writing a module file with new content while another attempts to read it.
The only working solution that I found was to `cd` into a build directory and then invoke the compiler with adjusted paths, but this is rather messy. Since the builds of these apps are not particularly time consuming and no other Fortran compiler has this issue, I've added an `F_HAS_PARALLEL_BUILDS` feature flag to the compiler Makefiles which simply disables parallel compilation (via `.NOTPARALLEL:`) for any misbehaving compilers.
Worth noting also that we narrowly avoid this issue with the parallel build of the OP2 static libraries, as the only module inter-dependency is on `op2_for_declarations.F90`, which we manually specify as a make dependency for all the other `.F90` files.
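The feature-flag gating described above can be sketched with a toy makefile. The flag name is from this PR; everything else here is illustrative, not the actual OP2 makefiles:

```shell
# Toy demonstration of gating .NOTPARALLEL: on a feature flag. A compiler
# makefile that cannot build in parallel safely would set the flag false,
# which serialises every rule in the makefile even under make -j.
cat > /tmp/notparallel-demo.mk <<'EOF'
# Set by the (hypothetical) compiler makefile
F_HAS_PARALLEL_BUILDS := false

ifneq ($(F_HAS_PARALLEL_BUILDS),true)
# Serialise all rules in this makefile, even when invoked with -j
.NOTPARALLEL:
endif

all: a.mod b.mod
a.mod: ; @echo "compiling a"
b.mod: ; @echo "compiling b"
EOF
make -j4 -f /tmp/notparallel-demo.mk
```

With the flag set to `true` instead, `.NOTPARALLEL:` is skipped and `make -j` parallelises as usual.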
Not having parallel Fortran builds is OK for now; I think we have bigger fish to fry! However, it would be interesting to see this on Hydra. RR probably would not have tested their Hydra build with the PGI/NV compilers, but we will have to due to CUDA Fortran. Although I suspect the dependencies set up in the Hydra Makefiles may be managing this issue. I have not had trouble with the nv_hpc compilers (except for internal compiler bugs... that's a different issue).
Two new environment variables added: `HDF5_SEQ_INSTALL_PATH` and `HDF5_PAR_INSTALL_PATH`, which allow specification of both the parallel and sequential HDF5 libraries. If only the old `HDF5_INSTALL_PATH` is specified, then the flavour is auto-detected (as in #218) and the appropriate `_SEQ_` or `_PAR_` variant is set. If one of the variants is missing then the apps that depend on it will not be built.
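For illustration, one plausible way such flavour auto-detection could work (a guess at the mechanism, not necessarily what #218 actually does) is to inspect the installed `H5pubconf.h`, where parallel HDF5 builds define `H5_HAVE_PARALLEL`. The fake install tree below stands in for a real HDF5 prefix:

```shell
# Hypothetical sketch of HDF5 flavour detection. A real parallel HDF5
# install defines H5_HAVE_PARALLEL in its installed H5pubconf.h; here we
# fabricate a minimal install tree to demonstrate the check.
HDF5_INSTALL_PATH=/tmp/fake-hdf5
mkdir -p "$HDF5_INSTALL_PATH/include"
echo '#define H5_HAVE_PARALLEL 1' > "$HDF5_INSTALL_PATH/include/H5pubconf.h"

if grep -q 'H5_HAVE_PARALLEL' "$HDF5_INSTALL_PATH/include/H5pubconf.h"; then
    HDF5_PAR_INSTALL_PATH="$HDF5_INSTALL_PATH"
else
    HDF5_SEQ_INSTALL_PATH="$HDF5_INSTALL_PATH"
fi
echo "parallel HDF5: ${HDF5_PAR_INSTALL_PATH:-<unset>}"
echo "sequential HDF5: ${HDF5_SEQ_INSTALL_PATH:-<unset>}"
```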
HDF5 is now also linked as a shared library, as NVFORTRAN doesn't seem to handle the `-l:libhdf5.a` syntax correctly. The library directory from the `INSTALL_PATH` is now added to the rpath for convenience, to avoid having to set `LD_LIBRARY_PATH`.
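As a sketch, the rpath convenience amounts to link flags along these lines (the variable name and the prefix path are placeholders, not the actual makefile contents):

```shell
# Hypothetical link flags: link HDF5 as a shared library and embed the
# library directory in the executable's rpath, so LD_LIBRARY_PATH is not
# needed at run time.
HDF5_INSTALL_PATH=/opt/hdf5   # placeholder install prefix
HDF5_LINK="-L$HDF5_INSTALL_PATH/lib -lhdf5 -Wl,-rpath,$HDF5_INSTALL_PATH/lib"
echo "$HDF5_LINK"
```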
Disabling the MPI C++ bindings for MPT turned out to be non-obvious - there doesn't seem to be an official flag or environment variable to do this, so I ended up going with `-DMPIPP_H` to manually stop the header include, which seems a bit hacky. One thing to consider for the future might be to drop the `mpiXX` compiler wrappers and just specify the include/link directories to the compiler.
Thanks @bozbez. We definitely need all of these flags documented, as we are getting more and more of them!
Do you have a preference for where that should go? Currently there are old build instructions in `op2/README` and `apps/c/README`, but since a lot of the variables are now shared it might make more sense to add them to the top-level `README` for now, until the readthedocs stuff comes around.
Might also be a good opportunity to markdown-ify the `README` à la OPS.
@bozbez yes, a top-level README will do. Take a look at the README in OPS https://github.com/OP-DSL/OPS and its readthedocs https://ops-dsl.readthedocs.io/
It would be good to update the OPS READMEs and make both OPS and OP2 consistent as well (so we have a unified style/rules for both repos). Any good things you do with OP2 we'd like to do in OPS as well!
Gave the OpenMP4 offloading support a go with the NVHPC compilers, which have offloading support by default. With a bit of tweaking the C non-MPI openmp4 variant builds, but for some reason isn't detecting and using the GPUs, at least on Cirrus (and with no way to specify or control this). The Fortran openmp4 variants don't build due to an issue with the translator; the user kernel functions are missing a `!$omp declare target` annotation. The MPI variants also fail to build due to lack of runtime library support - given that these weren't built prior to this PR, this is to be expected.
For now I've excluded all but the C non-MPI openmp4 variant from the `all` target, and added a `TODO/openmp4` comment to some of the places that need attention to get full support working. But I think it is worth considering whether this support should be in the master branch at all, as it doesn't seem particularly ready. It might also be worth considering whether it is wanted at all, given that we might be pulling in the SYCL support soon.
Makes sense! OpenMP4 will be interesting on the Fortran side, as it's the only way to run on AMD GPUs from Fortran...
@bozbez, README looking good. There is a bunch of source files under the OP2-Common/scripts directory. Some of them are old, but we should retain at least one for each compiler, unless there is an argument for not keeping these (@reguly?). On our HPC nodes we have the latest Intel and nv_hpc setups and source files, which I will commit to this branch if so.
So I was kinda hoping that we would be able to incorporate anything worthwhile from the `scripts` directory as one of the new profiles, in a more structured manner... the problem is that a lot of them reference actual user directories and the like, which I don't really think is appropriate to put in this repository. See `makefiles/profiles/cirrus-intel-nvhpc.mk` for an example of something script-like as a profile.
There are a couple of issues that come to mind though - while the profiles work great for setting up compiler variables and environment paths, they are still just makefiles, which means running stuff like `source ...` and `module load ...` won't work, and even if it did, running them on every `make` invocation is almost certainly a bad idea.
Personally I think that anything running setup with `module load`s and the like should be written as a convenience by the user for their specific environment, if they feel they need such a thing - otherwise we end up committing files that are really specific only to one user on one machine, and are of minimal use as anything but a rough example for other people. These are also liable to be left unmaintained and unused, with no clear point of clean up...
@bozbez I can certainly see your point. I can see similarities between the source files and what you have in makefiles/profiles/cirrus-intel-nvhpc.mk. In it we have things like `NVCC := /lustre/sw/nvidia/hpcsdk-219/Linux_x86_64/21.9/compilers/bin/nvcc`, which surely corresponds to a module load on Cirrus, after which simply using nvcc will work for compilation. Given that, as you say, module loads should not be used in a Makefile, are these "profiles" useful at all? A user could think that something that can be built with a simple module load needs the absolute paths set to get OP2 working. On HPC clusters, finding these paths can sometimes be quite tricky. Besides, if Cirrus changes these locations then that would again be problematic in terms of utility and maintenance. In that case we should not have such profiles at all.
@reguly any comment on this? I am sort of leaning towards the simple source files we have already - just to showcase setting up the environment. But happy to do otherwise.
I never really used the source files committed to these repos, they were always hardcoded to your setup :) so I am fine with not having those committed, as @bozbez suggests.
@reguly actually I was more considering whether it's useful for a user to see one of these source files. Of course the hardcoded setup in the profiles has the same issue.
I don't think that is particularly relevant, as all the environment variables are properly documented.
@reguly ok, great. Let's leave it at that. I am happy with this.
@bozbez Thanks, your most recent push fixed the build issue and Hydra is subsequently running the basic Rotor 37 problem correctly. Continuing testing with the coupler now.
So the reason the `nvcc` executable is hardcoded there is because the Cirrus MPI module struggles to sort out its paths when more than one compiler is loaded - in particular, if you load `intel-compilers-19` and then `nvidia/nvhpc`, you end up with a broken `mpicxx`. Since the compilers on Cirrus already manually insert rpaths into `/lustre/sw` so that you don't need module loads to run previously compiled executables, I think that hardcoding the path here is an acceptable workaround.
I think it's also worth noting the difference between hardcoding paths to shared resources on known clusters and hardcoding paths into a user folder which others can't access.
The intention of the profiles, and the reason they are there instead of source files, is that they allow the user to set make variables after the "configuration" of the compilers and the libraries. This allows you to easily override or append to e.g. `CXXFLAGS` and `HDF5_LIB`, whereas source files would only allow you to override the variables up front. The solution there would be to add `_EXTRA` variables appended to the end of each configured variable (which might still be worth doing). Even with the `_EXTRA` variables though, the profiles still have the advantage of being able to do conditional environment setup using `ifdef` and the like.
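The "pre"/"post" split can be sketched with a toy pair of makefiles. File names, variable names, and flag values here are all illustrative stand-ins, not the actual OP2 profile files:

```shell
# Toy demonstration of a profile included twice: once before the main
# configuration (the "pre" pass, which overrides defaults) and once after
# (the "post" pass, which appends to configured variables).
mkdir -p /tmp/profile-demo && cd /tmp/profile-demo

cat > profile.mk <<'EOF'
ifndef PROFILE_POST
# "pre" section: override defaults before the main configuration runs
OPT_LEVEL := -O3
else
# "post" section: append to variables after the configuration has run
ALL_FLAGS += -march=native
endif
EOF

cat > common.mk <<'EOF'
include profile.mk              # "pre" section
OPT_LEVEL ?= -O2                # default, overridable by the pre section
ALL_FLAGS := $(OPT_LEVEL) -Wall
PROFILE_POST := 1
include profile.mk              # "post" section
all: ; @echo "flags: $(ALL_FLAGS)"
EOF

make -f common.mk
```

The pre pass overrides the `-O2` default with `-O3`, and the post pass appends `-march=native` to the assembled flags.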
Source files still have a place, but rather than using them for cluster-common setup like `-march=` and compiler paths, they should be used as a shortcut to set `OP2_PROFILE` and `X_INSTALL_PATH`, as well as doing the module loads. However, rather than committing them to the repo, I think it is reasonable to expect the user to produce these in a directory above, as a shortcut to manually setting the variables each time - as long as the variables are clearly documented.
I have tested OP2-Hydra, and OP2-Hydra with the coupler, with this new OP2 branch. All tests pass. We should get Arun to switch to this version when we are ready to merge this pull request.
So unfortunately I haven't been able to actually get an OpenMP 4.0 offloading build to offload - on Cirrus with the NVHPC SDK compilers the binaries fail to find the GPU, and on my new work machine with GCC built for offloading the same happens, though this time it might be that the GCC offloading doesn't support Ampere. The offload compilers are being invoked, and the PTX is generated, but no luck at runtime.
Since the builds complete and compiler flags seem fine and are equivalent to the old build system I am going to mark this task as complete despite not being able to actually run the binaries.
In general I think this is pretty much ready to merge now, there are perhaps a few things that might need tweaking over time (and perhaps for Spack) but that should not pose a significant problem given the level of flexibility in the unified system.
Which software stack did you use on Cirrus? 21.2 had problems with not finding the GPU, but the latest works. Also, I just got an email from Cirrus support earlier today saying that 21.2 should be fixed too.
I was using the default loaded by `nvidia/nvhpc`, which was 21.9 I think? I'll give it another go (if the login node stops timing out...).
@bozbez looks good. @reguly and @bozbez we should now move the current master to a release/2020 branch and merge this branch to master. Assuming this naming convention is ok. This PR is working with Hydra as I noted above, but I will need to push some minor modifications to Hydra which I can do soon.
Additionally, I was thinking of making a default develop branch (as we do in OPS). Did we have an agreement on whether this is OK? I would rather like to keep OP2 and OPS as similar as possible.
Thanks for setting this up! I agree with the separate branch for development!
@gihanmudalige approved this pull request.
I have created the release/2020 branch and pushed it in. I also created a v1.0 tag at the same commit.
Personally I don't think we gain anything from having a develop branch and a master branch; the feature development is already done in separate feature branches and the stable versions are now in release branches.
I agree that we should try and align with OPS though, so there's that.
@reguly thanks for creating the relevant branches. So we are going with the convention of v1.0 etc. (no problem). I suppose this PR and also the completion of the other tasks will soon get us to v2.0 ? Particularly the new code gen should take us to 2.0 I think.
W.r.t. the develop branch, I think it will help external developers to PR onto develop, and then we can test and merge their changes into develop. This way we are sure that the master branch is always working, and we periodically merge in the develop branch as major features are added. So in effect we have a leading-edge branch and a "stable" branch.
What we are not used to doing (but we probably should!) is doing releases, which would also help address this master/develop branch difference (well, it would obviate it)
Changes

- The `op2/c` and `op2/fortran` source folders were merged, with `op2/fortran` moving to `op2/include/fortran` and `op2/src/fortran`.
- Generated code in `apps/` is now built with a new `make generate` rule.

New build system
The old build system was entirely removed and a new one with the following structure implemented:

- `makefiles/common.mk` - This sets variables common to both the OP2 and app builds, such as the OP2 include and library directories, as well as the compilers and flags (by including `makefiles/compilers/*`) and the include/link flags of the dependencies.
- `makefiles/compilers/{c, c_cuda, fortran}/*.mk` - These Makefiles follow a common structure and define the compiler executables, flags, and features for each compiler. The selection of these is controlled by `OP2_COMPILER` and/or `OP2_{C, C_CUDA, FORTRAN}_COMPILER`.
- `makefiles/profiles/*.mk` - These follow a custom structure that splits the Makefiles into a "pre" and a "post" section, where the "pre" section is included at the start of `common.mk` and the "post" section at the end. The intention here is to allow for machine-specific configuration of compilers, libraries and flags without needing to change the generic compiler makefiles. Variables set in the "pre" section override the defaults set by `common.mk` and the compiler makefiles, while variables set in the "post" section can append to them.
- `makefiles/{c, f}_app.mk` - These are generic makefiles that define rules to build all of the possible app variants (e.g. `[mpi_]{seq, genseq, openmp, openmp4, cuda}`), and are included by each of the makefiles in the app directories with a selection of input variables such as `APP_NAME` and `VARIANT_FILTER`.
- `op2/Makefile` - This defines the build for the OP2 libraries in `op2/lib` (by default). Object lists for each of the static libraries are first defined, and then the object builds are handled by pattern rules.
- `apps/*/Makefile` - These simply define `APP_NAME` and sometimes `VARIANT_FILTER[_OUT]`, and then include `common.mk` and `{f, c}_app.mk`.

As well as reducing duplication and inconsistency, the new system also allows for parallel builds of the OP2 libraries as well as the apps. In addition, incremental builds with automatically generated header dependencies (via `-MMD -MP`) now work.

Building
The process is kept as similar as possible to the previous system, but with more configurability. For a simple build:

1. `export OP2_COMPILER={gnu, cray, intel, ...}`
2. `export {PARMETIS, PTSCOTCH, HDF5}_INSTALL_PATH=<path>`
3. `cd op2; make -j`.
4. `cd apps/c/airfoil_hdf5; make generate; make -j`.
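The per-app makefile pattern described under "New build system" can be illustrated with a toy stand-in (this is not the real `c_app.mk`; the variant list and rules are simplified to show only how `APP_NAME` and `VARIANT_FILTER_OUT` drive the generated targets):

```shell
# Toy stand-in for the app makefile structure: the app Makefile sets
# APP_NAME and an optional variant filter, then includes a generic
# makefile that derives the build targets from those variables.
mkdir -p /tmp/app-demo && cd /tmp/app-demo

cat > c_app.mk <<'EOF'
VARIANTS := seq genseq openmp cuda
ACTIVE := $(filter-out $(VARIANT_FILTER_OUT),$(VARIANTS))
all: $(addprefix $(APP_NAME)_,$(ACTIVE))
$(APP_NAME)_%: ; @echo "would build $@"
EOF

cat > Makefile <<'EOF'
APP_NAME := airfoil
VARIANT_FILTER_OUT := cuda
include c_app.mk
EOF

make
```

Each app directory thus stays a few lines long, with all the variant logic living in the shared generic makefile.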
TODO

- `nvfortran`.
- `#include`s in headers that occasionally break incremental compilation.
- `_openmp4` app builds with an offloading-enabled compiler.