Closed uturuncoglu closed 4 years ago
Is anyone working on this? Otherwise I'll take a look.
I just looked at this and I think it is because the mpi.mod module under mpt is not compatible with gnu 9.1.0. I was able to get the project to build using openmpi and gnu 9.1.0 instead of mpt, however.
@mark-a-potts We are using MPT on Cheyenne as a default MPI. Do you think that it still fails with MPT?
@uturuncoglu @mark-a-potts We have been building and testing the UFS successfully with GNU 8.3.0 and MPT - do you want to try if this works? Note also the discussion about compiler versions in issue https://github.com/ufs-community/ufs-mrweather-app/issues/13.
I was in the midst of testing MPT with gnu 8.3.0 when I got booted off of Cheyenne for taking up too much of the head node. I think that the problem is with the MPT module on Cheyenne not being compatible with gnu 9.1.0, though.
-Mark
-- Mark A. Potts, Ph.D. Sr. HPC Software Developer RedLine Performance Solutions, LLC Phone 202-744-9469 Mark.Potts@noaa.gov mpotts@redlineperf.com
@mark-a-potts - you need to use qcmd on Cheyenne to avoid being booted off. See https://dailyb.cisl.ucar.edu/bulletins/cisl-adds-qcmd-script-launching-resource-intensive-compilation-jobs for details.
Okay, I think I have figured out the fix/workaround. To get the right includes, the build needs to use the mpif90 wrapper from mpt rather than gfortran. So, if you set the following environment variables before running the cmake command, things seem to work (for gnu 8.3.0, at least)--
export FC=mpif90
export CC=mpicc
export CXX=mpicxx
cmake -DMPITYPE=mpt -DCMAKE_INSTALL_PREFIX=$PWD/install ..
make
-M
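As a rough sketch of the workaround above, the snippet below checks that the MPI wrappers are actually on PATH before the compiler variables are exported. The helper name check_wrappers is hypothetical and is not part of the NCEPLIBS build; it assumes the compiler and MPT modules have already been loaded.

```shell
# Sketch: confirm the MPI compiler wrappers are on PATH before configuring.
# check_wrappers is a hypothetical helper, not part of the NCEPLIBS build.
check_wrappers() {
    missing=""
    for tool in "$@"; do
        # command -v succeeds only if the tool can be found
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}

# Usage (assumes the gnu and mpt modules are loaded):
#   check_wrappers mpif90 mpicc mpicxx && export FC=mpif90 CC=mpicc CXX=mpicxx
```

If any wrapper is reported as missing, the MPT module is probably not loaded, and cmake would most likely fall back to the plain compilers.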
Hello
Has this workaround been tested? If it is documented, can we close this ticket?
@climbfuji Can this ticket be closed?
I haven't tested this myself yet.
@mark-a-potts I am planning to test the model on another platform, but I get the following error when I try to install NCEPLIBS on Stampede2.
Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
fatal: clone of 'git@github.com:NOAA-EMC/netcdf-c.git' into submodule path '/scratch/01118/tg803972/PROGS/NCEPLIBS.dec30/netcdf' failed
I think it is related to the entry in the .gitmodules file. The netcdf submodule is configured to use ssh, while the others use https.
...
[submodule "NCEPLIBS-post"]
path = NCEPLIBS-post
url = https://github.com/climbfuji/EMC_post
branch = update_ufs_release_1p0_macos_gnu
[submodule "netcdf"]
path = netcdf
url = git@github.com:NOAA-EMC/netcdf-c.git
branch = update_ufs_release_1p0_macos_gnu
[submodule "UFS_UTILS"]
path = UFS_UTILS
url = https://github.com/climbfuji/UFS_UTILS.git
branch = update_ufs_release_1p0_macos_gnu
...
So, I think that netcdf also needs to use https to allow cloning without a Git ssh setup.
@mark-a-potts NCEPLIBS-bufr also gives the following error. The hash might be wrong.
Submodule path 'NCEPLIBS-bacio': checked out 'bf2f2261e9f425e04874205fc106ae6a52bb5bb8'
error: no such remote ref 0c5aaf0efc7b2562ba5b3d8ed3473db8921f95f8
Fetched in submodule path 'NCEPLIBS-bufr', but it did not contain 0c5aaf0efc7b2562ba5b3d8ed3473db8921f95f8. Direct fetching of that commit failed.
We should decide on using either ssh or https for the submodules (I prefer ssh), but you should be able to pull the submodule if you upload your id_rsa.pub key from $HOME/.ssh on stampede to your github account. Alternatively, you can change the url in .gitmodules to use the https:// nomenclature instead of ssh.
-M
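For the second option (switching the URL in .gitmodules to https), a minimal sketch could look like the following. The helper names ssh_to_https and use_https_for_submodule are hypothetical; the git commands themselves (git config --file and git submodule sync) are standard, but whether this is the right change for the NCEPLIBS repository is for its maintainers to decide.

```shell
# Sketch: rewrite an ssh-style GitHub URL to its https form.
# Hypothetical helper names; only GitHub ssh URLs are rewritten.
ssh_to_https() {
    # git@github.com:OWNER/REPO.git -> https://github.com/OWNER/REPO.git
    printf '%s\n' "$1" | sed -e 's#^git@github\.com:#https://github.com/#'
}

# Update one submodule entry in .gitmodules and propagate it to .git/config.
# Run from the top of the superproject; $1 is the submodule name, e.g. netcdf.
use_https_for_submodule() {
    old_url=$(git config --file .gitmodules "submodule.$1.url") || return 1
    git config --file .gitmodules "submodule.$1.url" "$(ssh_to_https "$old_url")"
    git submodule sync -- "$1"
}

# Usage:
#   use_https_for_submodule netcdf
#   git submodule update --init --recursive
```

URLs that already use https pass through ssh_to_https unchanged, so the helper can be applied to every submodule without harm.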
Try running "git remote update" from NCEPLIBS-bufr. I just pushed that commit up to github earlier today, so it is probably not in your local repo.
-M
For all libraries, you can use the ufs release branch rather than a single head. I suggest that you directly check out the ufs_release_v1.0 branch in the NCEPLIBS-bufr repo. Kyle has recently checked that the ufs_release branches in each NCEPLIBS repo work identically to the latest operational updates.
@mark-a-potts NCEPLIBS-bufr seems to use the develop branch. Do I need to use update_ufs_release_1p0_macos_gnu?
Last commit for develop is
commit afaa8a002a366ebadc74db9c469255c125cec309
Author: Dexin.Zhang <dexin.zhang@noaa.gov>
Date: Fri Oct 4 19:49:54 2019 +0000
Unified build (20191004) script/makefile bufr
Mark - My recommendation would be to use the https access, so that anonymous users can clone the submodules without requiring a GitHub account. Makes it much simpler to teach beginning students, as well!
Laurie
Yes, I agree with @llpcarson.
Okay, I figured out what is going on. I pushed up changes that included Dom's submodule links. I have reverted those changes, so you should be able to clone again. I will push my changes (correctly) later today or tomorrow. Sorry for the confusion.
-M
Hi Mark & everyone else,
a couple of things:
(1) When PRs are created, the creator must temporarily modify .gitmodules to point to his/her fork and branch if updates are required for submodules.
(2) Merging code from PRs with submodules requires coordination with the person making the PRs. From the "innermost" nested PR up to the top-level PR, the PRs need to be merged as-is. After each merge, the person creating the PRs has to update his/her local code to check out the merged version, revert the change to .gitmodules, and push this to GitHub to update the PR. And so on and so forth.
(3) Checking out the code ufs_release_1.0 should always be as follows:
git clone https://github.com/NOAA-EMC/NCEPlibs
cd NCEPlibs
git checkout ufs_release_1.00
git submodule update --init --recursive
(4) Checking out a PR with id ID for testing it should always be as follows:
git clone https://github.com/NOAA-EMC/NCEPlibs
cd NCEPlibs
git fetch origin pull/ID/head:BRANCHNAME
git checkout BRANCHNAME
git submodule update --init --recursive
For my current NCEPLIBS PRs, this is simple: first, accept and merge the cmake PR - which you did. Then let me update cmake to point back to NOAA-EMC in all repositories that use the cmake modules repository as submodule - which I did. Then merge the NCEPLIBS-xyz PRs (can be done in parallel) - hasn't been done yet. Then let me update my NCEPLIBS umbrella PR to point the submodules back to NOAA-EMC. Then merge this one. Done.
This is the approach the ufs-weather-model has taken after evaluating all the pros and cons for code reviewers, code developers and users.
Best,
Dom
@mark-a-potts I tried a fresh install, but I am still getting the following error:
Submodule path 'NCEPLIBS-bacio': checked out 'bf2f2261e9f425e04874205fc106ae6a52bb5bb8'
Submodule path 'NCEPLIBS-bufr': checked out 'afaa8a002a366ebadc74db9c469255c125cec309'
error: no such remote ref 88d78c774003566b180c593eaf2758ad60b1af73
Fetched in submodule path 'NCEPLIBS-crtm', but it did not contain 88d78c774003566b180c593eaf2758ad60b1af73. Direct fetching of that commit failed.
Do I need to issue git update?
I would wait until the merge of my PR has been reverted properly and then check out the code as I wrote above.
@climbfuji OK. Let me know when it is ready and I can test it.
Okay, I think I have the submodule pointers back to Kyle's commit of Dec
-M
Thanks @mark-a-potts. @uturuncoglu did you have a chance to test my versions of NCEPlibs on Cheyenne? If they work for you, we can proceed with merging the PRs into the ufs_release_1.0 branches for the submodules NCEPLIBS-*, and do the final commit to the umbrella repository.
None of the changes I made/propose to the submodules will be effective until the last commit is made (because checking out the code as described above will give you the hashes that the top-level umbrella repository is pointing to).
We need the changes if we want to give Phil and his team something to test (on other/generic platforms).
@climbfuji Do I need to use the following commands to test your version?
(4) Checking out a PR with id ID for testing it should always be as follows:
git clone https://github.com/NOAA-EMC/NCEPlibs
cd NCEPlibs
git fetch origin pull/ID/head:BRANCHNAME
git checkout BRANCHNAME
git submodule update --init --recursive
What is the BRANCHNAME?
No, please just point directly to the installation directories that I listed above, do not build the code.
This seems to be a general misunderstanding anyway. I was under the impression that we do not want CIME to build the NCEPlibs. This is something the user should do following the user's guide. But maybe I am wrong.
@climbfui @Ufuk Turuncoglu turuncu@ucar.edu - we definitely do not want CIME to build the NCEP libs.
@climbfuji You mean pointing to the NCEPLIBS installation directories on Cheyenne, right? It is a little confusing for me because I am trying to install NCEPLIBS on Stampede and test it there. Do you want me to test the model build with your NCEPLIBS installation? BTW, I could not find the directories in the previous posts.
@climbfuji @mark-a-potts I have successfully installed NCEPLIBS on stampede2, but when I run chgres, I get the following error:
- FATAL ERROR: IN GridCreateMosaic
- IOSTAT IS: 49
- FATAL ERROR: IN GridCreateMosaic
I looked into the error, and it seems that chgres is failing in the following call (UFS_UTILS/sorc/chgres_cube.fd/model_grid.F90):
print*,"- CALL GridCreateMosaic FOR TARGET GRID"
target_grid = ESMF_GridCreateMosaic(filename=trim(mosaic_file_target_grid), &
                                    regDecompPTile=decomptile, &
                                    staggerLocList=(/ESMF_STAGGERLOC_CENTER, ESMF_STAGGERLOC_CORNER, &
                                                     ESMF_STAGGERLOC_EDGE1, ESMF_STAGGERLOC_EDGE2/), &
                                    indexflag=ESMF_INDEX_GLOBAL, &
                                    tileFilePath=trim(orog_dir_target_grid), rc=error)
if(ESMF_logFoundError(rcToCheck=error,msg=ESMF_LOGERR_PASSTHRU,line=__line__,file=__file__)) &
   call error_handler("IN GridCreateMosaic", error)
The files used in this call are in the right place and they seem correct. In this case, I am using both the netCDF and ESMF installations from NCEP_LIBS, and I am using the following environment on stampede2:
Currently Loaded Modules:
1) git/2.9.0 2) autotools/1.1 3) xalt/2.7.9 4) TACC 5) intel/18.0.2 6) libfabric/1.7.0 7) cmake/3.10.2 8) impi/18.0.2 9) python2/2.7.15
It could be an issue with the ESMF installation, but to be sure I need to look at the flags used for the ESMF build. In the meantime, do you have any idea about the error? I wonder whether anybody has tested chgres using an NCEP_LIBS build with the Intel MPI and compiler combination.
I think I found the problem. The ESMF that is shipped with NCEPLIBS is not compiled with netCDF support. The esmf.mk file in my installation looks as follows:
#
# !!! The following options were used on this ESMF build !!!
#
# ESMF_DIR: /work/01118/tg803972/stampede2/UFS/NCEPLIBS/esmf
# ESMF_OS: Linux
# ESMF_MACHINE: x86_64
# ESMF_ABI: 64
# ESMF_COMPILER: intel
# ESMF_BOPT: O
# ESMF_COMM: intelmpi
# ESMF_SITE: default
# ESMF_PTHREADS: ON
# ESMF_OPENMP: ON
# ESMF_OPENACC: OFF
# ESMF_ARRAY_LITE: FALSE
# ESMF_NO_INTEGER_1_BYTE: TRUE
# ESMF_NO_INTEGER_2_BYTE: TRUE
# ESMF_FORTRANSYMBOLS: default
# ESMF_MAPPER_BUILD: OFF
# ESMF_AUTO_LIB_BUILD: ON
# ESMF_DEFER_LIB_BUILD: ON
# ESMF_SHARED_LIB_BUILD: ON
#
# ESMF environment variables pointing to 3rd party software:
# ESMF_MOAB: internal
# ESMF_LAPACK: internal
# ESMF_ACC_SOFTWARE_STACK: none
# ESMF_YAMLCPP: internal
# ESMF_PIO: internal
and there is no ESMF_NETCDF definition. Also, according to the ESMF documentation, error code 49 is ESMF_RC_LIB_NOT_PRESENT. I had assumed that the NCEPLIBS superbuild was also tested with chgres. Am I wrong? I did not face this problem on Cheyenne before because I was pointing to my own ESMF installation there, which was built with netCDF support.
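The check described above (looking for an ESMF_NETCDF entry in esmf.mk) can be scripted. This is only a sketch: the helper name esmf_has_netcdf is hypothetical, and it assumes the build options appear as comment lines of the form "# ESMF_NETCDF: value", as in the listing above.

```shell
# Sketch: report whether an esmf.mk records a netCDF-enabled build.
# Hypothetical helper; assumes option lines look like "# ESMF_NETCDF: ...".
esmf_has_netcdf() {
    # $1: path to esmf.mk
    grep -q '^# *ESMF_NETCDF' "$1"
}

# Usage:
#   if esmf_has_netcdf /path/to/esmf.mk; then
#       echo "ESMF was built with netCDF settings"
#   else
#       echo "no ESMF_NETCDF entry found"
#   fi
```

A missing entry would be consistent with the ESMF_RC_LIB_NOT_PRESENT return code seen in chgres, although it does not prove that netCDF is the only missing library.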
I think that the ESMF shipped with NCEPLIBS needs to be built with netCDF support, using the netCDF packaged with NCEPLIBS. To do that, you just need to set the following environment variable, but in this case nc-config needs to point to the netCDF installation of NCEPLIBS.
export ESMF_NETCDF=nc-config
More update about the issue:
I tried to install ESMF externally with the netCDF shipped with NCEPLIBS. When I set the environment variable ESMF_NETCDF=nc-config, it gives the following error:
/work/01118/tg803972/stampede2/UFS/ESMF/8.0.0/build/common.mk:1383: *** invalid syntax in conditional. Stop.
This is because ESMF tries to use the nf-config --prefix command to get information about the Fortran installation, but this command also fails with the following error:
nf-config not yet implemented for cmake builds
So the nf-config command, which is required for the ESMF installation, does not work when netCDF is built with cmake. We need a workaround until the netCDF cmake build supports nf-config.
The easiest solution is to define all the NETCDF environment variables required by ESMF explicitly, rather than using the nc-config option in the ESMF_NETCDF environment variable. The following environment definitions worked for me on stampede2, at least for an external ESMF installation that uses the NCEPLIBS netCDF:
export ESMF_NETCDF=split
export ESMF_NETCDF_INCLUDE=/work/01118/tg803972/stampede2/UFS/NCEPLIBS/build-all/install/include
export ESMF_NETCDF_LIBS="-lnetcdff -lnetcdf"
export ESMF_NETCDF_LIBPATH=/work/01118/tg803972/stampede2/UFS/NCEPLIBS/build-all/install/lib64
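To avoid hard-coding the install path, the same split settings could be derived from a single prefix. This is a sketch under stated assumptions: the function name set_esmf_netcdf_split is hypothetical, and it assumes the netCDF headers live in PREFIX/include and the libraries in PREFIX/lib64 (falling back to PREFIX/lib), as in the paths above.

```shell
# Sketch: export the ESMF "split" netCDF settings from one install prefix.
# Hypothetical helper; the directory layout is an assumption based on the
# stampede2 paths quoted above.
set_esmf_netcdf_split() {
    prefix=$1
    libdir="$prefix/lib64"
    # Some installs use lib instead of lib64
    [ -d "$libdir" ] || libdir="$prefix/lib"
    export ESMF_NETCDF=split
    export ESMF_NETCDF_INCLUDE="$prefix/include"
    export ESMF_NETCDF_LIBS="-lnetcdff -lnetcdf"
    export ESMF_NETCDF_LIBPATH="$libdir"
}

# Usage, before building ESMF:
#   set_esmf_netcdf_split /path/to/NCEPLIBS/build-all/install
```

The variable names and values are the ones given above; only the prefix handling is added.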
Thanks Ufuk.
I think we can get that put into the top level CMakeLists.txt file so that everything gets picked up correctly. I'll take a look at it on Monday.
-M
On 1/3/20 5:36 PM, Ufuk Turunçoğlu wrote:
More update about the issue:
I am trying to install ESMF externally with the netCDF shipped with NCEPLIBS. When I set the environment variable ESMF_NETCDF, it gives the following error:
/work/01118/tg803972/stampede2/UFS/ESMF/8.0.0/build/common.mk:1383: *** invalid syntax in conditional. Stop.
This is because ESMF tries to use the nf-config --prefix command to get information about the netCDF Fortran installation, but this command also gives the following error:
nf-config not yet implemented for cmake builds
So the nf-config command, which is required for the ESMF installation, does not work when netCDF is built with CMake. We need to find a workaround until the netCDF CMake build supports nf-config.
The easiest solution is to define all of the netCDF environment variables required by ESMF explicitly rather than using the nc-config option in the ESMF_NETCDF environment variable. The following environment definitions worked for me on Stampede2, at least for an external ESMF installation that uses the NCEPLIBS netCDF:
export ESMF_NETCDF=split
export ESMF_NETCDF_INCLUDE=/work/01118/tg803972/stampede2/UFS/NCEPLIBS/build-all/install/include
export ESMF_NETCDF_LIBS="-lnetcdff -lnetcdf"
export ESMF_NETCDF_LIBPATH=/work/01118/tg803972/stampede2/UFS/NCEPLIBS/build-all/install/lib64
Thanks @mark-a-potts. Let me know if you need help; I could also test it on Stampede.
@uturuncoglu I will have something for you to test later today, which includes a fix for the issue you reported.
@climbfuji Thanks.
@mark-a-potts @climbfuji Just for your information, I was trying to build NCEPLIBS with an external ESMF installation to continue my development until this issue is solved, but NCEPLIBS gives the following error in the chgres step. BTW, I did a fresh checkout today.
-- Found HDF5: hdf5::hdf5-shared;hdf5::hdf5_fortran-shared (found version "1.10.5") found components: C HL
setting intel flags
sfcio lib is /work/01118/tg803972/stampede2/UFS/NCEPLIBS.jan06/build-all/install/lib/libsfcio_v1.1.0.a /work/01118/tg803972/stampede2/UFS/NCEPLIBS.jan06/build-all/install/lib/libw3nco_v2.0.6_d.a
CMake Warning (dev) at CMakeLists.txt:58 (add_subdirectory):
The source directory
/work/01118/tg803972/stampede2/UFS/NCEPLIBS.jan06/UFS_UTILS/sorc/global_chgres.fd
does not contain a CMakeLists.txt file.
CMake does not support this case but it used to work accidentally and is
being allowed for compatibility.
Policy CMP0014 is not set: Input directories must have CMakeLists.txt. Run
"cmake --help-policy CMP0014" for policy details. Use the cmake_policy
command to set the policy and suppress this warning.
This warning is for project developers. Use -Wno-dev to suppress it.
-- Configuring done
-- Generating done
-- Build files have been written to: /work/01118/tg803972/stampede2/UFS/NCEPLIBS.jan06/build-all/UFS_UTILS/src/UFS_UTILS-build
[ 68%] Performing build step for 'UFS_UTILS'
Scanning dependencies of target chgres_cube.exe
[ 9%] Building Fortran object sorc/chgres_cube.fd/CMakeFiles/chgres_cube.exe.dir/program_setup.f90.o
/work/01118/tg803972/stampede2/UFS/NCEPLIBS.jan06/UFS_UTILS/sorc/chgres_cube.fd/program_setup.f90(244): (col. 13) remark: program_setup_mp_calc_soil_params_driver_ has been targeted for automatic cpu dispatch
/work/01118/tg803972/stampede2/UFS/NCEPLIBS.jan06/UFS_UTILS/sorc/chgres_cube.fd/program_setup.f90(390): (col. 13) remark: program_setup_mp_calc_soil_params_ has been targeted for automatic cpu dispatch
[ 18%] Building Fortran object sorc/chgres_cube.fd/CMakeFiles/chgres_cube.exe.dir/model_grid.F90.o
/work/01118/tg803972/stampede2/UFS/NCEPLIBS.jan06/UFS_UTILS/sorc/chgres_cube.fd/model_grid.F90(58): error #7002: Error in opening the compiled module file. Check INCLUDE paths. [ESMF]
Sorry, this is my fault; I set ESMF_INC incorrectly. I'll test again.
No, after I fixed the environment variable, I am still getting the same error. This was working before, but it seems broken now. I am seeing the following weird entry in UFS_UTILS-build/CMakeCache.txt:
/work/01118/tg803972/stampede2/UFS/ESMF/8.0.0/install_dir/lib/libO/Linux.intel.64.intelmpi.default/../mod
The ESMF environment variables in this case are defined as follows:
ESMF_NETCDF=split
ESMF_LIB=/work/01118/tg803972/stampede2/UFS/ESMF/8.0.0/install_dir/lib/libO/Linux.intel.64.intelmpi.default
ESMF_TESTTRACE=ON
LD_LIBRARY_PATH=/work/01118/tg803972/stampede2/UFS/ESMF/8.0.0/install_dir/lib/libO/Linux.intel.64.intelmpi.default:/opt/apps/intel18/python2/2.7.15/lib:/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/lib:/opt/apps/libfabric/1.7.0/lib:/opt/intel/debugger_2018/libipt/intel64/lib:/opt/intel/debugger_2018/iga/lib:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64:/opt/apps/gcc/6.3.0/lib64:/opt/apps/gcc/6.3.0/lib:/work/01118/tg803972/stampede2/UFS/ESMF/8.0.0/install_dir/lib/libO/Linux.intel.64.intelmpi.default
ESMF_INC=/work/01118/tg803972/stampede2/UFS/ESMF/8.0.0/install_dir/mod/modO/Linux.intel.64.intelmpi.default
ESMF_YAMLCPP=internal
ESMF_DIR=/work/01118/tg803972/stampede2/UFS/ESMF/8.0.0
ESMF_NETCDF_LIBPATH=/work/01118/tg803972/stampede2/UFS/NCEPLIBS/build-all/install/lib64
ESMF_COMM=intelmpi
ESMF_INSTALL_PREFIX=/work/01118/tg803972/stampede2/UFS/ESMF/8.0.0/install_dir
ESMFMKFILE=/work/01118/tg803972/stampede2/UFS/ESMF/8.0.0/install_dir/lib/libO/Linux.intel.64.intelmpi.default/esmf.mk
ESMF_PIO=internal
ESMF_BOPT=O
__LMOD_REF_COUNT_LD_LIBRARY_PATH=/opt/apps/intel18/python2/2.7.15/lib:1;/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/lib:1;/opt/apps/libfabric/1.7.0/lib:1;/opt/intel/debugger_2018/libipt/intel64/lib:1;/opt/intel/debugger_2018/iga/lib:1;/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/../tbb/lib/intel64_lin/gcc4.4:1;/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/lib/intel64_lin:1;/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:1;/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin:1;/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin:2;/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64:1;/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64:1;/opt/apps/gcc/6.3.0/lib64:1;/opt/apps/gcc/6.3.0/lib:1;/work/01118/tg803972/stampede2/UFS/ESMF/8.0.0/install_dir/lib/libO/Linux.intel.64.intelmpi.default:3
ESMF_NETCDF_LIBS=-lnetcdff -lnetcdf
ESMF_NETCDF_INCLUDE=/work/01118/tg803972/stampede2/UFS/NCEPLIBS/build-all/install/include
ESMF_COMPILER=intel
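The "Error in opening the compiled module file" message means the compiler did not find the ESMF module in any include path, so a quick check is to confirm which candidate directory actually contains it. The sketch below assumes the compiled module file is named `esmf.mod` (the usual lowercase naming, but an assumption here), and the function name `find_esmf_mod` is hypothetical.

```shell
# Hypothetical diagnostic: print the first directory that contains esmf.mod.
# Assumes the compiled ESMF module file is named esmf.mod.
find_esmf_mod() {
  for _dir in "$@"; do
    if [ -f "$_dir/esmf.mod" ]; then
      printf '%s\n' "$_dir"
      return 0
    fi
  done
  echo "esmf.mod not found in: $*" >&2
  return 1
}

# Example: compare the user-provided path with the one recorded in CMakeCache.txt.
# find_esmf_mod "$ESMF_INC" "$ESMF_LIB/../mod"
```

If the function prints the ESMF_INC directory but the build still fails, the build is likely using a different path than the one set in the environment.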
It seems that ESMF_INC is hard-coded as follows in the top-level CMakeLists.txt:
# Note - this only works if the include directory sits next to the lib directory
set(ESMF_INC ${ESMF_LIBSDIR}/../mod)
So it would be better to also support a user-provided value for ESMF_INC, because the installation directory could differ from the source directory. Let me know what you think.
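The suggestion amounts to a precedence rule: use ESMF_INC from the environment when it is set, and otherwise fall back to the directory derived from the library path. The actual change would belong in CMakeLists.txt; the shell function below only illustrates the rule. Its name `resolve_esmf_inc` is hypothetical, and it assumes ESMF_LIB plays the role of ESMF_LIBSDIR.

```shell
# Hypothetical illustration of the precedence rule (the real fix belongs in CMake).
# Prints the include directory to use; reads ESMF_INC and ESMF_LIB from the environment.
# Assumes ESMF_LIB stands in for ESMF_LIBSDIR from CMakeLists.txt.
resolve_esmf_inc() {
  if [ -n "${ESMF_INC:-}" ]; then
    # A user-provided value always wins.
    printf '%s\n' "$ESMF_INC"
  elif [ -n "${ESMF_LIB:-}" ]; then
    # Fallback matching the hard-coded rule: mod/ next to the library directory.
    printf '%s\n' "$ESMF_LIB/../mod"
  else
    echo "neither ESMF_INC nor ESMF_LIB is set" >&2
    return 1
  fi
}
```

With this ordering, an installation whose module directory does not sit next to the library directory still works as long as ESMF_INC is exported.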
Sorry, both of them were user-provided before:
set(ESMF_LIB $ENV{ESMF_LIB})
set(ESMF_INC $ENV{ESMF_INC})
Please wait ... a fix for this was discussed last week and merged into the ufs_release_1.0, but the submodule pointers were reverted back. Just wait until you get the clear-to-go from one of us, please, we don't want to waste your time.
On Jan 6, 2020, at 1:13 PM, Ufuk Turunçoğlu notifications@github.com wrote:
BTW, it was defined as
set(ESMF_INC ${CMAKE_INSTALL_PREFIX}/include)
before.
@climbfuji Sure. Thanks.
@uturuncoglu @mark-a-potts @kgerheiser @DusanJovic-NOAA @Hang-Lei-NOAA and everyone else. Have a look at https://github.com/NOAA-EMC/NCEPLIBS/pull/12. I didn't mark it as open for review yet, because I have only tested it on my Mac thus far (for two scenarios: building everything except the compiler and the MPI library, and building only ESMF and the NCEPLIBS). If you want to check out the code and test building/using it with the model, please do so and provide feedback on what works and what doesn't. Thanks.
@climbfuji Does this address the issue that I raised yesterday about building NCEPLIBS with an external ESMF installation? I am not sure, because it still looks in the following directory:
# Note - this only works if the include directory sits next to the lib directory
set(ESMF_INC ${ESMF_LIBSDIR}/../mod)
Anyway, I'll test it and let you know.
Yes, this issue is addressed. However I just found that the ESMF netCDF support is still not correct. I will push an update in a few minutes. Hold on.
Okay. I am waiting for your changes.
@climbfuji I could install your branch successfully, but I have not been able to test it with the model yet. I'll test it and let you know.
@climbfuji I tested your branch and can confirm that I could run both chgres and NCEP post without any problem. Thanks for your help. BTW, did you merge those changes into the main repository?
Yes, this is now in ufs_release_v1.0. I will close this issue. We will need to make sure that the documentation covers using the MPI wrappers as compilers (this is a good thing anyway, not only for Cheyenne).
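Using the MPI wrappers as compilers typically means exporting CC, CXX, and FC before the first CMake configure step. The sketch below assumes the wrappers are named mpicc, mpicxx, and mpif90 (common names, but they vary by MPI library and site). The function name `use_mpi_wrappers` and the `MPICC`/`MPICXX`/`MPIFC` overrides are hypothetical.

```shell
# Hypothetical helper: point CC/CXX/FC at MPI compiler wrappers.
# Wrapper names are assumptions; override MPICC, MPICXX, MPIFC if your site differs.
use_mpi_wrappers() {
  _cc="${MPICC:-mpicc}"
  _cxx="${MPICXX:-mpicxx}"
  _fc="${MPIFC:-mpif90}"
  # Refuse to change anything unless all three wrappers are on PATH.
  for _w in "$_cc" "$_cxx" "$_fc"; do
    if ! command -v "$_w" >/dev/null 2>&1; then
      echo "MPI wrapper not found on PATH: $_w" >&2
      return 1
    fi
  done
  export CC="$_cc" CXX="$_cxx" FC="$_fc"
}

# Usage (before configuring the build):
# use_mpi_wrappers && cmake ..
```

Checking for the wrappers first gives a clear error at setup time instead of a later compiler-detection failure.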
I am trying to install NCEPLIBS with the following module combination on Cheyenne, but
1) ncarenv/1.3
2) gnu/9.1.0
3) mpt/2.19
4) netcdf-mpi/4.7.1
5) pnetcdf/1.11.1
6) ncarcompilers/0.5.0
7) esmf-8.0.0-ncdfio-mpt-O
8) cmake/3.14.4
it fails with the following error:
The commands used to install the libraries are as follows: