NOAA-EMC / UPP

Refactor EMC_post decomposition from 1D to 2D as part of EMC_post refactoring #274

Closed GeorgeVandenberghe-NOAA closed 2 years ago

GeorgeVandenberghe-NOAA commented 3 years ago

EMC_post is currently decomposed on latitude (J) only. This is adequate for several more years, but since the post is generally being refactored, now is a good time to make the jump to 2D. A second goal is to make the 2D decomposition either flexible or have it mimic the ufs-weather-model decomposition, so developers working on both codes can exploit commonality. This will be a modestly difficult project, with most of the effort going into figuring out the plumbing of the code (in progress). This issue is being created for management and project leader tracking. Per EMC management directives and best practices, results should be tracked through this GitHub issue or Slack, NOT email.
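To make the 1D versus 2D distinction concrete, here is a minimal sketch of how a rank's index ranges might be computed. The names ista, iend, jsta, jend, im, jm and numx are the ones used later in this thread; the block-partition helper, the rank-to-tile ordering (I varies fastest) and the grid sizes are assumptions for illustration only, not the actual MPI_FIRST.f logic.

```fortran
module decomp_sketch
  implicit none
contains
  ! Split indices 1..n into nparts nearly equal contiguous blocks and
  ! return the bounds of block ipart (0-based). Hypothetical helper.
  pure subroutine block_bounds(n, nparts, ipart, lo, hi)
    integer, intent(in)  :: n, nparts, ipart
    integer, intent(out) :: lo, hi
    integer :: base, extra
    base  = n / nparts
    extra = mod(n, nparts)
    ! The first "extra" blocks get one additional point.
    lo = ipart*base + min(ipart, extra) + 1
    hi = lo + base - 1
    if (ipart < extra) hi = hi + 1
  end subroutine block_bounds

  ! Map a rank onto a numx-by-numy tile grid (assumed ordering: I fastest).
  pure subroutine tile_bounds(im, jm, numx, nranks, rank, ista, iend, jsta, jend)
    integer, intent(in)  :: im, jm, numx, nranks, rank
    integer, intent(out) :: ista, iend, jsta, jend
    integer :: numy
    numy = nranks / numx
    call block_bounds(im, numx, mod(rank, numx), ista, iend)
    call block_bounds(jm, numy, rank / numx, jsta, jend)
  end subroutine tile_bounds
end module decomp_sketch

program demo_decomp
  use decomp_sketch
  implicit none
  integer, parameter :: im = 10, jm = 8, nranks = 4   ! illustrative sizes
  integer :: rank, numx, ista, iend, jsta, jend
  do numx = 1, 2          ! numx=1 reproduces the latitude-only (1D) layout
    print '(a,i0)', 'numx = ', numx
    do rank = 0, nranks - 1
      call tile_bounds(im, jm, numx, nranks, rank, ista, iend, jsta, jend)
      print '(a,i0,4(a,i0))', '  rank ', rank, ': i=', ista, ':', iend, &
                              ' j=', jsta, ':', jend
    end do
  end do
end program demo_decomp
```

With numx=1 every rank owns the full I range and a band of J, which is the current behavior; with numx=2 the same four ranks each own a quarter-domain tile.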

There are many OTHER scaling issues in the post that are not affected by the decomposition. Most of the issues are orthogonal to the decomposition though and can be worked independently. The most salient is input I/O of model state fields in the standalone post.

By 03/01/2021:

GeorgeVandenberghe-NOAA commented 3 years ago

I finally have corners in the EXCH.f working. Corner exchange code looks like this

!! corner points.   After the exchanges above, corner points are replicated in
!  neighbour halos so we can get them from the neighbors rather than
!  calculating more corner neighbor numbers
! A(ista-1,jsta-1) is in the ileft     a(iend,jsta-1) location
! A(ista-1,jend+1) is in the ileft     a(iend,jend+1) location
! A(iend+1,jsta-1) is in the iright    a(ista,jsta-1) location
! A(iend+1,jend+1) is in the iright    a(ista,jend+1) location
  ibl=max(ista-1,1)
  ibu=min(im,iend+1)
  jbu=min(jm,jend+1)
  jbl=max(jsta-1,1)

  call mpi_sendrecv(a(iend,jbl   ),1,    MPI_REAL,iright,1 ,            &
&                  a(ibl   ,jbl   ),1,   MPI_REAL,ileft ,1,           &
&                  MPI_COMM_COMP,status,ierr)

  call mpi_sendrecv(a(iend,jbu   ),1,    MPI_REAL,iright,1 ,            &
&                  a(ibl   ,jbu   ),1,   MPI_REAL,ileft ,1,           &
&                  MPI_COMM_COMP,status,ierr)
  call mpi_sendrecv(a(ista,jbl   ),1,    MPI_REAL,ileft ,1,            &
&                  a(ibu   ,jbl   ),1,   MPI_REAL,iright,1,           &
&                  MPI_COMM_COMP,status,ierr)
  call mpi_sendrecv(a(ista,jbu   ),1,    MPI_REAL,ileft ,1 ,            &
&                  a(ibu   ,jbu   ),1,   MPI_REAL,iright,1,           &
&                  MPI_COMM_COMP,status,ierr)

It does work. I need to strip out a lot of debug code before publishing. I tested it by exchanging an array of coordinates and ensuring the coordinates matched I and J for all I, J from ista-1 to iend+1 and jsta-1 to jend+1, with the constraint that these had to be within the bounds 1:im, 1:jm. A second possible test would be to broadcast one full-domain 2D array, then also scatter and exchange it, and compare the exchanged results with the broadcast subdomain. These should match exactly.
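The coordinate test described above can be sketched without MPI. This is an illustrative stand-in, not the debug code in EXCH.f: the checker name, the coordinate code i + (j-1)*im and the tile bounds are all assumptions, and the "exchange" is emulated by filling the halo directly.

```fortran
module halo_check_sketch
  implicit none
contains
  ! Count points in the tile plus its one-point halo whose stored value
  ! differs from the expected coordinate code i + (j-1)*im. Points outside
  ! the global domain 1:im, 1:jm are skipped.
  pure function count_mismatches(a, ista, iend, jsta, jend, im, jm) result(nbad)
    integer, intent(in) :: ista, iend, jsta, jend, im, jm
    integer, intent(in) :: a(ista-1:iend+1, jsta-1:jend+1)
    integer :: nbad, i, j
    nbad = 0
    do j = max(jsta-1, 1), min(jend+1, jm)
      do i = max(ista-1, 1), min(iend+1, im)
        if (a(i, j) /= i + (j-1)*im) nbad = nbad + 1
      end do
    end do
  end function count_mismatches
end module halo_check_sketch

program demo_halo_check
  use halo_check_sketch
  implicit none
  integer, parameter :: im = 8, jm = 6
  ! Hypothetical tile: the lower-right quarter of a 2x2 layout.
  integer, parameter :: ista = 5, iend = 8, jsta = 1, jend = 3
  integer :: a(ista-1:iend+1, jsta-1:jend+1)
  integer :: i, j

  ! Stand-in for scatter + exchange: fill interior and halo with the codes
  ! a correct exchange would deliver. A real test would call the MPI
  ! exchange here instead.
  a = -1
  do j = max(jsta-1, 1), min(jend+1, jm)
    do i = max(ista-1, 1), min(iend+1, im)
      a(i, j) = i + (j-1)*im
    end do
  end do
  print '(a,i0)', 'mismatches after correct fill:   ', &
        count_mismatches(a, ista, iend, jsta, jend, im, jm)

  ! Corrupt one corner halo point to show the check catches it.
  a(ista-1, jend+1) = 0
  print '(a,i0)', 'mismatches after corrupt corner: ', &
        count_mismatches(a, ista, iend, jsta, jend, im, jm)
end program demo_halo_check
```

Corner halo points are exactly the ones a plain left/right plus up/down exchange can miss, which is why the check covers the full ista-1:iend+1 by jsta-1:jend+1 box.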

JesseMeng-NOAA commented 3 years ago

Great! Please let me know where to grab your files. wcoss production switched yesterday. I have my files backed up and transferred to Mars, but I don't have yours. I need your MPI_FIRST.f, CTLBLK.f, and EXCH.f. Anything else you have modified? Did you also change COLLECT?

GeorgeVandenberghe-NOAA commented 3 years ago

COLLECT is untested because I haven't gotten it called yet. I will send a tarball with the most recent versions. EXCH.f has a LOT of extra code right now to check the transfers.

I also haven't found anything calling EXCH2

it's on MARS /gpfs/dell2/pmp/gwv/post.0813.exch.tar

JesseMeng-NOAA commented 3 years ago

George's new fix works! The tests reproduced bitwise identical results including SLP for numx=1,2,4. Thanks, George!

GeorgeVandenberghe-NOAA commented 3 years ago

Well that's a good way to end a Friday.

Sometimes we win one😐

HuiyaChuang-NOAA commented 3 years ago

Totally agreed! Thank you!

Huiya

WenMeng-NOAA commented 3 years ago

The branch post_2d_decomp was just synced with upstream/develop.

HuiyaChuang-NOAA commented 3 years ago

@GeorgeVandenberghe-NOAA great news from last Friday. Will you commit your latest EXCH.f to the branch post_2d_decomp? @JesseMeng-NOAA did you commit your updated SLP subroutines?

HuiyaChuang-NOAA commented 3 years ago

thank you for doing the update

GeorgeVandenberghe-NOAA commented 3 years ago

It's easy to get the git syntax wrong. What are the command lines for doing this commit?

Also I need to remove some of the debug code and comment out the rest. I plan to change my plan for coordinate debug and use real rather than integer coordinate arrays for debugging so debug code can be placed outside of exch. I didn't do this before because of precision concerns embedding the coordinates in a single real location. I don't plan to do this until the next issue with exch if that happens.
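The precision concern can be made concrete. A 32-bit IEEE real has a 24-bit significand, so a packed coordinate code such as i + (j-1)*im is exact only up to 2**24 = 16777216. The sketch below checks whether a point round-trips; the packing formula and both grid sizes are hypothetical, chosen only to fall on either side of that limit.

```fortran
module coord_pack_sketch
  use iso_fortran_env, only: real32, int64
  implicit none
contains
  ! Pack (i,j) into one 32-bit real as i + (j-1)*im. Exact only while the
  ! code fits in the 24-bit significand of IEEE single precision.
  pure function pack_coord(i, j, im) result(code)
    integer, intent(in) :: i, j, im
    real(real32) :: code
    code = real(int(i, int64) + int(j - 1, int64) * int(im, int64), real32)
  end function pack_coord

  ! True if the packed code converts back to the same integer.
  pure function round_trips(i, j, im) result(ok)
    integer, intent(in) :: i, j, im
    logical :: ok
    integer(int64) :: exact
    exact = int(i, int64) + int(j - 1, int64) * int(im, int64)
    ok = (int(pack_coord(i, j, im), int64) == exact)
  end function round_trips
end module coord_pack_sketch

program demo_coord_pack
  use iso_fortran_env, only: real32
  use coord_pack_sketch
  implicit none
  print *, 'integers are exact in real32 up to ', 2**digits(1.0_real32)
  ! Hypothetical grid sizes on either side of the limit.
  print *, '3072 x 1536, point (3072,1536) round-trips: ', &
           round_trips(3072, 1536, 3072)
  print *, '8192 x 4096, point (8191,4096) round-trips: ', &
           round_trips(8191, 4096, 8192)
end program demo_coord_pack
```

On an IEEE-754 platform the smaller grid round-trips and the larger one does not, because the code 2**25 - 1 is odd and lies above 2**24. Keeping I and J in two separate real arrays, or using 64-bit reals, would sidestep the issue.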

JesseMeng-NOAA commented 3 years ago

I tested George's code with Wen's latest merge for numx=4 and got bitwise identical results. I will comment out most of George's print statements in EXCH and commit the changes today.

GeorgeVandenberghe-NOAA commented 3 years ago

There is a lot of extra code to do coordinate exchange testing. Next time I have a scatter, gather, or exchange issue I will use real coordinates and do the debugging in the caller. However, for now all of the code that does exchanges on INTEGERs needs to be commented out.

I am extremely tired, my son is home from school for a few days (That's great!!) and I may just take some time off this week.

JesseMeng-NOAA commented 3 years ago

After push to Wen's repo I received a message of Run failed: Build and Test. Github claimed an error in "use ifcore" in EXCH.f, Fatal Error: Cannot open module file ‘ifcore.mod’ for reading at (1): No such file or directory

But it works fine on wcoss; it compiles and runs without errors. Does anyone know the reason and the solution?

GeorgeVandenberghe-NOAA commented 3 years ago

ifcore is an Intel-specific module used to generate runtime stack traces without terminating the program. Get rid of the ifcore use and the subsequent traceback call, tracebackqq (which seems to have already been removed). Tracebackqq generates a traceback indicating the call chain above. I used it to determine who was calling exch.

Since it's Intel-specific, it can't be used in generalized released code.

Note EXCH is still exchanging coordinate arrays in a test EVERY call. My first thought was to get rid of it but on second thought, if we do more decomposition changes I'd like to go through this code and check for exact coordinate matches every time, just for now. If we do this it's important to grep for FAIL in stderr because I didn't make these errors fatal.

JesseMeng-NOAA commented 3 years ago

It's good now after removing ifcore. Thank you, George! I will continue modifying all im loops to ista:iend style. I agree to keep the debugging code until we are fully tested with 2D decomposition for the complete variable list.
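For readers unfamiliar with the loop change mentioned here, a minimal sketch follows. The function name and tile bounds are hypothetical; the point is only that a loop which used to run over 1:im must now run over the I range this rank owns.

```fortran
module tile_loop_sketch
  implicit none
contains
  ! Sum a field over the points this rank owns.
  pure function tile_sum(t, ista, iend, jsta, jend) result(s)
    integer, intent(in) :: ista, iend, jsta, jend
    real, intent(in)    :: t(ista:iend, jsta:jend)
    real :: s
    integer :: i, j
    s = 0.0
    do j = jsta, jend
      do i = ista, iend      ! was "do i = 1, im" when every rank held full rows
        s = s + t(i, j)
      end do
    end do
  end function tile_sum
end module tile_loop_sketch

program demo_tile_loop
  use tile_loop_sketch
  implicit none
  ! Hypothetical tile bounds; in the post these come from the decomposition.
  integer, parameter :: ista = 5, iend = 8, jsta = 1, jend = 3
  real :: t(ista:iend, jsta:jend)
  t = 1.0
  print '(a,f6.1)', 'sum over owned points: ', tile_sum(t, ista, iend, jsta, jend)
end program demo_tile_loop
```

With numx=1, ista:iend is simply 1:im, so loops written this way behave identically under the old 1D decomposition.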

HuiyaChuang-NOAA commented 3 years ago

Great!

HuiyaChuang-NOAA commented 3 years ago

@JesseMeng-NOAA @GeorgeVandenberghe-NOAA @WenMeng-NOAA @BoCui-NOAA This is to follow up on why exch and exch_f are identical in EXCH.f. As I explained, exch_f was for input arrays that were dimensioned (im,jm). However, now that Bo has updated UPP to dimension them (im, jsta_2l:jend_2u), let's remove the exch_f section.

@JesseMeng-NOAA please take on the task of removing exch_f section in EXCH.f but before you do that, you will need to modify 3 UPP subroutines that call exch_f to call exch instead. I checked how many UPP subroutines call exch_f and here is the inventory:

    [Hui-Ya.Chuang@m71a2 ncep_post.fd]$ grep -i "call exch_f" *.f *.F
    AVIATION.f: call exch_f(U)
    AVIATION.f: call exch_f(V)
    AVIATION.f: call exch_f(U_OLD)
    AVIATION.f: call exch_f(V_OLD)
    AVIATION.f: call exch_f(H)
    AVIATION.f: call exch_f(H_OLD)
    CALMCVG.f: CALL EXCH_F(Q1D)
    CALMCVG.f: CALL EXCH_F(VWND)
    CALMCVG.f: CALL EXCH_F(QV)
    CALMCVG.f:! CALL EXCH_F(VWND)
    CALMCVG.f: CALL EXCH_F(UWND)
    CALVOR.f: CALL EXCH_F(UWND)
    CALVOR.f: CALL EXCH_F(VWND)
    CALVOR.f: CALL EXCH_F(VWND(1,jsta_2l,l))
    CALVOR.f: CALL EXCH_F(PS)

JesseMeng-NOAA commented 3 years ago

I know. Thanks

HuiyaChuang-NOAA commented 3 years ago

@JesseMeng-NOAA Great!
I think you mentioned you've updated MDL2P.f. In your current regression test, do you have pressure level output? If not, could you add it and run the regression tests again?

JesseMeng-NOAA commented 3 years ago

Wen's test case only has hybrid level. I will try pressure level.

HuiyaChuang-NOAA commented 3 years ago

@JesseMeng-NOAA you can see https://github.com/WenMeng-NOAA/EMC_post/blob/post_2d_decomp/parm/postcntrl_gfs_f00.xml about adding isobaric level variables. Also, can you report timing when running with different numbers of tasks at Thursday's tag-up?

HuiyaChuang-NOAA commented 3 years ago

@GeorgeVandenberghe-NOAA @JesseMeng-NOAA @WenMeng-NOAA @BoCui-NOAA Summary for today's 2D decomposition tag up:

  1. George updated EXCH.f and it worked in his test
  2. Jesse tested George's new EXCH.f in his regression test that outputs model level variables and SLP, as the SLP computation calls EXCH
  3. Jesse verified that his tests reproduced all above output with numx=1,2,4
  4. Jesse added isobaric height to his tests and tests still reproduced with different values of numx
  5. Jesse reported that timings are similar with different numbers of numx
  6. George indicated he added several debug statements in exch.f which will add overhead. He suggested to keep them in until 2D decomposition work is completed. Once these debug statements are removed, we should see improvements in timing.
  7. Huiya indicated only one UPP subroutine calls EXCH2, which is SLP_NMM.f, which consists of the SLP computation for the E grid
  8. Huiya indicated all INITPOST* calls collect to gather latitude and longitude

    INITPOST_GFS.f: call collect_loc(gdlat,dummy)

Action items:

  1. Huiya will email the HWRF lead and DTC to indicate our intention to retire SLP_NMM.f after the 2D decomposition branch is merged to develop

GeorgeVandenberghe-NOAA commented 3 years ago

I need a way to get COLLECT_LOC called. In the testcase supplied by Jesse it isn't, so I can't verify it works, although of course I would NEVER make a misteak.

JesseMeng-NOAA commented 3 years ago

The testcase Wen generated uses netcdfpara input handled by INITPOST_GFS_NETCDF_PARA.f in which call collect_loc is commented out. Wen, what do you think?

HuiyaChuang-NOAA commented 3 years ago

@GeorgeVandenberghe-NOAA @JesseMeng-NOAA @BoCui-NOAA @WenMeng-NOAA actually I asked Bo to comment that section out until we're ready to test collect_loc.
Jesse please uncomment the section to help George test out his version of collect_loc

WenMeng-NOAA commented 3 years ago

The branch post_2d_decomp was synced with upstream/develop. The conflicts were solved.

GeorgeVandenberghe-NOAA commented 3 years ago

Does this mean that a clone of develop has some support for 2D decomposition?

On the COLLECT_LOC testing issue I can test it with a call to gather the coordinate array, inverting my scatter test. If this works it's going to work for other arrays.
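A serial sketch of that inverted test: each "rank" fills its tile with coordinate codes, the tiles are placed into a full-domain array, and every point is checked. The routine names, the 2x2 tile layout and the coordinate code are assumptions for illustration; a real test would call COLLECT_LOC where place_tile is called here.

```fortran
module gather_check_sketch
  implicit none
contains
  ! Emulate the receiving side of a gather: copy one tile of coordinate
  ! codes into the full-domain array at its global position.
  pure subroutine place_tile(full, tile, ista, iend, jsta, jend)
    integer, intent(inout) :: full(:, :)
    integer, intent(in)    :: ista, iend, jsta, jend
    integer, intent(in)    :: tile(ista:iend, jsta:jend)
    full(ista:iend, jsta:jend) = tile
  end subroutine place_tile

  ! Number of full-domain points that do not hold their own coordinate code.
  pure function count_bad(full) result(nbad)
    integer, intent(in) :: full(:, :)
    integer :: nbad, i, j, im
    im = size(full, 1)
    nbad = 0
    do j = 1, size(full, 2)
      do i = 1, im
        if (full(i, j) /= i + (j-1)*im) nbad = nbad + 1
      end do
    end do
  end function count_bad
end module gather_check_sketch

program demo_gather_check
  use gather_check_sketch
  implicit none
  integer, parameter :: im = 6, jm = 4
  ! Hypothetical 2x2 tile layout: bounds for each of four "ranks".
  integer, parameter :: ista(4) = [1, 4, 1, 4], iend(4) = [3, 6, 3, 6]
  integer, parameter :: jsta(4) = [1, 1, 3, 3], jend(4) = [2, 2, 4, 4]
  integer :: full(im, jm), r, i, j
  integer, allocatable :: tile(:, :)

  full = -1                 ! sentinel: a point left at -1 was never gathered
  do r = 1, 4
    allocate(tile(ista(r):iend(r), jsta(r):jend(r)))
    do j = jsta(r), jend(r)
      do i = ista(r), iend(r)
        tile(i, j) = i + (j-1)*im   ! each rank fills in its own coordinates
      end do
    end do
    call place_tile(full, tile, ista(r), iend(r), jsta(r), jend(r))
    deallocate(tile)
  end do
  print '(a,i0)', 'bad points after gather: ', count_bad(full)
end program demo_gather_check
```

The -1 sentinel catches points that no rank delivered, and the coordinate code catches points delivered to the wrong place.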

grib2_module.f uses a different array to gather with possibly different shape on the remote ranks. The shape of the gathered object is the same, a full domain array. It's just bookkeeping but new and additional because of the different remote side shape.

WenMeng-NOAA commented 3 years ago

@GeorgeVandenberghe-NOAA No, my updates sync the latest changes from the develop branch of the authoritative UPP repository into the 2D decomposition development branch.

HuiyaChuang-NOAA commented 3 years ago

Huiya emailed Zhang and Kate from DTC indicating our plan to retire SLP_NMM.f, as it is only used for HWRF operationally and HWRF will be replaced by HAFS. Additionally, HWRF maintains its own branch.

  1. Zhang indicated the current plan is to replace HWRF with HAFS in 2023. He expressed concerns for HWRF UPP support prior to that but we assured him the UPP HWRF branch will not be changed unless necessary.
  2. Kate was fine with the plan.

HuiyaChuang-NOAA commented 3 years ago

@JesseMeng-NOAA @BoCui-NOAA @WenMeng-NOAA @GeorgeVandenberghe-NOAA Per discussion from tag up on this project yesterday, we will plan on retiring SLP_NMM.f and EXCH2.f in this branch.

HuiyaChuang-NOAA commented 3 years ago

@JesseMeng-NOAA Jesse, please remove the following list of UPP subroutines from 2D decomposition branch:

  1. SLP_NMM.f
  2. EXCH2.f
  3. INITPOST_SIGIO.f
  4. INITPOST_GFS_NEMS.f

Thank you.

WenMeng-NOAA commented 3 years ago

@HuiyaChuang-NOAA I could remove the INITPOST_SIGIO.f and INITPOST_GFS_NEMS.f in UPP develop branch before 2D decomposition branch merging.

GeorgeVandenberghe-NOAA commented 3 years ago

Are these still used for older applications that use GFS NEMSIO and even older SIGIO files? Or did the developers fork their own versions long ago as the NAM people did?

WenMeng-NOAA commented 3 years ago

These are legacy read interfaces which are not used in operational applications any more. We still keep the interface for NAM.

HuiyaChuang-NOAA commented 3 years ago

@WenMeng-NOAA Thank you, that would be great. @GeorgeVandenberghe-NOAA you're right that these two subroutines are read interfaces for old GFS sigio and serial nemsio output. However, we're still keeping INITPOST_GFS_NEMS_MPIIO.f, which is what was used operationally to read nemsio output in parallel.

JesseMeng-NOAA commented 3 years ago

Somehow I cannot get rid of INITPOST_GFS_NEMS. Does not compile without it. I have commented out the call from WRFPOST and doublechecked that no other subroutines call INITPOST_GFS_NEMS. Just let it sit there for now until we know the reason.

hertneky commented 3 years ago

@JesseMeng-NOAA I ran across this issue when doing a bit of Grib1 cleanup of RQSTFLDS.f. I have been able to remove INITPOST_GFS_NEMS.f and build successfully with some additional changes to WRFPOST.f, and thought I would pass this information along. I changed WRFPOST.f to use 'nemsio_module_mpi' instead of 'nemsio_module', since otherwise there was a build error related to INITPOST_GFS_NEMS.f being removed from the CMakeLists. You can see the changes via https://github.com/hertneky/UPP/tree/th_grib1_cleanup

Note: this is work that is part of issue #344

JesseMeng-NOAA commented 3 years ago

@hertneky Thanks for your help. However, I tried the same thing you did for WRFPOST but still have the link errors as

    /gpfs/dell2/usrx/local/nceplibs/dev/hpc-stack/libs/hpc-stack/ips-18.0.1.163/impi-18.0.1/nemsio/2.5.2/lib/libnemsio.a(nemsio_read.f90.o): In function `nemsio_read_mp_nemsio_readrecgrb8_':
    /usrx/local/nceplibs/dev/hpc-stack/src/hpc-stack/pkg/nemsio-v2.5.2/src/nemsio_read.f90:979: undefined reference to `getgbm_'
    /gpfs/dell2/usrx/local/nceplibs/dev/hpc-stack/libs/hpc-stack/ips-18.0.1.163/impi-18.0.1/nemsio/2.5.2/lib/libnemsio.a(nemsio_read.f90.o): In function `nemsio_read_mp_nemsio_readrecvgrb8_':
    /usrx/local/nceplibs/dev/hpc-stack/src/hpc-stack/pkg/nemsio-v2.5.2/src/nemsio_read.f90:1032: undefined reference to `getgbm_'
    /gpfs/dell2/usrx/local/nceplibs/dev/hpc-stack/libs/hpc-stack/ips-18.0.1.163/impi-18.0.1/nemsio/2.5.2/lib/libnemsio.a(nemsio_read.f90.o): In function `nemsio_read_mp_nemsio_readrecgrb4w34_':
    /usrx/local/nceplibs/dev/hpc-stack/src/hpc-stack/pkg/nemsio-v2.5.2/src/nemsio_read.f90:870: undefined reference to `getgbm_'
    /gpfs/dell2/usrx/local/nceplibs/dev/hpc-stack/libs/hpc-stack/ips-18.0.1.163/impi-18.0.1/nemsio/2.5.2/lib/libnemsio.a(nemsio_read.f90.o): In function `nemsio_read_mp_nemsio_readrecvgrb4w34_':
    /usrx/local/nceplibs/dev/hpc-stack/src/hpc-stack/pkg/nemsio-v2.5.2/src/nemsio_read.f90:924: undefined reference to `getgbm_'

I am testing on wcoss dell phase 3. What machine and what version of libnemsio.a do you use?

WenMeng-NOAA commented 3 years ago

@JesseMeng-NOAA and @HuiyaChuang-NOAA I would suggest removing legacy INITPOST_SIGIO.f and INITPOST_GFS_NEMS.f in Tracy's PR which would be merged in develop branch before 2D decomposition PR. Please let me know your thoughts.

JesseMeng-NOAA commented 3 years ago

Sounds good.

hertneky commented 3 years ago

@JesseMeng-NOAA I was testing on NCAR's Cheyenne initially. The nemsio library used was version 2.5.2. I will test on Hera as well.

WenMeng-NOAA commented 3 years ago

@JesseMeng-NOAA Could you back up the changes for removing these two routines in branch post_2d_decomp ? Thanks!

JesseMeng-NOAA commented 3 years ago

INITPOST_GFS_SIGIO.f was removed. I am keeping INITPOST_GFS_NEMS.f, but commented out all calls, until we have a solution to compile without it.

GeorgeVandenberghe-NOAA commented 3 years ago

Turns out the post doesn't have a generalized namelist read. The closest analog is itag. I could add a line in itag but then versions that don't yet support 2D would break. I could try to read it in itag and branch around a failure. Or I could put it in the environment or another file and if the file weren't there, assume default numx=1. Considering options.

Wen's 2D post currently passes all of the use case regression tests on Jet, where I first checked, including the ones where we are not considering 2D enhancement.

GeorgeVandenberghe-NOAA commented 3 years ago

I have modified the numx setting in the 2D post so that it is done in WRFPOST.f, with the value of numx read from a line in itag (numx=4, for example). This would be the first line in itag. If it is absent, the value DEFAULTS to 1, so current itags can still be used and cause the code to fall back to a 1D decomposition.

It requires disabling the hardwire in MPI_FIRST.f, adding it to CTLBLK.mod and setting it with a read or default value in WRFPOST.f

This all passes regression tests on Jet (which basically test the fallback to 1D in the old itags)

JesseMeng-NOAA commented 3 years ago

Great! Please let me know where to grab your files. Thanks!

GeorgeVandenberghe-NOAA commented 3 years ago

/gpfs/dell3/ptmp/gwv/post.runtime2d.tar

To actually use it place numx=$something in the FIRST line of itag. WRFPOST.f checks for this and backspaces itag to the first line if it isn't present so the rest of the itag input is unaffected
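The read-then-backspace behavior described above can be sketched as follows. This is not the WRFPOST.f code: the routine name, the parsing details and the sample itag contents are assumptions; only the idea (an optional numx= first line, default 1, and a backspace when it is absent) comes from this comment.

```fortran
module itag_numx_sketch
  implicit none
contains
  ! Try to read "numx=<n>" from the current line of an open unit.
  ! If the line is something else, back up one record so the caller's
  ! normal itag reads still see it, and fall back to numx = 1.
  subroutine read_numx(lun, numx)
    integer, intent(in)  :: lun
    integer, intent(out) :: numx
    character(len=256) :: line
    integer :: ios, ieq

    numx = 1
    read(lun, '(a)', iostat=ios) line
    if (ios /= 0) return              ! empty or unreadable: keep the default
    line = adjustl(line)
    ieq = index(line, '=')
    if (ieq > 1) then
      if (line(1:ieq-1) == 'numx') then
        read(line(ieq+1:), *, iostat=ios) numx
        if (ios == 0 .and. numx >= 1) return   ! line consumed, value accepted
        numx = 1                               ! unparsable value: use default
      end if
    end if
    backspace(lun)                    ! not a usable numx line: un-read it
  end subroutine read_numx
end module itag_numx_sketch

program demo_itag
  use itag_numx_sketch
  implicit none
  integer :: lun, numx
  character(len=64) :: next

  ! Case 1: itag with numx on the first line.
  open(newunit=lun, status='scratch', action='readwrite')
  write(lun, '(a)') 'numx=4'
  write(lun, '(a)') 'model_output.nc'    ! hypothetical remaining itag content
  rewind(lun)
  call read_numx(lun, numx)
  read(lun, '(a)') next
  print '(a,i0,2a)', 'numx=', numx, '  next line: ', trim(next)
  close(lun)

  ! Case 2: existing itag with no numx line.
  open(newunit=lun, status='scratch', action='readwrite')
  write(lun, '(a)') 'model_output.nc'
  rewind(lun)
  call read_numx(lun, numx)
  read(lun, '(a)') next
  print '(a,i0,2a)', 'numx=', numx, '  next line: ', trim(next)
  close(lun)
end program demo_itag
```

In both cases the line following the optional numx entry is read intact, which is what lets existing itag files keep working unchanged.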

JesseMeng-NOAA commented 3 years ago

Thanks George! It works great! numx is read from itag, with default =1 if not present in itag. Both cases generate identical results in my 2d test.

HuiyaChuang-NOAA commented 3 years ago

@GeorgeVandenberghe-NOAA @JesseMeng-NOAA Sorry I meant to reply earlier but was attending DTC SAB meetings. UPP does read in a namelist "nampgb":

    namelist/nampgb/kpo,po,kth,th,kpv,pv,fileNameAER,d3d_on,gocart_on,popascal &
    ,hyb_sigp,rdaod,aqfcmaq_on

I think this would be an ideal place to add numx. I like George's idea of setting numx=1 as default. If you look at WRFPOST.f, you can see UPP sets defaults for all namelist variables too.
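A minimal sketch of the namelist route, assuming numx were added to nampgb. Only two members are shown (kpo, taken here to be an integer, and the proposed numx); the routine name, the placeholder default for kpo and the sample namelist text are hypothetical.

```fortran
module nml_numx_sketch
  implicit none
contains
  ! Read a cut-down stand-in for the nampgb namelist from an open unit.
  ! Defaults are assigned first, so variables absent from the file keep them.
  subroutine read_nampgb_sketch(lun, kpo_out, numx_out, ios)
    integer, intent(in)  :: lun
    integer, intent(out) :: kpo_out, numx_out, ios
    integer :: kpo, numx
    namelist /nampgb/ kpo, numx   ! numx is the proposed addition

    kpo  = 0                      ! placeholder default for this sketch
    numx = 1                      ! default: fall back to 1D decomposition
    read(lun, nml=nampgb, iostat=ios)
    kpo_out  = kpo
    numx_out = numx
  end subroutine read_nampgb_sketch
end module nml_numx_sketch

program demo_nampgb
  use nml_numx_sketch
  implicit none
  integer :: lun, kpo, numx, ios

  ! Existing-style namelist: numx is absent, so the default survives.
  open(newunit=lun, status='scratch', action='readwrite')
  write(lun, '(a)') ' &nampgb kpo=3 /'
  rewind(lun)
  call read_nampgb_sketch(lun, kpo, numx, ios)
  print '(3(a,i0))', 'ios=', ios, ' kpo=', kpo, ' numx=', numx
  close(lun)

  ! Namelist that requests a 2D decomposition.
  open(newunit=lun, status='scratch', action='readwrite')
  write(lun, '(a)') ' &nampgb kpo=3, numx=4 /'
  rewind(lun)
  call read_nampgb_sketch(lun, kpo, numx, ios)
  print '(3(a,i0))', 'ios=', ios, ' kpo=', kpo, ' numx=', numx
  close(lun)
end program demo_nampgb
```

One trade-off compared with the first-line approach: an executable that does not know about numx would reject a namelist containing it, so old binaries could not read new input files, whereas old input files remain valid for new binaries.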

fossell commented 3 years ago

@HuiyaChuang-NOAA @WenMeng-NOAA @GeorgeVandenberghe-NOAA @JesseMeng-NOAA - I also meant to comment. DTC's @kayeekayee is working on some development to translate all info in the itag to a formal Fortran namelist (Issue #115) and has the work largely done and tested with the current develop branch. Would it be of interest to review and push this feature enhancement to help the 2D decomposition work?