oceanmodeling / ufs-weather-model

This repo is forked from ufs-weather-model, and contains the model code and external links needed to build the UFS coastal model executable and model components, including the ROMS, FVCOM, ADCIRC and SCHISM plus WaveWatch III model components.
https://github.com/oceanmodeling/ufs-coastal-app

Go through UFS-CAT tutorial steps for running SCHISM #75

Closed janahaddad closed 1 month ago

janahaddad commented 2 months ago

https://drive.google.com/drive/folders/1uQt3x8_O2dV7g9nD9y5bSuSRHoPOqsEb?hl=en UFS-CAT tutorial docs

Armaghan-NOAA commented 2 months ago

@uturuncoglu and/or @pvelissariou1 I do not have access permission to the folder below:

/work2/noaa/nems/tufuk/RT which I believe is needed for running rt according to:

(https://drive.google.com/drive/folders/1uQt3x8_O2dV7g9nD9y5bSuSRHoPOqsEb?hl=en)

Could you please open the permission? (I am not part of nems project though so maybe you can copy the needed folder (RT) to any other projects that I am a part of): noaa-hpc nosofs nos-surge

for this specific example I am using this path: /work2/noaa/nos-surge/aabed/rt

pvelissariou1 commented 2 months ago

@Armaghan-NOAA Armaghan, try /work/noaa/nems/tufuk/RT instead and see if that works for you. Edit the rt.sh script and change the line DISKNM=/work/noaa/nosofs/${USER}/RT to DISKNM=/work/noaa/nems/tufuk/RT. To find the DISKNM= lines for these two platforms, search rt.sh for "orion" and "hercules".
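
The edit described in this comment could be scripted roughly as follows. This is a minimal sketch that works on a scratch file with made-up contents (the two-line stand-in is an assumption, not the real rt.sh); only the two DISKNM paths come from the comment.

```shell
#!/usr/bin/env bash
# Sketch: swap the DISKNM baseline path in an rt.sh-like file.
# The stand-in file below is hypothetical; the real rt.sh is much longer.
demo_file="$(mktemp)"
cat > "${demo_file}" <<'EOF'
  hercules)
    DISKNM=/work/noaa/nosofs/${USER}/RT
EOF

# Show the DISKNM= lines first, with line numbers.
grep -n 'DISKNM=' "${demo_file}"

# Replace the old path with the shared one. '|' is the sed delimiter
# because the paths contain '/'; a .bak backup of the file is kept.
sed -i.bak 's|DISKNM=/work/noaa/nosofs/${USER}/RT|DISKNM=/work/noaa/nems/tufuk/RT|' "${demo_file}"

new_line="$(grep 'DISKNM=' "${demo_file}")"
echo "${new_line}"
```

On the cluster the same grep and sed calls would be pointed at tests/rt.sh itself, after checking the grep output to confirm which platform block each DISKNM= line belongs to.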

Armaghan-NOAA commented 1 month ago

@pvelissariou1 @uturuncoglu I got this error:

(base) [armaghan@hercules-login-3 tests]$ ./rt.sh -l rt_coastal.conf -a nems -k -n coastal_ike_shinnecock_atm2sch intel
**Regression Testing Script Started**
hercules-login-3.hpc.msstate.edu
The -n option needs [testname] AND [compiler] in quotes, i.e. -n "control_p8 intel"
rt.sh finished
rt.sh: Cleaning up...
rt.sh: Exiting.

Please let me know how to solve this on Hercules. Also, following the recorded training, I do not have any modules loaded yet. Is that not needed?

uturuncoglu commented 1 month ago

@Armaghan-NOAA Please try the following: ./rt.sh -l rt_coastal.conf -a nems -k -n "coastal_ike_shinnecock_atm2sch intel". The way the regression test is run changed slightly with the recent sync. Please document all issues so that we can update the app-level documentation. We also have issues with -l when used together with -n. Let me sync the model again to get the fix from the ufs-weather-model level.
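
The reason the quotes matter is that the shell splits unquoted words before rt.sh ever sees them, so without quotes the compiler name arrives as a separate argument. The helper count_args below is hypothetical and only illustrates the splitting; it is not part of rt.sh.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
  echo "$#"
}

# Unquoted: -n, the test name, and the compiler are three separate arguments.
unquoted=$(count_args -n coastal_ike_shinnecock_atm2sch intel)

# Quoted: test name and compiler travel together as one argument after -n.
quoted=$(count_args -n "coastal_ike_shinnecock_atm2sch intel")

echo "unquoted: ${unquoted} arguments, quoted: ${quoted} arguments"
```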

pvelissariou1 commented 1 month ago

@uturuncoglu She doesn't have a nems account, so most likely: ./rt.sh -l rt_coastal.conf -a coast (or coastal) -k -n "coastal_ike_shinnecock_atm2sch intel"

uturuncoglu commented 1 month ago

@pvelissariou1 Yes, correct. Sorry for the confusion. I will also sync the model again and push the recent changes soon. That will fix the issue with rt.sh.

pvelissariou1 commented 1 month ago

@uturuncoglu Thanks, after the sync I will check on my side as well.

uturuncoglu commented 1 month ago

@pvelissariou1 BTW, it seems there is also a minor typo issue on ufs-weather-model develop - https://github.com/ufs-community/ufs-weather-model/blob/04bbc15f9abfb25a8864cdeaebdbd439c4332c95/tests/rt.sh#L809 but since we are modifying the folder, it is fine. I am trying to add it to my land PR at the UFS level.

Armaghan-NOAA commented 1 month ago

@pvelissariou1 @uturuncoglu Just to let you know, I also tried coastal and coast; neither worked:

(base) [armaghan@hercules-login-3 tests]$ ./rt.sh -l rt_coastal.conf -a coastal -k -n coastal_ike_shinnecock_atm2sch intel
**Regression Testing Script Started**
hercules-login-3.hpc.msstate.edu
The -n option needs [testname] AND [compiler] in quotes, i.e. -n "control_p8 intel"
rt.sh finished
rt.sh: Cleaning up...
rt.sh: Exiting.

(base) [armaghan@hercules-login-3 tests]$ ./rt.sh -l rt_coastal.conf -a coast -k -n coastal_ike_shinnecock_atm2sch intel
**Regression Testing Script Started**
hercules-login-3.hpc.msstate.edu
The -n option needs [testname] AND [compiler] in quotes, i.e. -n "control_p8 intel"
rt.sh finished
rt.sh: Cleaning up...
rt.sh: Exiting.

Please let me know whenever you think the system is ready for me to try again. Thank you.

@pvelissariou1 @uturuncoglu, according to the following command, I am a member of these projects:

(base) [armaghan@hercules-login-3 tests]$ groups
noaa-hpc nosofs nos-surge

pvelissariou1 commented 1 month ago

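@Armaghan-NOAA You should have access to at least one of the nos-surge or coast accounts. To check which accounts you have access to, run module load contrib noaatools and then saccount_params from the command line. The last command will show you the list of the accounts you are in.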

Armaghan-NOAA commented 1 month ago

@Armaghan-NOAA You should have access to at least one of nos-surge, coast accounts. To check what accounts you have access to, do:

  • module load contrib noaatools
  • run from the command line: saccount_params

The last command will show you the list of the accounts you are in.

Here is what I get:

(base) [armaghan@hercules-login-3 tests]$ module load contrib noaatools
(base) [armaghan@hercules-login-3 tests]$ saccount_params
Account Params -- Information regarding project associations for armaghan
    Home Quota (/home/armaghan) Used: 642 MB Quota: 8192 MB Grace: 10240

    Project: noaatest
            Directory: /work/noaa/noaatest DiskInUse=0 GB, Quota=0 GB, Files=0, FileQUota=0
            Directory: /work2/noaa/noaatest DiskInUse=0 GB, Quota=0 GB, Files=0, FileQUota=0

    Project: nos-surge
            FairShare=0.053 (47/53)
            Partition Access: ALL
            Available QOSes: batch,debug,novel,ood,urgent,windfall

            Directory: /work2/noaa/nos-surge DiskInUse=78870 GB, Quota=95000 GB, Files=10256451, FileQUota=0

    Project: nosofs
            FairShare=0.630 (39/53)
            Partition Access: ALL
            Available QOSes: batch,debug,novel,ood,urgent,windfall

            Directory: /work/noaa/nosofs DiskInUse=45385 GB, Quota=47500 GB, Files=7021336, FileQUota=0
            Directory: /work2/noaa/nosofs DiskInUse=5199 GB, Quota=71250 GB, Files=30614, FileQUota=0

Note: for an explanation of the meaning of these values and general scheduling information see: https://rdhpcs-common-docs.rdhpcs.noaa.gov/wiki/index.php/SLURM_Fair-share
Note: the parenthetical values after project fairshare indiciate the rank of the project with respect to all other allocated projects. If the first number is lower, your project is likely to have higher priority than other projects. (Of course, other factors weigh in to scheduling.)

@pvelissariou1 so I see noaatest, nos-surge, and nosofs
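
For reference, the project names can be pulled out of this kind of listing with a short filter. This is a sketch only: list_projects is a hypothetical helper, and sample_output reproduces a few lines from the listing above so the example runs without the cluster tools.

```shell
#!/usr/bin/env bash
# Sketch: extract project names from saccount_params-style output.
# list_projects is a hypothetical helper; on the cluster one would run:
#   saccount_params | list_projects
list_projects() {
  awk '$1 == "Project:" { print $2 }'
}

sample_output='    Project: noaatest
            Directory: /work/noaa/noaatest DiskInUse=0 GB, Quota=0 GB, Files=0, FileQUota=0
    Project: nos-surge
            FairShare=0.053 (47/53)
    Project: nosofs
            FairShare=0.630 (39/53)'

# Join the names on one line for display.
projects=$(printf '%s\n' "${sample_output}" | list_projects | paste -s -d ' ' -)
echo "Projects listed: ${projects}"
```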

pvelissariou1 commented 1 month ago

Ok, try to use the nos-surge account

uturuncoglu commented 1 month ago

@Armaghan-NOAA BTW, I think you are still not putting the test name together with the compiler in quotes. Also, I have not pushed the sync (incl. the fix related to rt.sh) yet. I am still waiting for the tests on Orion to finish.

Armaghan-NOAA commented 1 month ago

Got the same error:

(base) [armaghan@hercules-login-3 tests]$ ./rt.sh -l rt_coastal.conf -a nos-surge -k -n coastal_ike_shinnecock_atm2sch intel
**Regression Testing Script Started**
hercules-login-3.hpc.msstate.edu
The -n option needs [testname] AND [compiler] in quotes, i.e. -n "control_p8 intel"
rt.sh finished
rt.sh: Cleaning up...
rt.sh: Exiting.

Is the command I am using correct?

pvelissariou1 commented 1 month ago

Try: ./rt.sh -l rt_coastal.conf -a nos-surge -k -n "coastal_ike_shinnecock_atm2sch intel"

uturuncoglu commented 1 month ago

@Armaghan-NOAA @pvelissariou1 Again, this will not work until I push the sync. Until then, it will run the first job in the file.

uturuncoglu commented 1 month ago

@Armaghan-NOAA @pvelissariou1 Okay. I have just synced again. So, we have the fix in ufs-coastal too.

Armaghan-NOAA commented 1 month ago

@pvelissariou1 @uturuncoglu it seems this time it runs but I am getting two other errors (codes 127 & 843) shown below. Could you help?

(base) [armaghan@hercules-login-3 tests]$ ./rt.sh -l rt_coastal.conf -a nos-surge -k -n "coastal_ike_shinnecock_atm2sch intel"
**Regression Testing Script Started**
hercules-login-3.hpc.msstate.edu
Machine: hercules
Account: nos-surge
rt.sh: Setting up hercules...

pvelissariou1 commented 1 month ago

@Armaghan-NOAA What is this DISKNM = DISKNM=/work/noaa/nems/tufuk/RT that I see in your log? Just replace DISKNM=/work/noaa/epic/hercules/UFS-WM_RT with DISKNM=/work/noaa/nems/tufuk/RT. Have you done that? Have you cloned the updated ufs-coastal? Do you want to set up a meeting Thursday after 11:00 CTD?

Armaghan-NOAA commented 1 month ago

Thanks for pointing that out. I am now getting this error:

ERROR: STMP: /work2/noaa/stmp/armaghan/stmp -- DOES NOT EXIST

Could you guide me? @pvelissariou1 @uturuncoglu

uturuncoglu commented 1 month ago

I think you set DISKNM incorrectly. In the log it appears as DISKNM = DISKNM=/work/noaa/nems/tufuk/RT, so the line in rt.sh needs to be exactly DISKNM=/work/noaa/nems/tufuk/RT.
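
One plausible way to end up with that log line is a doubled DISKNM= in the assignment. This is an assumption, since the thread does not show the edited file; the sketch below only demonstrates how the shell treats such a line.

```shell
#!/usr/bin/env bash
# Sketch: how a doubled "DISKNM=" ends up inside the value (assumed cause).
# Wrong: everything after the first '=' becomes the value,
# including the second "DISKNM=".
DISKNM=DISKNM=/work/noaa/nems/tufuk/RT
wrong_value="${DISKNM}"
echo "DISKNM = ${wrong_value}"

# Right: a single assignment with no spaces around '='.
DISKNM=/work/noaa/nems/tufuk/RT
right_value="${DISKNM}"
echo "DISKNM = ${right_value}"
```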

pvelissariou1 commented 1 month ago

@Armaghan-NOAA You had better change it to /work2/noaa/armaghan/stmp. In your rt.sh file, go to the hercules block and change dprefix=/work2/noaa/stmp/${USER} to dprefix=/work2/noaa/${USER}/stmp

uturuncoglu commented 1 month ago

@Armaghan-NOAA You could create that directory by hand. Just run: mkdir -p /work2/noaa/stmp/armaghan/stmp

Armaghan-NOAA commented 1 month ago

@uturuncoglu I do not have permission to make a folder there:

(base) [armaghan@hercules-login-3 stmp]$ pwd
/work2/noaa/stmp
(base) [armaghan@hercules-login-3 stmp]$ mkdir armaghan
mkdir: cannot create directory ‘armaghan’: Permission denied

Armaghan-NOAA commented 1 month ago

@pvelissariou1 got this error: ERROR: STMP: /work2/noaa/armaghan/stmp/stmp -- DOES NOT EXIST

pvelissariou1 commented 1 month ago

@Armaghan-NOAA As the error says, you need to create the parent directory manually. Anyway, I checked hercules and orion to find out where your user directory in /work and /work2 is. You should have an armaghan (this is your user name) folder in the /work and /work2 filesystems on hercules/orion:

/work/noaa/nosofs : no_user_folder_found, need to create one
/work2/noaa/nos-surge : found aabed, need to rename it to: armaghan
/work2/noaa/nosofs : no_user_folder_found, need to create one

commands:
mkdir -p /work/noaa/nosofs/${USER}
mkdir -p /work2/noaa/nosofs/${USER}
mv /work2/noaa/nos-surge/aabed /work2/noaa/nos-surge/${USER}

Then adjust the dprefix variable for orion/hercules in your rt.sh accordingly, as:
dprefix=/work2/noaa/nos-surge/${USER} (the stmp folder will be created for you)
OR, if you prefer:
dprefix=/work2/noaa/nosofs/${USER}

NOTE: /work2/noaa/nos-surge has about 15 TB disk space left and /work2/noaa/nosofs has about 65 TB disk space left

Armaghan-NOAA commented 1 month ago

@pvelissariou1 I am getting this error now:

ERROR: STMP: /work2/noaa/armaghan/stmp/stmp -- DOES NOT EXIST

Why is it trying to access an armaghan folder directly under /work2/noaa when I specified that it is under nos-surge?

pvelissariou1 commented 1 month ago

@Armaghan-NOAA You are setting the variables in your rt.sh incorrectly. All paths are like /work2/noaa/nos-surge/armaghan or /work2/noaa/nosofs/armaghan. Set the dprefix variable in your rt.sh like this: dprefix=/work2/noaa/nos-surge/${USER}. If it somehow still complains that it cannot find the stmp directory, then create it manually: mkdir -p /work2/noaa/nos-surge/${USER}/stmp
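
The directory check and manual creation described here can be wrapped in a small function. This is a sketch: ensure_stmp is a hypothetical helper, not part of rt.sh, and the demo runs against a scratch directory rather than the real /work2 filesystem.

```shell
#!/usr/bin/env bash
# Sketch: make sure the stmp directory under dprefix exists before running rt.sh.
# ensure_stmp is a hypothetical helper, not part of rt.sh.
ensure_stmp() {
  local dprefix="$1"
  local stmp="${dprefix}/stmp"
  if [ ! -d "${stmp}" ]; then
    # -p also creates missing parent directories.
    mkdir -p "${stmp}" || return 1
  fi
  echo "${stmp}"
}

# Demo against a scratch directory; on Hercules the argument would be
# /work2/noaa/nos-surge/${USER}, as suggested in the comment above.
scratch="$(mktemp -d)"
stmp_dir="$(ensure_stmp "${scratch}/nos-surge/demo_user")"
echo "STMP is ${stmp_dir}"
```

The function fails with a non-zero status when mkdir is denied, which is the situation hit earlier under /work2/noaa/stmp, so a project directory the user can write to is still required.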

Armaghan-NOAA commented 1 month ago

@pvelissariou1 I was able to fix the issue and create the needed path. Here is the new error:

find: ‘/work/noaa/nems/tufuk/RT/NEMSfv3gfs/develop-20240417/’: No such file or directory
rt.sh: Getting error information...

pvelissariou1 commented 1 month ago

@Armaghan-NOAA , @janahaddad If you want we can have a meeting tomorrow (after 11:00pm CTD) to go through this and other items in UFS-Coastal

Armaghan-NOAA commented 1 month ago

@pvelissariou1 that would be great if we can meet after 11 tomorrow.

janahaddad commented 1 month ago

@pvelissariou1 @Armaghan-NOAA may I suggest we review at Monday's UFS-Coastal tag-up, if Takis is able to join?

pvelissariou1 commented 1 month ago

I will join Monday as well as tomorrow

janahaddad commented 1 month ago

Ok, thanks Takis

janahaddad commented 1 month ago

Per our meeting earlier, this now works for me on Hera using the -c option to create a new baseline.

A couple of notes. Incorrect flag order (-c placed between -n and its quoted argument):

(base) [Jana.Haddad@hfe10 tests]$ ./rt.sh -l rt_coastal.conf -a coastal -k -n -c "coastal_ike_shinnecock_atm2sch intel"
******Regression Testing Script Started******
hfe10
The -n option needs [testname] AND [compiler] in quotes, i.e. -n "control_p8 intel"
rt.sh finished
rt.sh: Cleaning up...
rt.sh: Exiting.

correct flag order:

(base) [Jana.Haddad@hfe10 tests]$ ./rt.sh -l rt_coastal.conf -a coastal -c -k -n "coastal_ike_shinnecock_atm2sch intel"
******Regression Testing Script Started******
hfe10
Machine: hera
Account: coastal
rt.sh: Setting up hera...
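
The flag-order behavior is what getopts-style parsing produces when -n takes an argument. The sketch below assumes rt.sh follows those rules; parse_n and its option string are hypothetical and not copied from rt.sh.

```shell
#!/usr/bin/env bash
# Sketch of getopts parsing where -n takes an argument and -c, -k are switches.
# parse_n is hypothetical; the option string is an assumption.
parse_n() {
  local opt OPTIND=1 OPTARG n_value=""
  while getopts ":a:l:ckn:" opt; do
    case "${opt}" in
      n) n_value="${OPTARG}" ;;
      *) ;;  # other options are ignored in this sketch
    esac
  done
  echo "${n_value}"
}

# -n directly followed by -c: "-c" is consumed as the argument of -n.
wrong_order=$(parse_n -l rt_coastal.conf -a coastal -k -n -c "coastal_ike_shinnecock_atm2sch intel")

# -c before -n: -n receives the quoted "testname compiler" pair.
right_order=$(parse_n -l rt_coastal.conf -a coastal -c -k -n "coastal_ike_shinnecock_atm2sch intel")

echo "wrong order, -n got: '${wrong_order}'"
echo "right order, -n got: '${right_order}'"
```

Under that assumption, the safest habit is to keep -n last, immediately followed by its quoted argument.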

Armaghan-NOAA commented 1 month ago

I also ran the UFS-CAT on Hercules and have the outputs. The next step would be merging the outputs. @pvelissariou1 @janahaddad could you explain more about what merging the outputs does and how it helps?

pvelissariou1 commented 1 month ago

@Armaghan-NOAA The program to use for merging the outputs is combine_output11. Before running it, make sure that you load the same modules you loaded when compiling ufs-coastal. Example: module use YOUR_UFS_COASTAL_DIR/modulefiles and then module load ufs_hercules.intel. Run combine_output11 -h to see the available options. Change to the directory where the outputs directory is located (not inside outputs); combine_output11 searches for an outputs directory. Then run the program to generate the combined output files.
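
The steps in this comment can be laid out as a dry run. This is a sketch under stated assumptions: has_outputs_dir and print_merge_steps are hypothetical helpers, YOUR_UFS_COASTAL_DIR remains a placeholder, and the module and combine_output11 commands are only printed because they exist only on the HPC system.

```shell
#!/usr/bin/env bash
# Sketch: check the working directory, then print the merge steps (dry run).
has_outputs_dir() {
  # combine_output11 looks for ./outputs, so it is run from the parent directory.
  [ -d "$1/outputs" ]
}

print_merge_steps() {
  local run_dir="$1"
  if ! has_outputs_dir "${run_dir}"; then
    echo "no outputs/ directory under ${run_dir}" >&2
    return 1
  fi
  echo "module use YOUR_UFS_COASTAL_DIR/modulefiles"
  echo "module load ufs_hercules.intel"
  echo "cd ${run_dir}"
  echo "combine_output11 -h"
}

# Demo with a scratch directory standing in for the regression-test run directory.
RUN_DIR="$(mktemp -d)"
mkdir -p "${RUN_DIR}/outputs"
steps="$(print_merge_steps "${RUN_DIR}")"
echo "${steps}"
```

The last printed command only lists the available options; the actual merge arguments should be chosen from that help output.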

pvelissariou1 commented 1 month ago

Merging the outputs generates the overall NetCDF output file that contains the data for all times, including the timestamps.

On Thursday, May 9, 2024, Armaghan Abed-Elmdoust @.***> wrote:

I also ran the UFS-CAT in Hercules. I have the outputs. Next steps would be merging outputs. @pvelissariou1 @janahaddad could you explain more what merging outputs do and how it helps?

janahaddad commented 1 month ago

@Armaghan-NOAA successfully ran this RT, closing with new ticket #94

Armaghan-NOAA commented 1 month ago

The latest outputs from running UFS-CAT are in this path: /work2/noaa/nosofs/armaghan/stmp/armaghan/FV3_RT/REGRESSION_TEST/coastal_ike_shinnecock_atm2sch_intel/outputs

Armaghan-NOAA commented 1 month ago

https://drive.google.com/drive/folders/1uQt3x8_O2dV7g9nD9y5bSuSRHoPOqsEb?hl=en UFS-CAT tutorial docs

another resource would be: https://github.com/oceanmodeling/ufs-coastal/discussions/46