ufs-community / ufs-mrweather-app

UFS Medium-Range Weather Application

CIME issues on Stampede #137

Closed climbfuji closed 4 years ago

climbfuji commented 4 years ago

git clone https://github.com/ufs-community/ufs-mrweather-app ufs-mrweather-app-20200306
cd ufs-mrweather-app-20200306
./manage_externals/checkout_externals
export UFS_INPUT=$PWD/projects
export UFS_SCRATCH=$PWD/projects/scratch
mkdir -p $UFS_SCRATCH
mkdir -p $UFS_INPUT/ufs_inputdata
export PROJECT=TG-MCA95C006
. $WORK/NCEPLIBS-ufs-v1.0.0/bin/setenv_nceplibs.sh
./cime/scripts/create_newcase --case c96-gfsv15p2 --compset GFSv15p2 --res C96 --workflow ufs-mrweather --machine stampede2-skx
rm -fr /scratch/06146/tg854455/c96-gfsv15p2
./cime/scripts/create_newcase --case c96-gfsv15p2 --compset GFSv15p2 --res C96 --workflow ufs-mrweather --machine stampede2-skx
cd c96-gfsv15p2
./case.setup
./preview_run
./case.build
./case.submit
squeue -l -u $USER

Fri Mar  6 16:00:12 2020
             JOBID   PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)
           5342419  skx-normal gfs_post tg854455  PENDING       0:00   1:00:00      1 (DependencyNeverSatisfied)

login2(795)$ cat CaseStatus
2020-03-06 15:41:38: case.setup starting
 ---------------------------------------------------
2020-03-06 15:41:39: case.setup success
 ---------------------------------------------------
2020-03-06 15:41:58: case.build starting
 ---------------------------------------------------
2020-03-06 15:49:05: case.build success
 ---------------------------------------------------
2020-03-06 15:49:05: case.submit starting
 ---------------------------------------------------
2020-03-06 15:50:13: case.submit success case.chgres:5342417, case.run:5342418, case.gfs_post:5342419
 ---------------------------------------------------
2020-03-06 15:50:30: case.run starting
 ---------------------------------------------------
2020-03-06 15:50:30: case.run error
ERROR: Undefined env var 'UFS_INPUT'
 ---------------------------------------------------

climbfuji commented 4 years ago

On Stampede, I am obviously doing something wrong, but I am doing the same thing as I do on all other supported platforms.

uturuncoglu commented 4 years ago

I am testing on Stampede and it works without any problem. I'll test with the head again.

climbfuji commented 4 years ago

Thanks, appreciate it!

uturuncoglu commented 4 years ago

BTW, a couple of minutes ago I got the following error on Stampede. Just to be sure, could you try again?

 kernel:LustreError: 132257:0:(llite_lib.c:2301:ll_delete_inode()) ASSERTION( inode->i_data.nrpages == 0 ) failed: inode=[0x13000dfad5:0xb926:0x0](ffff9f3bfcbabd08) nrpages=1, see https://jira.hpdd.intel.com/browse/LU-118

Message from syslogd@login3.stampede2.tacc.utexas.edu at Mar  6 15:56:02 ...
 kernel:LustreError: 132257:0:(llite_lib.c:2301:ll_delete_inode()) LBUG

Message from syslogd@login3.stampede2.tacc.utexas.edu at Mar  6 15:56:02 ...
 kernel:Kernel panic - not syncing: LBUG
packet_write_wait: Connection to 129.114.63.43 port 22: Broken pipe

climbfuji commented 4 years ago

I did. The undefined-variable error has nothing to do with the kernel crash (which kicked me out as well).

uturuncoglu commented 4 years ago

Okay. I have just cloned it and will let you know soon.

uturuncoglu commented 4 years ago

I know your problem. You have to set those environment variables in your .bashrc. This is a restriction for TACC systems. Please look at Section 7.1 in the doc

https://ufs-mrweather-app.readthedocs.io/en/latest/faq.html#how-can-i-set-required-environment-variables

uturuncoglu commented 4 years ago

Sorry, I meant the quick start guide:

https://ufs-mrweather-app.readthedocs.io/en/latest/quickstart.html

There is an "Important" note there.
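
For anyone landing here later, a minimal sketch of what setting these variables in .bashrc might look like. The variable names and the allocation name are the ones used earlier in this issue; the directory layout under `$WORK` is only an assumption for illustration (the original commands used `$PWD`, which would not be meaningful inside .bashrc), so adjust the paths to your own setup.

```shell
# Lines to append to ~/.bashrc so that batch jobs also see these variables.
# The paths are illustrative assumptions, not taken from the UFS documentation.
export UFS_INPUT="${WORK:-$HOME}/ufs/projects"   # falls back to $HOME if $WORK is unset
export UFS_SCRATCH="$UFS_INPUT/scratch"
export PROJECT="TG-MCA95C006"                    # allocation name used earlier in this issue
```

The directories still have to exist before running create_newcase, for example via `mkdir -p $UFS_SCRATCH $UFS_INPUT/ufs_inputdata` as in the original command sequence.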

climbfuji commented 4 years ago

Thanks very much for the info. This is crazy. How can we expect users, and the support folks watching the UFS forums, to do and know all this? There should really be only one way to do things on all platforms. Wishful thinking. I know it's not at your end, but this is concerning. I'll close the issue and stop worrying about Stampede, because I do not want to pollute my clean standard environment.