Closed keziming closed 1 month ago
@keziming hey, are you using Chrysalis? I'm going through LCRC support emails, and they only indicated that improv, bebop, and swing are switching to PBS. I don't think Chrysalis is one of those impacted.
@keziming, within the file:
/lcrc/group/e3sm/ac.zke/E3SMv3_dev/20231110.uci-linoz3.1870-2014.09142022branch.t0.master.v2_like.F20TR.chrysalis/post/scripts/ts_atm_daily_180x360_aave_1870-1874-0005.bash
I am seeing:
#!/bin/bash
# Running on anvil
#SBATCH --job-name=ts_atm_daily_180x360_aave_1870-1874-0005
#SBATCH --account=condo
#SBATCH --nodes=1
#SBATCH --output=/lcrc/group/e3sm/ac.zke/E3SMv3_dev/20231110.uci-linoz3.1870-2014.09142022branch.t0.master.v2_like.F20TR.chrysalis/post/scripts/ts_atm_daily_180x360_aave_1870-1874-0005.o%j
#SBATCH --exclusive
#SBATCH --time=0:10:00
#SBATCH --partition=compute
Are you trying to run on Anvil? If so, these lines in your config file are not appropriate:
partition = compute
environment_commands = "source /lcrc/soft/climate/e3sm-unified/load_latest_e3sm_unified_chrysalis.sh"
They are for Chrysalis.
If you are trying to run on Chrysalis, the question is why the Anvil template is being used for the job script.
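To see at a glance which machine template produced a job script and which account/partition pair it requests, the header lines can be pulled out with grep. This is a minimal sketch: the function name `show_job_header` is hypothetical, and it assumes the generated script keeps the `# Running on <machine>` comment and the `#SBATCH` directives in the form shown above.

```shell
#!/bin/bash
# Hypothetical helper (not part of zppy): print the "Running on" comment and
# the account/partition directives from a generated job script.
show_job_header() {
    local script="$1"
    grep -E '^# Running on |^#SBATCH --(account|partition)=' "$script"
}

# Example call (placeholder path; substitute the real script):
# show_job_header post/scripts/ts_atm_daily_180x360_aave_1870-1874-0005.bash
```

If the comment names one machine while the partition and environment settings belong to another, the Slurm submission is likely to be rejected with an invalid account/partition error like the one reported in this issue.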
LCRC changed job submission from Slurm to PBS.
@chengzhuzhang is correct that this is not related. Neither Anvil nor Chrysalis has switched to PBS.
Yes, it's a Slurm error you're encountering: sbatch: error: Batch job submission failed: Invalid account or account/partition combination specified. If it weren't even using Slurm, I wouldn't expect to see this error.
It looks like this was run on Anvil.
$ grep -n machine /lcrc/group/e3sm/ac.zke/E3SMv3_dev/20231110.uci-linoz3.1870-2014.09142022branch.t0.master.v2_like.F20TR.chrysalis/post/scripts/ts_atm_daily_180x360_aave_1870-1874-0005.settings
25: 'machine': 'anvil',
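The same check can be run across every settings file in the scripts directory, to confirm whether any task was generated for an unexpected machine. A rough sketch, assuming each `*.settings` file records the machine as a `'machine': '<name>'` entry like the one above; the function name `find_machine_mismatches` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: list settings files whose recorded machine differs
# from the one you expect. Assumes lines of the form 'machine': 'anvil',
find_machine_mismatches() {
    local dir="$1" expected="$2"
    # First grep keeps only the machine entries (prefixed with the file name);
    # second grep drops the entries that already match the expected machine.
    grep -H "'machine'" "$dir"/*.settings | grep -v "'machine': '$expected'"
}

# Example call (placeholder directory):
# find_machine_mismatches post/scripts chrysalis
```

No output means every settings file names the expected machine.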
I did try running on Chrysalis with
zppy -c /home/ac.zke/E3SM_diag/post.v2.chemUCI.LR.amip_0101.1870-2014.cfg
All jobs were submitted, but I saw errors in the ts_atm_daily_180x360_aave tasks.
@chengzhuzhang I log in at chrlogin1 on LCRC.
@keziming, is it possible that you accidentally sourced the E3SM-Unified load script for Anvil, not Chrysalis? That would identify the machine to zppy as being Anvil.
@forsyth2 @xylar thanks for pointing that out. How should I set it to chrysalis, given my *cfg file?
@xylar @forsyth2 @chengzhuzhang I found my error. I sourced the wrong E3SM-Unified load script before running zppy. Now it works. Thanks a lot!
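A small pre-flight check before running zppy could catch this kind of mix-up. The sketch below is only illustrative: `check_load_script` is a hypothetical function, and it assumes that Chrysalis login nodes are named `chrlogin*` (as with chrlogin1 in this thread) and that the load script name ends in `_<machine>.sh`, as in the Chrysalis path quoted earlier.

```shell
#!/bin/bash
# Hypothetical pre-flight check: does the load script match the login host?
# Assumptions: Chrysalis login nodes match chrlogin*, and the load script
# file name ends in _<machine>.sh.
check_load_script() {
    local host="$1" load_script="$2" expected
    case "$host" in
        chrlogin*) expected="chrysalis" ;;
        *) echo "unknown host $host; cannot check"; return 0 ;;
    esac
    case "$load_script" in
        *"_${expected}.sh")
            echo "ok: $load_script matches $expected" ;;
        *)
            echo "warning: $load_script does not look like the $expected load script"
            return 1 ;;
    esac
}

# Example call:
# check_load_script "$(hostname)" /lcrc/soft/climate/e3sm-unified/load_latest_e3sm_unified_chrysalis.sh
```

The check only compares names; it does not inspect what the load script actually sets, so treat a warning as a prompt to look more closely, not as proof of a problem.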
Okay if we close this?
Yes. Thank you all!
What happened?
LCRC changed job submission from Slurm to PBS. Please help me! Thank you in advance!
When I submit a job that worked yesterday, the following error comes out:
zppy -c ./post.v2.chemUCI.LR.amip_0101.1870-2014.cfg
Problem submitting script /lcrc/group/e3sm/ac.zke/E3SMv3_dev/20231110.uci-linoz3.1870-2014.09142022branch.t0.master.v2_like.F20TR.chrysalis/post/scripts/ts_atm_daily_180x360_aave_1870-1874-0005.bash
sbatch --export=ALL /lcrc/group/e3sm/ac.zke/E3SMv3_dev/20231110.uci-linoz3.1870-2014.09142022branch.t0.master.v2_like.F20TR.chrysalis/post/scripts/ts_atm_daily_180x360_aave_1870-1874-0005.bash
b'sbatch: error: Batch job submission failed: Invalid account or account/partition combination specified\n'
Traceback (most recent call last):
File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.10.0_login/bin/zppy", line 10, in <module>
sys.exit(main())
File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.10.0_login/lib/python3.10/site-packages/zppy/main.py", line 193, in main
existing_bundles = ts(config, scriptDir, existing_bundles, job_ids_file)
File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.10.0_login/lib/python3.10/site-packages/zppy/ts.py", line 113, in ts
submitScript(scriptFile, statusFile, export, job_ids_file)
File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.10.0_login/lib/python3.10/site-packages/zppy/utils.py", line 242, in submitScript
raise RuntimeError(error_str)
RuntimeError: Problem submitting script /lcrc/group/e3sm/ac.zke/E3SMv3_dev/20231110.uci-linoz3.1870-2014.09142022branch.t0.master.v2_like.F20TR.chrysalis/post/scripts/ts_atm_daily_180x360_aave_1870-1874-0005.bash
You can look at my original code at /home/ac.zke/E3SM_diag/post.v2.chemUCI.LR.amip_0101.1870-2014.cfg
What did you expect to happen? Are there any possible answers you came across?
No response
Minimal Complete Verifiable Example (MVCE)
No response
Relevant log output
No response
Anything else we need to know?
No response
Environment
e3sm_unified_1.10.0_login