Open dcherian opened 3 years ago
Did they suggest an alternative solution? I don't know of any other mechanism to determine if your node is in the Casper cluster or the Cheyenne cluster.
I didn't ask them. I thought it would be better for xdev to open up a new conversation rather than extending the scope of that ticket.
Ccing @jbaksta
Why not just explicitly state which resource you're targeting as part of the job submission process? Is there a reason to tie yourself to a piece of hardware, so to speak, rather than just set an environment variable that says you submitted to Casper or Cheyenne? Basically, why inspect when you can be explicit on submission? Hostnames are likely to be much more fluid, especially as we look at higher levels of enablement with Linux namespaces.
An alternative could be to inspect the $PBS_JOBID. Usually the CSG modules loaded set a specific environment variable too because they use something like that for $PATH building since we have shared application storage. At least with default modules on Cheyenne and Casper you'll have the two following set:
NCAR_HOST=cheyenne
NCAR_HOST=dav
Note that with cross submission between clusters (a new-ish PBS capability we're enabling), the environment may get reset during job submission, but loading the ncarenv module gives you the above.
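For illustration, here is a minimal sketch of what detection based on NCAR_HOST could look like instead of matching on the hostname. The function name `identify_cluster` and the mapping of `dav` to Casper are assumptions made for this example (the mapping is inferred from the order of the two values quoted above); this is not ncar_jobqueue's actual code.

```python
import os

# Assumed mapping from NCAR_HOST values to cluster names.
# "dav" -> "casper" is inferred from the comment above, not confirmed.
_NCAR_HOST_TO_CLUSTER = {
    "cheyenne": "cheyenne",
    "dav": "casper",
}


def identify_cluster(environ=None):
    """Return 'cheyenne', 'casper', or None if NCAR_HOST is unset or unknown.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to test without touching the real environment.
    """
    env = os.environ if environ is None else environ
    value = env.get("NCAR_HOST", "").strip().lower()
    return _NCAR_HOST_TO_CLUSTER.get(value)
```

Returning None rather than raising leaves the caller free to fall back to another mechanism, for example when the environment was reset during cross submission and the ncarenv module has not been loaded.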
I landed on a Casper node that was named crthc02.hpc.ucar.edu instead of crhtc02.hpc.ucar.edu, which broke ncar_jobqueue's regex.
I emailed cislhelp and they fixed it but also suggested not using the FQDN...
Perhaps we should talk to them and figure out a better solution.