mej / nhc

LBNL Node Health Check

Support for OpenPBS #136

Open xpillons opened 1 year ago

xpillons commented 1 year ago

OpenPBS and TORQUE have different output formats and options for qstat. Please add support for OpenPBS.

mej commented 1 year ago

Well, I'm afraid I am not at all familiar with OpenPBS's command syntax or how it differs from TORQUE or PBS Pro, nor do I have a cluster running OpenPBS to test on. But NHC is fully Open Source, so you might be able to make the customizations yourself -- and if you do, we'd love for you to contribute your work to the project! (But you absolutely are not required to do so.) 😀

jbaksta commented 1 year ago

I've started moving from an in-house tool to NHC and am looking to support PBS Pro as well. I've started on some small things, but there is enough difference between the two that it might make sense to have NHC_RM=pbspro as the variant. Normally I wouldn't mind openpbs, but openpbs has a weird anthropology(?) in this case.

mej commented 1 year ago

It's interesting for me to hear that OpenPBS, PBS Pro, and TORQUE have diverged to such a degree -- having done a significant amount of work with TORQUE internals but never having even touched either of the other 2 🤣 -- but it's not really surprising either.

If it's impossible, impractical, or imprudent to try to treat them all as "similar enough," I think that's fine, and I'm perfectly happy having NHC_RM=pbspro and NHC_RM=openpbs be distinct from NHC_RM=pbs. It might be wise, in fact, to move toward NHC_RM=torque for the 1.5 release to avoid ambiguity and/or confusion going forward!
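To make the naming idea above concrete, here is a minimal sketch of how distinct variant names could coexist with the legacy `pbs` value. This is not NHC code (NHC itself is written in Bash). The function name `normalize_rm` and the alias table are hypothetical, and treating `pbs` as an alias for `torque` is only one reading of the suggestion.

```python
# Hypothetical sketch; not part of NHC, which is written in Bash.

# Legacy NHC_RM-style names mapped to an unambiguous replacement.
# Assumption: the existing "pbs" value refers to TORQUE-style commands.
LEGACY_ALIASES = {"pbs": "torque"}


def normalize_rm(value):
    """Return a canonical resource-manager name for an NHC_RM-style value.

    Names without a legacy alias (e.g. "pbspro", "openpbs", "slurm")
    are returned unchanged apart from trimming and lowercasing.
    """
    name = value.strip().lower()
    return LEGACY_ALIASES.get(name, name)
```

With a mapping like this, existing configurations that set `pbs` would keep working, while `pbspro` and `openpbs` would select their own code paths.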

Is this something you'd be interested in contributing? If so, welcome, and feel free to ask questions, share works in progress, etc. along the way! 😁

jbaksta commented 1 year ago

I think we might be willing to push a pbspro variant at some point. We'll probably still end up having to override it, since we have another tool layered on top that does more automation. We were thinking about using that tool for the online/offline actions, and I'm unsure what it will offer me as an API.

That being said, I think it would be valuable to have a basic pbspro integration available, and to try to minimize the harm that can be inflicted on the pbspro server itself.

xpillons commented 1 year ago

I agree that, given the divergence, it would make sense to have separate pbspro and openpbs variants. The more challenging part will be the prolog and epilog scripts: these are not scripts configured from the config file but Python hooks that need to be specifically written and configured, which makes reusing Bash scripts less useful.
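For illustration, one way a Python hook might still reuse the existing Bash checks is to shell out to NHC and translate its exit status into an accept or reject. The sketch below is untested against a real PBS Pro or OpenPBS installation. The `/usr/sbin/nhc` path, the 60-second timeout, and the helper name `run_health_check` are assumptions, and it presumes a Python 3 hook interpreter attached to something like an `execjob_prologue` event.

```python
import subprocess

# Assumption: NHC is installed at this path; adjust for your site.
NHC_COMMAND = ["/usr/sbin/nhc"]


def run_health_check(command=NHC_COMMAND, timeout=60):
    """Run a health-check command and return (ok, message)."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Missing binary, permission problem, or the check hung.
        return False, "health check could not complete: {}".format(exc)
    if result.returncode == 0:
        return True, ""
    # Prefer whatever the check printed; fall back to the exit status.
    detail = (result.stdout or result.stderr).strip()
    return False, detail or "exit status {}".format(result.returncode)


try:
    import pbs  # only importable inside the PBS hook interpreter
except ImportError:
    pbs = None

if pbs is not None:
    event = pbs.event()
    ok, message = run_health_check()
    if ok:
        event.accept()
    else:
        event.reject("node health check failed: {}".format(message))
```

Keeping the subprocess logic in a plain function means it can be exercised outside the scheduler, while the few lines that touch the `pbs` module stay isolated at the bottom.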

I may be able to contribute, but more as a tester than as a writer.