eirrgang closed 3 weeks ago
We consider the `rm_info` struct to be an implementation detail and, as such, not well specified. In fact, different resource managers (as in `radical.pilot.agent.resource_manager.*`) may provide different structures for `rm_info` altogether. The purpose of that structure is to communicate essential information between the RM and the agent executor and scheduler. We expose `rm_info` as a convenience for some use cases.
If that's ok for you, I would like to rephrase the feature request as: provide a stable and documented interface to inspect the pilot for the exact amount of resources it has available. Would that be acceptable?
closed by #3117
The schema for `Pilot.resource_details` is not documented in https://radicalpilot.readthedocs.io/en/stable/apidoc.html#radical.pilot.Pilot, and `Pilot.resource_details["rm_info"]` is not described at all.

However, `Pilot.resource_details["rm_info"]` is the only way I know to find out how many nodes and cores were allocated to fulfill a PilotDescription.

I would like some assurance of when or how the structure may change in the future, or advice on other ways to get this information. For example, if I want to know the number of cores allocated, how do I know whether it is better to check the value of `Pilot.resource_details["rm_info"]["requested_cpus"]` or to count the total number of cores in `sum(len(node["cores"]) for node in Pilot.resource_details["rm_info"]["node_list"])`, or which approach may be more stable in the long run?
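
To make the question concrete, here is a minimal sketch of the two approaches side by side. It works on a plain dictionary shaped like `Pilot.resource_details` and does not import radical.pilot. The helper name `core_counts` and the sample values are hypothetical. Only the keys `rm_info`, `requested_cpus`, `node_list` and `cores` come from the text above, and since `rm_info` is undocumented, every lookup is treated as optional.

```python
from typing import Any, Mapping, Optional, Tuple


def core_counts(resource_details: Mapping[str, Any]) -> Tuple[Optional[int], Optional[int]]:
    """Return (requested, counted) core numbers, or None where a key is missing.

    Hypothetical helper: the rm_info layout is an assumption taken from this
    issue, not a documented schema, so nothing here is guaranteed to exist.
    """
    rm_info = resource_details.get("rm_info") or {}

    # Approach 1: read the scalar field directly.
    requested = rm_info.get("requested_cpus")

    # Approach 2: count the core entries listed for each node.
    node_list = rm_info.get("node_list")
    counted = None
    if node_list is not None:
        counted = sum(len(node.get("cores", [])) for node in node_list)

    return requested, counted


# Made-up sample data for illustration only (not real pilot output).
sample = {
    "rm_info": {
        "requested_cpus": 6,
        "node_list": [
            {"cores": [0, 0, 0, 0]},
            {"cores": [0, 0, 0, 0]},
        ],
    }
}

requested, counted = core_counts(sample)
print(requested, counted)
```

With this made-up sample the two numbers differ (6 requested, 8 counted), which is exactly the ambiguity I am asking about: without a documented schema I cannot tell which value to trust.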