This issue causes RP to hang indefinitely while holding its allocated resources without doing any work.
Traceback (most recent call last):
  File "/cache/home/afa64/ve/facts_3.9/lib/python3.9/site-packages/radical/pilot/utils/component.py", line 250, in _work_loop
    self._initialize()
  File "/cache/home/afa64/ve/facts_3.9/lib/python3.9/site-packages/radical/pilot/utils/component.py", line 545, in _initialize
    self.initialize()
  File "/cache/home/afa64/ve/facts_3.9/lib/python3.9/site-packages/radical/pilot/agent/executing/popen.py", line 63, in initialize
    AgentExecutingComponent.initialize(self)
  File "/cache/home/afa64/ve/facts_3.9/lib/python3.9/site-packages/radical/pilot/agent/executing/base.py", line 77, in initialize
    self._rm = rpa.ResourceManager.create(rm_name,
  File "/cache/home/afa64/ve/facts_3.9/lib/python3.9/site-packages/radical/pilot/agent/resource_manager/base.py", line 400, in create
    return impl[name](cfg, rcfg, log, prof)
  File "/cache/home/afa64/ve/facts_3.9/lib/python3.9/site-packages/radical/pilot/agent/resource_manager/base.py", line 150, in __init__
    rm_info = self.init_from_scratch()
  File "/cache/home/afa64/ve/facts_3.9/lib/python3.9/site-packages/radical/pilot/agent/resource_manager/base.py", line 269, in init_from_scratch
    assert alloc_nodes >= rm_info.requested_nodes
AssertionError
For example, in CI tests the entire test hangs until the timeout is reached, and only then can the logs be fetched to debug the error.
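One possible way to fail fast instead of hanging is sketched below. This is only an illustration under assumptions, not RP's actual code: `AllocationError`, `check_allocation`, `run_component` and `start_and_wait` are hypothetical names, and a plain thread stands in for the component's work loop. The idea is to replace the bare `assert` with an explicit error that carries a message, and to have the parent notice a failed initialization right away rather than wait for a timeout.

```python
import threading


class AllocationError(RuntimeError):
    """Hypothetical error: fewer nodes allocated than requested."""


def check_allocation(alloc_nodes: int, requested_nodes: int) -> None:
    # Explicit check instead of a bare `assert`: it carries a message and
    # is not stripped when Python runs with -O.
    if alloc_nodes < requested_nodes:
        raise AllocationError(
            f"allocated {alloc_nodes} node(s), "
            f"but {requested_nodes} were requested")


def run_component(initialize, failed: threading.Event, errors: list) -> None:
    # Hypothetical stand-in for a component work loop: any initialization
    # error is recorded and signalled instead of being lost in the worker.
    try:
        initialize()
    except Exception as exc:
        errors.append(exc)
        failed.set()


def start_and_wait(initialize, timeout: float = 5.0) -> None:
    # The parent starts the worker and re-raises its failure, so the
    # caller can release resources instead of hanging.
    failed = threading.Event()
    errors: list = []
    worker = threading.Thread(target=run_component,
                              args=(initialize, failed, errors),
                              daemon=True)
    worker.start()
    worker.join(timeout)
    if failed.is_set():
        raise RuntimeError("component initialization failed") from errors[0]
    if worker.is_alive():
        raise TimeoutError("component initialization did not finish")
```

With this sketch, `start_and_wait(lambda: check_allocation(1, 2))` raises immediately with the allocation message attached as the cause, so a CI job could abort and collect logs without waiting for its global timeout.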