radical-cybertools / radical.saga

A Light-Weight Access Layer for Distributed Computing Infrastructure and Reference Implementation of the SAGA Python Language Bindings.
http://radical-cybertools.github.io/saga-python/
Other
83 stars 34 forks source link

Cleanup `n_nodes` for slurm_job #795

Closed mtitov closed 4 years ago

mtitov commented 4 years ago

Follow up Andre's comment at PR #794

How about:

n_nodes = None
if self._ppn:
   n_nodes = ...

if total_memory:
  assert(n_nodes)
  mem_per_node = total_memory / n_nodes

Those calculations are, at the end, really independent from the specific target resource, and should also work for resources we don't really check in the code.

@andre-merzky agree to re-work current implementation according to your changes and would include here discussion about self._ppn with different queues: I saw you've used self.rm.queue to set self._ppn (here), so self.rm "knows" about the queue? I thought it is only available at _job_run (here). I like your approach here as well

andre-merzky commented 4 years ago

Ouch, the self.rm.queue was totally wrong, thanks for catching that! I did a merge from an old location today and apparently did not think clearly when 'fixing' this line (the resource manager URL has no queue element). So you are right, we will have to delay those decisions until _job_run.