Open sadielbartholomew opened 4 years ago
(Potentially this is an Issue that is better placed under the Rose Issue Tracker, but since the paper authors make comments about Cylc specifically I have raised it here. Feel free to transfer it over to Rose if preferred.)
Feel free to "at" me to ask follow-ups. I see you mention Rose, which as I understand it is a layer on top of Cylc? We never considered using that TBH, since we had passed over Cylc as too heavyweight for our planned use.
Hi @sadielbartholomew - thanks for flagging this.
Cylc has built-in support for all the HPC batch systems, and it allows you to set batch system directives - including those for controlling how MPI jobs are resourced and executed - as part of the workflow configuration. We run very large numbers of multi-node MPI jobs on HPC like this, all day every day. So I'm not sure I understand what the authors of those papers are getting at - maybe @rupertnash can comment?
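To illustrate, a multi-node MPI task can be configured something like this (a minimal Cylc 7 `suite.rc` sketch, assuming Slurm; the task name and directive values are made up):

```
[runtime]
    [[atmos_model]]
        script = run-model.sh
        [[[job]]]
            batch system = slurm
            execution time limit = PT2H
        [[[directives]]]
            # rendered as "#SBATCH ..." lines in the generated job script
            --nodes = 8
            --ntasks-per-node = 36
            --partition = standard
```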
It's true that Cylc doesn't "understand" what the individual batch system directives mean (except for the execution time limit), if that's what this is about, nor does it vary them automatically in some way as the workflow proceeds. We have some plans to make it easier to alter directives on the fly, but generally our tightly coupled MPI models tend to be fixed in size, and consistent execution time is desirable (for a particular model in a particular workflow, at least).
Note I was partly responding to the sentence prior to your quote, in the paper:
Numerous domain specific workflow systems have been developed which, to some extent, support execution over HPC machines. One example of this is the weather and climate community,...
"to some extent"? ... almost all we do is HPC. (However, Cylc is not domain specific either, to atmospheric modeling or HPC; it is a generic tool with HPC batch system support and unique cycling workflow capability).
Hi @hjoliver - I can see why that sentence perhaps riled you a little - apologies! It was not meant to suggest that Cylc specifically doesn't do HPC properly: I'm aware that it gets used a great deal, including for jobs on ARCHER (the large Cray we run). Perhaps "...to greater or lesser extents, support execution ..." would have been better. While I accept that Cylc is technically not domain specific, all the documentation and the case studies in your CiSE paper are from environmental science, so it does seem "domain specific" from the outside.
Regarding the batch systems, that resourcing/provisioning step isn't what either paper is about (although the one with Gordon Gibb as first author, on the custom WMS, does touch on it). The reasoning behind the text @sadielbartholomew quotes above is that (as far as I can tell from the docs/tutorial) if I want to run an MPI application with Cylc then I have to put the appropriate bash command line in the task's "script" key in the suite file? And if I want my suite to work across multiple machines then I need to do some work to abstract over that?
@rupertnash - thanks for responding and sorry if I sounded a bit riled, not really intentional - I know it is an impossible job to get a complete understanding of all the extant workflow systems! We are very much aware that Cylc appears to be domain-specific even if technically it isn't. That is mostly for "historical reasons", and our own fault, and we're trying to rectify it in the upcoming massive Cylc 8 release.
if I want to run an MPI application with Cylc then I have to put the appropriate bash command line in the task's "script" key in the suite file?
Yes, you're more or less right about that, although most Cylc users probably aren't aware of it, because their large models (a) are invariably launched by a non-trivial script (that comes with the model) to set up all the inputs before executing the model; and (b) more or less have to run with MPI. So the model-run script contains the MPI launch command, and the task definition in the Cylc workflow config will typically just have `script = run-model.py` (say), along with model configuration and batch system directives (including for MPI) appropriate to the model size, domain decomposition, etc.
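That is, something like this (a sketch only - a shell version of such a wrapper, with made-up step names; the point is that the MPI launch lives inside the model's own run script, not in the Cylc config):

```bash
#!/bin/bash
# run-model.sh -- hypothetical model wrapper: stage inputs, then launch via MPI
set -eu
./make-namelists.sh                    # set up model inputs (illustrative step)
mpirun -n "${NPROC:-128}" ./model.exe  # the MPI launch command lives here
```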
But I guess if your use cases are commonly run both with and without MPI, this would be exposed in the Cylc workflow config, and it might seem inconveniently low-level compared to just telling the workflow engine to "use MPI" for the task. (Still, it seems rather a small thing to me, compared to everything else that a workflow engine has to deal with ... but I don't know your use cases).
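(For what it's worth, that difference can at least be hidden behind a single switch using Cylc's built-in Jinja2 templating - a sketch, with made-up names:)

```
#!Jinja2
{% set USE_MPI = true %}
[runtime]
    [[run_app]]
{% if USE_MPI %}
        script = mpirun -n 64 ./my_app
{% else %}
        script = ./my_app
{% endif %}
```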
That makes a lot of sense, especially if your community norms include run scripts like these. My perspective is that most of this should be done either by your WMS or by a (serial) pre-processing step run before the main simulation (again by your WMS). Happy to chat further! (Or, if you are attending the virtual SC20 workshops, ask some questions after my/Gordon's talks.)
Thanks @rupertnash; I attended SC in person last year, but unfortunately I can't attend even virtually this year :confounded: I will start by taking a closer look at the papers linked above...
Describe exactly what you would like to see in an upcoming release
Hi all, I thought I should draw attention to some recent work outlined in the pre-print Supercomputing with MPI meets the Common Workflow Language standards: an experience report (R. W. Nash et al.), closely related to another pre-print, A Bespoke Workflow Management System for Data-Driven Urgent HPC (G. P. S. Gibb et al.), both summarised nicely in an EPCC blog post. The authors report that they have reviewed existing workflow management systems and found them lacking with regard to the execution of MPI-parallelised applications.
In the first paper, the authors give a review of some existing workflow systems, mention Cylc, and specifically say (see under 'II. RELATED WORK', noting that I have reported the mis-spelling in the name!):
Do we agree with this statement? My initial thoughts are that their needs may indeed have been met already by using Rose and Cylc together, with such commands as `rose mpi-launch`.
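For context, as I understand it `rose mpi-launch` selects the site-appropriate MPI launcher from the Rose site/user configuration and takes the processor count from `$NPROC`, so a task might look something like this (a sketch only, with made-up names):

```
[runtime]
    [[run_model]]
        # rose mpi-launch picks the launcher (mpirun, aprun, ...) from
        # the Rose site/user configuration
        script = rose mpi-launch ./my_model
        [[[environment]]]
            NPROC = 128   # processor count read by rose mpi-launch
```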
Do we have capability that the authors might not have known about that could have helped them? I am going to point the authors to this Issue to ask for clarification, and indeed to ask whether they considered Rose-Cylc. I think it would be good to raise this here on GitHub to start a discussion: clearly the authors found Cylc (along with all the other workflow engines surveyed) lacking, and ideally we can develop Cylc so that such endeavours are possible with Cylc workflow(s) in future.
So whilst I don't yet know the specifics of what could be improved in a future release in this respect, I hope to find out. The papers linked above, which I have skim-read, provide further detail, but I want to understand what the authors needed for running MPI applications in parallel that (Rose-)Cylc can't do already.
Pull requests welcome!