cylc / cylc-flow

Cylc: a workflow engine for cycling systems.
https://cylc.github.io
GNU General Public License v3.0

install: configure remote platform symlink dirs per workflow #5418

Open dpmatthews opened 1 year ago

dpmatthews commented 1 year ago

At present, if you want different workflows to use different symlink setups on a remote platform the only way to achieve this is to create separate platforms which refer to different install targets. Some users would like the ability to configure the symlinks on a per workflow basis. This used to be possible with Cylc 7 via the Rose "root-dir" setting.

dpmatthews commented 1 year ago

See https://cylc.discourse.group/t/cylc8-symlink-dirs-for-remote-host

cpelley commented 1 year ago

Thanks for raising this. I think a solution that isn't tied to user configuration but is instead driven by the workflow itself would be ideal (assuming I have understood). Our cylc7 suite utilises optional config keys to run for different purposes. Some configurations would utilise the root-dir targets in different ways.

hjoliver commented 1 year ago

Note my post on the above discourse topic - we actually need workflow-specific symlink dirs in general, not just on remote hosts.

However, it is fundamentally uncool for the workflow itself to specify where it should be installed (or symlinked) to. The workflow is what gets installed; where to, should be determined externally somehow.

I think it makes sense to specify where individual workflows should go, if necessary, in the (user's) global config file.

cpelley commented 1 year ago

...uncool for the workflow itself to specify where it should be installed (or symlinked) to...

I suspect you thought I meant wanting continued support of root-dir=*=<somedir>? - which I do not.

I have perhaps not explained very well. Our usecase isn't for the suite to decide where it is itself installed. By "root-dir targets", I mean the 'root-dir' mechanism by which target directories were specified in cylc7.

In particular, I mean specifying work, share, share/cycle and log:

root-dir{share/cycle}=*=$SCRATCH
root-dir{share}=*=$SCRATCH
root-dir{work}=*=$SCRATCH

Sometimes (like in our suite), running the suite for say trials (trial opt key) would mean generating huge quantities of data. This isn't a characteristic of who runs it but of running 'trial' (opt key) for this particular workflow. That is, I think there is a place for the user global config file in this for sure, but I think that could be to override what is specified in the workflow itself (if defined), share/cycle, share, work, log etc.

hjoliver commented 1 year ago

That is, I think there is a place for the user global config file in this for sure, but I think that could be to override what is specified in the workflow itself (if defined), share/cycle, share, work, log etc.

By "in the workflow itself" I think you mean rose-suite.conf with Cylc 7? From a Cylc perspective, rose suite-run was "external" although the Rose config file was stored with the workflow source.

In Cylc 8, that functionality is handled by cylc install which is configured via global.cylc.

[UPDATE:] with the proviso that remote (job platform) symlinks are created at run time, not install time, when the remote gets initialized.

oliver-sanders commented 1 year ago

The biggest problem with this is where to define it.

We don't want to store this in the workflow configuration (flow.cylc or suite.rc) because this is an installation option so would require us to load the workflow configuration at install time which we wouldn't want to do.

Options:

  1. Only support this on the CLI (current behaviour).
  2. Allow the global config to configure this for pre-defined source dir path patterns.
  3. Invent a new sidecar file e.g. install.cylc / cylc-install.toml / pyproject.toml / flow.py.
  4. Tie it in with Rose and use the rose-suite.conf file.

Options 3-4 may require a new pre-install plugin type.
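
To make option 2 concrete, here is a minimal Python sketch of how source-directory patterns could be matched to symlink overrides. The table contents, the function name `symlink_overrides` and the longest-prefix rule are all assumptions for illustration; nothing like this exists in cylc-flow today.

```python
from pathlib import Path
from typing import Dict, Mapping

# Hypothetical table: source-directory prefix -> symlink dir overrides.
SOURCE_OVERRIDES: Dict[str, Dict[str, str]] = {
    "~/project-a": {"work": "/big/volume"},
}


def symlink_overrides(
    source_dir: str,
    table: Mapping[str, Mapping[str, str]] = SOURCE_OVERRIDES,
) -> Dict[str, str]:
    """Return the overrides for the deepest prefix containing source_dir.

    Returns an empty dict when no prefix matches, meaning the site or
    user defaults would apply unchanged.
    """
    src = Path(source_dir).expanduser()
    best: Mapping[str, str] = {}
    best_depth = -1
    for prefix, overrides in table.items():
        root = Path(prefix).expanduser()
        # Match the prefix itself or anything beneath it.
        if src == root or root in src.parents:
            depth = len(root.parts)
            if depth > best_depth:
                best, best_depth = overrides, depth
    return dict(best)
```

Picking the deepest matching prefix would let a user override a whole project and still special-case a single workflow inside it.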

Is there a desire to also configure the remote installation symlink targets in this way or would this just be for local installation?

SGallagherMet commented 1 year ago

I have a similar use case to @cpelley. I have an hourly cycling workflow that makes large amounts of transient data (kept on disk for up to 24 hours).

Option 1 is an OK workaround during the cylc7 to cylc8 upgrade, but long term it runs the risk of using up the disk quota by accidentally omitting the command line option.

Option 3 or 4 would be my preference as it seems a lot more explicit than option 2. However it's done, it would ideally be something that can be overridden at different sites for portable workflows.

With regards to remote installation symlink targets, I don't have a case right now but I imagine any platform could have the same disk usage/quota issues.

dpmatthews commented 1 year ago

Is there a desire to also configure the remote installation symlink targets in this way or would this just be for local installation?

This issue was specifically about remote installation but a solution covering localhost as well would be preferable.

oliver-sanders commented 1 year ago

Note, remote installation could be configured in the workflow config because it happens at runtime, after the config has been processed. We already have the .cylcignore file (a sidecar file) for configuring local installation, but the [scheduler][install] section for configuring remote installation; however, it might make more sense to co-locate these.

hjoliver commented 1 year ago

Option 3 or 4 would be my preference as it seems a lot more explicit than option 2. However it's done, it would ideally be something that can be overridden at different sites for portable workflows.

I think option 2 (by user global config) is the right way to do it.

Finally (3 and 4) keeping installation configuration in the workflow source directory (even in a special file) is fundamentally kinda wrong

SGallagherMet commented 1 year ago
  • it could be done centrally (site global config) for workflows that adhere to name/path conventions (but that isn't enough)

  • this is very similar, in principle, to platforms config, but finer-grained. And that's global config.

I realise how rose suites are being version controlled in the future is up for discussion, but under the current working practices where workflows are continually copied and renamed I can see this being problematic. Eventually someone will choose a name that will break the conventions.

Something equivalent to 'platforms', where the workflow selects from a set of centrally configured options, sounds like a good way to go.

hjoliver commented 1 year ago

Eventually someone will choose a name that will break the conventions.

I agree, but that's what the user global config is for - it allows you to define your own conventions (even down to individual workflows) if you want. I was just pointing out that central config is possible, if users need or want to conform to whatever conventions that imposes (but that doesn't preclude use of user global config as well)

hjoliver commented 1 year ago

I realise how rose suites are being version controlled in the future is up for discussion,

Note that Cylc is also used at sites that don't have Rose.

However, we are retaining rose-suite.conf support in Cylc via a plugin, so we could potentially allow additional installation config in that as well, since technically that file already amounts to keeping installation config in the source directory (which I'm arguing we should move away from, more generally).

oliver-sanders commented 1 year ago

Something equivalent to 'platforms', where the workflow selects from a set of centrally configured options, sounds like a good way to go.

This wouldn't be portable between sites.

keeping installation configuration in the workflow source directory (even in a special file) is fundamentally kinda wrong

Agreed! Configuring installation options in the workflow is just wrong which is why it was purposefully dropped from Cylc 8 (this was not an accidental omission)! This is a user-specific installation option used to work around site-configured filesystem limits, it is not a property of a workflow and is not portable between sites. Even at one site one user might not need or want to install in the same way as another.

Because this is working around a user's filesystem allocation we considered it a user configuration problem. I.E. if you don't have enough allocation to run workflows, then configure symlink dirs [for all of your workflows] to a filesystem where you do have enough space.

Eventually someone will choose a name that will break the conventions.

I realise how rose suites are being version controlled in the future is up for discussion

The option (2) mentioned above isn't necessarily related to Rosie, version control or even workflow names, but the way that users manage their working copies. E.G. we could choose to work like this:

~/cylc-src/
  project-a/
    workflow-1/
      .svn
    workflow-2/
      .git
  project-b/
    workflow-3/
       .git

Or even like this:

# ~/.cylc/flow/global.cylc
[install]
  sources = ~/project-a, ~/project-b, ~/roses, ~/cylc-src

~/project-a
  workflow-1/
    .svn
  workflow-2/
    .git
~/project-b/
   workflow-3/
     .git

So option (2) could look something like this:

[install]
  [[sources]]
    [[[~/project-a]]]
      [[[[symlink dirs]]]]  # override site-defaults just for this project
        work = /big/volume

SGallagherMet commented 1 year ago

Conceptually I see the argument for wanting to keep the workflow installation settings separate, but as someone who develops/maintains workflows for other users I think it will just cause problems.

Really I want my instructions to my users to be:

  1. checkout workflow A from repository B
  2. use rose edit to configure the suite following the instructions in the metadata
  3. cylc vip . (Obviously in practice it's never quite that simple)

If I know there's a problem with running the workflow using the default configuration at my site I want to be able to handle that for the user automatically. The more instructions I have to give about things that need configuring outside the suite or about following conventions, the more opportunities there are for something to go wrong.

dpmatthews commented 1 year ago

cylc/cylc-rose#237 proposes to allow environment variables defined in rose-suite.conf to influence the global config. This should provide a solution to workflow specific symlink dirs for some users.

(The plan is to also support an alternative solution based on workflow name)

hjoliver commented 1 year ago

A bit of a recap.

Primarily:

This will probably be sufficient for existing sites that used rose suite-run (and run dir symlinks) with Cylc 7.

However, for completeness, we may want to support or document other solutions as well, because:

Other ideas:

  1. do workflow-specific symlinking with another plugin (c.f. cylc-rose) that looks at a more cylc-y file, e.g. symlink.cylc that doesn't come with other baggage (and then, again, Jinja2 in global config).
  2. use global config Jinja2 logic to check the workflow name against some project-specific naming convention, and automatically symlink accordingly. [And note users can override or add to site config with their own global.cylc]
  3. or (probably better) use a pre-install plugin to do the same

Note that for 2. and 3. the naming convention could be based on source workflow name or parent directory name, or install-dir name.


My first cut at 2.:

from cylc.flow.scripts.install import get_option_parser as install_opt_parser
from cylc.flow.scheduler_cli import get_option_parser as play_opt_parser
import sys

def get_workflow_name():
    """Parse a command line like 'cylc install' or 'cylc play'."""
    if sys.argv[1] == 'install':
        opts, args = install_opt_parser().parse_args(sys.argv[2:])
        return opts.workflow_name or args[0]
    if sys.argv[1] == 'play':
        opts, args = play_opt_parser().parse_args(sys.argv[2:])
        return args[0]
    else:
        return "dunno"

and in global.cylc:

#!Jinja2

{% from "get_workflow_name_cli" import get_workflow_name %}
{% set NAME = get_workflow_name() %}

[install]
    [[symlink dirs]]
        [[[localhost]]]
{% if NAME.startswith("proj_a") %}
            run = /tmp/ProjectA/$USER
{% elif NAME.startswith("proj_b") %}
            run = /tmp/ProjectB/$USER
{% else %}
            run = /tmp/$USER/
{% endif %}
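
The branching in the template above reduces to a prefix lookup on the workflow name. A plain-Python sketch of the same convention follows, which may be easier to test outside Cylc; the table and the function name `run_dir_for` are hypothetical, and `{user}` stands in for `$USER`.

```python
from typing import List, Tuple

# Hypothetical prefix -> run-directory template table mirroring the
# Jinja2 branches above.
RUN_DIRS: List[Tuple[str, str]] = [
    ("proj_a", "/tmp/ProjectA/{user}"),
    ("proj_b", "/tmp/ProjectB/{user}"),
]
DEFAULT_RUN_DIR = "/tmp/{user}/"


def run_dir_for(name: str, user: str) -> str:
    """Return the run directory for a workflow name; first match wins."""
    for prefix, template in RUN_DIRS:
        if name.startswith(prefix):
            return template.format(user=user)
    return DEFAULT_RUN_DIR.format(user=user)
```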

And for 3. (via @oliver-sanders ):

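# setup.cfg
[options.entry_points]
cylc.flow.pre_install =
    main = main:main

# main.py
import os
from pathlib import Path

def main(path, *args, **kwargs):  # other plugin arguments elided
    # Nothing to do for sources that already live under ~/cylc-run.
    if Path(path).is_relative_to(Path('~/cylc-run').expanduser()):
        return
    # Environment values must be strings; export the parent directory name.
    os.environ['CYLC_PROJECT'] = Path(path).parent.name

# global.cylc
#!Jinja2

{% from "os" import environ %}
{% if environ.get('CYLC_PROJECT') == 'foo' %}
    # ...
{% endif %}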

NB:

the installation plugins can access the workflow ID derived from the --workflow-name.

ColemanTom commented 1 year ago

What if my project's workflows can use one of multiple disks depending on my mood? We would arbitrarily have to add some specifier which otherwise isn't important? This is an actual case where one project I work on has space on two different lustre disks, and I may use one or the other depending on how much space is available on each (or how heavily loaded that disk is going to be when I'm running an experiment).

ColemanTom commented 1 year ago

Could you make a system wide configuration which can specify variables which if set, will be passed through on setup? Then systems admins can say we will accept variables "A, B, C" and those will be used to determine other items in the global.cylc jinja2?

hjoliver commented 1 year ago

What if my project's workflows can use one of multiple disks depending on my mood?

The rose-suite.conf (or equivalent symlink.cylc) solution handles that, because it is workflow specific.

And so do the other solutions, because users can add to or override site config with their own global.cylc and so use their own naming conventions (right down to individual workflows if necessary) to determine symlinking.

kaday commented 1 year ago

I have a similar issue. We have a workflow where, at cylc7, the directories were symlinked in the optional rose configs, because different options require different data retention and volumes, and different "teams" have different requirements.

How would this be handled via a global.cylc ? I can see the global.cylc getting very complicated and we would need to "pass" these around with our workflows when we provide them to the production teams.

I understand the desire to get rid of the information from the rose configs but currently do not see a solution that will not result in production teams, and even my team, having a complicated global.cylc that also needs -O specific settings. I may have misread Hilary's post above in the "cut for " as it looks like it could be there (I need to go read the underlying code) but I am not sure how it is then applied in the global.cylc.

ScottWales commented 1 year ago

Can't we use a pre-configure plugin to affect global.cylc? Or does this break with remote hosts. E.g.

import optparse
import tomllib  # standard library from Python 3.11
import typing as T
from pathlib import Path

def pre_configure(srcdir: Path = None, opts: optparse.Values = None, rundir: Path = None) -> T.Dict:
    """
    Reads file srcdir/configvars.toml and adds its contents to the Jinja2 environment
    for both suite and global configurations under namespace 'configvars'
    """
    with open(srcdir / 'configvars.toml', 'rb') as f:
        config = tomllib.load(f)

    return {
            'template_variables': {'configvars': config},
            'templating_detected': 'jinja2'
    }

with global.cylc as

#!jinja2
[install]
    [[symlink dirs]]
        [[[localhost]]]
            share = /scratch/$PROJECT/$USER/{{configvars.foo}}

and configvars.toml as

foo = "bar"

It doesn't look like cylc-rose is using the template_vars value in its pre-configure plugin at the moment

dpmatthews commented 1 year ago

How would this be handled via a global.cylc ? I can see the global.cylc getting very complicated and we would need to "pass" these around with our workflows when we provide them to the production teams.

The aim would be to agree a set of options (configurable via environment variables) that meet the needs of users at our site. These would be configured centrally.

ScottWales commented 1 year ago

Can't we use a pre-configure plugin to affect global.cylc? Or does this break with remote hosts.

Actually this doesn't work - it requires the config file to be in the same directory as global.cylc, as the plugin only knows the path to the file currently being parsed.

Another idea would be to add custom environment variables to the list sent in SSH commands, like CYLC_VERSION currently is:

[platforms]
    [[localhost]]
        ssh forward environment variables = PROJECT, LUSTRE_DISK

resulting in commands like

ssh -oBatchMode=yes -oConnectTimeout=10 -n gadi-login-01 \
        env CYLC_VERSION=8.3.0.dev PROJECT=dp9 LUSTRE_DISK="/g/data1" bash --login -c \
        'exec "$0" "$@"' /scratch/hc46/saw562/conda-dev/bin/cylc \
        play --debug test-global/run2 --host=localhost

The server then has the variables available when setting up symlinks.

This seems like a minor change - just adding a new config option to be expanded in construct_ssh_command()
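
As a rough illustration of that expansion, the sketch below builds the `NAME=value` tokens that would follow `env` in the SSH command. It is not the code in construct_ssh_command(); the function name `forwarded_env_args` and the rule of skipping unset variables are assumptions.

```python
import shlex
from typing import List, Mapping


def forwarded_env_args(names: List[str], environ: Mapping[str, str]) -> List[str]:
    """Build NAME=value tokens for `env`, skipping variables that are unset.

    Values are shell-quoted because the remote shell re-parses the
    command line that ssh passes to it.
    """
    return [
        f"{name}={shlex.quote(environ[name])}"
        for name in names
        if name in environ
    ]


def with_forwarded_env(
    ssh_prefix: List[str],
    names: List[str],
    environ: Mapping[str, str],
    remote_command: List[str],
) -> List[str]:
    """Insert `env NAME=value ...` between the ssh prefix and the command."""
    return [*ssh_prefix, "env", *forwarded_env_args(names, environ), *remote_command]
```

Passing `os.environ` as `environ` and the configured variable list as `names` would reproduce the shape of the command shown above.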

hjoliver commented 1 year ago

Another idea would be to add custom environment variables to the list sent in SSH commands, like CYLC_VERSION currently is:

Yeah I was considering this myself recently for other reasons. It could be a good idea.

oliver-sanders commented 3 months ago

https://github.com/cylc/cylc-rose/issues/237 proposes to allow environment variables defined in rose-suite.conf to influence the global config. This should provide a solution to workflow specific symlink dirs for some users.

-- https://github.com/cylc/cylc-flow/issues/5418#issuecomment-1639860905

This functionality was released with cylc-rose version 1.4.0. You can use the [env] section in the rose-suite.conf file to set environment variables which will be made available to the global.cylc file when loaded.

See https://github.com/cylc/cylc-rose/issues/237 for more details.
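
For readers who want a feel for the mechanism, here is a small Python sketch that reads an `[env]` section from INI-style text and returns the variables that would be exported before the global config is loaded. This is an illustration only, not how cylc-rose implements it: the function name `read_env_section` is hypothetical, and `configparser` does not handle every construct that real rose-suite.conf files may contain.

```python
import configparser
from typing import Dict


def read_env_section(text: str) -> Dict[str, str]:
    """Return the NAME=value pairs found in an [env] section, if any."""
    parser = configparser.ConfigParser(interpolation=None)
    # Keep variable names case-sensitive (configparser lowercases by default).
    parser.optionxform = str
    parser.read_string(text)
    if not parser.has_section("env"):
        return {}
    return dict(parser.items("env"))
```

A variable exported this way could then be tested in global.cylc with Jinja2, in the same style as the `CYLC_PROJECT` example earlier in the thread.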