cylc / cylc-flow

Cylc: a workflow engine for cycling systems.
https://cylc.github.io
GNU General Public License v3.0
335 stars 94 forks source link

Inconsistent contents of *-start.cylc #4755

Open dpmatthews opened 2 years ago

dpmatthews commented 2 years ago

If I have this workflow:

[scheduling]
    cycling mode = integer
    initial cycle point = 1
    hold after cycle point = 2
    stop after cycle point = 3
    final cycle point = 4
    [[dependencies]]
        P1 = "foo[-P1] => foo"
[runtime]
    [[foo]]
        script = echo Hello

and then run it using this command:

cylc play --icp=11 --holdcp=12 --stopcp=13 --fcp=14 test

then flow-config/flow-processed.cylc contains:

[scheduling]
    cycling mode = integer
    initial cycle point = 1
    hold after cycle point = 2
    stop after cycle point = 3
    final cycle point = 4
    [[dependencies]]
        P1 = "foo[-P1] => foo"
[runtime]
    [[foo]]
        script = echo Hello

whereas flow-config/*-start.cylc contains:

# cylc-version: 8.0rc2.dev
[scheduling]
    cycling mode = integer
    initial cycle point = 11
    hold after cycle point = 2
    stop after cycle point = 3
    final cycle point = 4
    [[graph]]
        P1 = foo[-P1] => foo
[runtime]
    [[root]]
    [[foo]]
        script = echo Hello

Related to this is the fact that cylc config accepts --icp as an option but none of the other cylc point settings which seems rather strange.

hjoliver commented 2 years ago

(It's somewhat unfortunate that we ever went down this CLI config override track).

wxtim commented 2 years ago

Analysis

flow-config/*-start.cylc is written at ~line 1150 of scheduler.py.

flow-config/flow-processed.cylc is written at around line 10 in cylc/flow/parsec/fileparse.py having been processed by read_and_proc, which does receive datetime options from the CLI.

Possible solutions

MetRonnie commented 2 years ago

The difference between initial cycle point and the other cycle point settings in *-start.cylc would appear to be due to this line: https://github.com/cylc/cylc-flow/blob/09581beb94250374215bea925f2bec985f6c13c2/cylc/flow/config.py#L273-L276

This is done before the dense config is loaded, so it modifies the sparse config. The sparse config is what gets dumped to *-start.cylc. No other cycle point option is overridden for the sparse config, they are only overridden for the dense config,

MetRonnie commented 2 years ago

Related to this is the fact that cylc config accepts --icp as an option but none of the other cycle point settings which seems rather strange.

That seems like the larger problem actually. If you have R1/$ = foo and don't specify flow.cylc[scheduling]final cycle point, then you can't use cylc config or cylc validate on the workflow

oliver-sanders commented 2 years ago

The flow-processed shows the config after pre-processing, the other file shows the config after loading by Cylc. This loading can modify the config in other ways too e.g:

https://github.com/cylc/cylc-flow/blob/b22fccd14c4eca02bcd3e8fe2e1b1b9446f33064/cylc/flow/config.py#L530-L531

https://github.com/cylc/cylc-flow/blob/b22fccd14c4eca02bcd3e8fe2e1b1b9446f33064/cylc/flow/config.py#L631-L632

https://github.com/cylc/cylc-flow/blob/b22fccd14c4eca02bcd3e8fe2e1b1b9446f33064/cylc/flow/config.py#L600

https://github.com/cylc/cylc-flow/blob/b22fccd14c4eca02bcd3e8fe2e1b1b9446f33064/cylc/flow/config.py#L559

https://github.com/cylc/cylc-flow/blob/b22fccd14c4eca02bcd3e8fe2e1b1b9446f33064/cylc/flow/config.py#L282

https://github.com/cylc/cylc-flow/blob/b22fccd14c4eca02bcd3e8fe2e1b1b9446f33064/cylc/flow/config.py#L285

https://github.com/cylc/cylc-flow/blob/b22fccd14c4eca02bcd3e8fe2e1b1b9446f33064/cylc/flow/config.py#L2129

https://github.com/cylc/cylc-flow/blob/b22fccd14c4eca02bcd3e8fe2e1b1b9446f33064/cylc/flow/config.py#L2246-L2248

https://github.com/cylc/cylc-flow/blob/b22fccd14c4eca02bcd3e8fe2e1b1b9446f33064/cylc/flow/config.py#L1267-L1271

https://github.com/cylc/cylc-flow/blob/b22fccd14c4eca02bcd3e8fe2e1b1b9446f33064/cylc/flow/config.py#L1290

https://github.com/cylc/cylc-flow/blob/b22fccd14c4eca02bcd3e8fe2e1b1b9446f33064/cylc/flow/config.py#L1146-L1147

https://github.com/cylc/cylc-flow/blob/b22fccd14c4eca02bcd3e8fe2e1b1b9446f33064/cylc/flow/config.py#L948

So the processed config has always been inconsistent. Is this a problem?

oliver-sanders commented 1 month ago

Would dumping the CLI options in a comment at the top of the file suffice?

hjoliver commented 3 weeks ago

That would be an improvement. Maybe also a comment along the lines of:

# note: command line options can override flow.cylc settings