metomi / rose

:rose: Rose is a toolkit for writing, editing and running application configurations.
https://metomi.github.io/rose/
GNU General Public License v3.0
55 stars 53 forks source link

Test that Cylc Rose workflow config retrieval respect compat mode #2779

Closed wxtim closed 4 months ago

wxtim commented 5 months ago

Tests for the combined effect of https://github.com/cylc/cylc-rose/pull/322 and https://github.com/cylc/cylc-flow/pull/6097 in closing https://github.com/cylc/cylc-rose/issues/319.

oliver-sanders commented 5 months ago

Re-tested against PRs merged into master branches, all good.

oliver-sanders commented 5 months ago

Small problem, although this test works as expected locally, the CI tests have passed before the upstream PRs were merged.

Presumably these tests are skipped in CI?

... Yes:

./t/rose-task-run/45-app-fcm-make-compat_mode.t .................... skipped: "[t]job-host" not defined or not available
wxtim commented 5 months ago

Small problem, although this test works as expected locally, the CI tests have passed before the upstream PRs were merged.

Presumably these tests are skipped in CI?

... Yes:

./t/rose-task-run/45-app-fcm-make-compat_mode.t .................... skipped: "[t]job-host" not defined or not available

Yes - they rely on having a remote platform setup.

oliver-sanders commented 5 months ago

I don't think that's right, fcm_make must load the workflow config in order to check whether there is a corresponding fcm_make2 task. If fcm_make2 is present, we would probably expect fcm_make to be local (it's pushing local data).

wxtim commented 5 months ago

I don't think that's right, fcm_make must load the workflow config in order to check whether there is a corresponding fcm_make2 task. If fcm_make2 is present, we would probably expect fcm_make to be local (it's pushing local data).

Yes, I've fixed the test. The problem was caused by my installing a PoC Yaml Config plugin which recent api changes in Cylc have broken.

oliver-sanders commented 5 months ago

(tests should pass once upstream PRs are merged)

oliver-sanders commented 5 months ago

What error are you getting (check the job.err file)?

oliver-sanders commented 5 months ago

Upstream PRs are in, so I would expect the tests to pass, but they have failed again, @wxtim could you take a look.

wxtim commented 5 months ago

Upstream PRs are in, so I would expect the tests to pass, but they have failed again, @wxtim could you take a look.

I don't think that's right, fcm_make must load the workflow config in order to check whether there is a corresponding fcm_make2 task. If fcm_make2 is present, we would probably expect fcm_make to be local (it's pushing local data).

FCM make is local, but apparently needs to be convinced that make 2 is not, even if make 2 is never called: As it was the fcm make 1 task was failing either side of the change.

Can demo that this works locally be commenting and uncommenting force_compat_mode=get_compat_mode(get_workflow_run_dir(workflow_id)) at around line 65 in cylc/rose/platform_utils.py

wxtim commented 5 months ago

I've changed this test: It now checks that the error logged by the fcm_make task refers to the platform (for fcm_make2) being garbage. Without the fix you instead get an error referring to the graph being wrong (which it is, unless you are in compat mode).

I've pruned away some deadwood caused by the test being a modified copy of t/rose-task-run/26-app-fcm-make-new-mode-with-cont which was obsfuscating some issues in review.

The code which was broken was the code where an fcm_make task looks for the platform where the fcm_make2 continuation task will be run. It doesn't matter whether we have that task in the graph, we just need it in the runtime for the rose to conclude it's a 2 part task and attempt to use the broken routine in Cylc Rose to look up the continuation platform.

oliver-sanders commented 4 months ago

Mac OS failures unrelated (latest run on master)