metomi / rose

:rose: Rose is a toolkit for writing, editing and running application configurations.
https://metomi.github.io/rose/
GNU General Public License v3.0
55 stars 53 forks source link

Git file install: support for file creation from Git repositories #2754

Closed benfitzpatrick closed 4 months ago

benfitzpatrick commented 9 months ago

Work in progress!

Aims to address #1419.

I'm not that happy with the configuration syntax, although I am happy with the three main elements of it as elements.

The mechanism itself uses filtering and either partial clones or proper sparse checkouts where the Git version allows.

wxtim commented 9 months ago

Let us know when you want us to have a review at this.

benfitzpatrick commented 9 months ago

Ready for review, not ready to go in IMO!

oliver-sanders commented 8 months ago

Ready for review, not ready to go in IMO!

@benfitzpatrick, what's left to do?

benfitzpatrick commented 8 months ago

I'd like to do a bit of in-anger user-case testing if we feel it's close enough as-is? Shouldn't take too long.

benfitzpatrick commented 7 months ago

I've done some testing against real cases I could find in our workflows, found and fixed some issues in afddb2b.

I tested a case with 6 GitHub file installs in a single rose-app.conf - we were worried about hitting a rate limit. It worked fine and looked like these happen sequentially, at least from the verbose output? Happy to demo offline.

Otherwise I'm happy

oliver-sanders commented 7 months ago

looked like these happen sequentially

Curious, I'll check that's because GitHub is serving the requests sequentially rather than Rose failing to run them in parallel.

benfitzpatrick commented 6 months ago

Little bump

oliver-sanders commented 6 months ago

looked like these happen sequentially

Curious, I'll check that's because GitHub is serving the requests sequentially rather than Rose failing to run them in parallel.

After a bit of digging I'm happy that these operations are being run concurrently, resulting in their subprocesses being run in parallel.

To track subprocesses, try the following diff:

diff --git a/metomi/rose/popen.py b/metomi/rose/popen.py
index 8993fb93..8ec00c94 100644
--- a/metomi/rose/popen.py
+++ b/metomi/rose/popen.py
@@ -372,6 +372,7 @@ class RosePopener:
                 proc = Popen(args[0], **kwargs)
             else:
                 command = ' '.join(map(shlex.quote, args))
+                print(f'# {command}')  # process started
                 proc = await asyncio.create_subprocess_shell(command, **kwargs)
         except OSError as exc:
             if exc.filename is None and args:
@@ -392,7 +393,9 @@ class RosePopener:
         if isinstance(kwargs.get("stdin"), str):
             stdin = kwargs.get("stdin")
         stdout, stderr = await proc.communicate(stdin)
+        command = ' '.join(map(shlex.quote, args))
         await proc.wait()
+        print(f'$ {command}')  # process returned
         return proc.returncode, stdout, stderr

     async def run_ok_async(self, *args, **kwargs):

Using this diff with the following config:

[command]
default=true

[file:cylc/scheduler.py]
source=git:git@github.com:cylc/cylc-flow.git::cylc/flow/scheduler.py::master

[file:cylc/task_proxy.py]
source=git:https://github.com/cylc/cylc-flow.git::cylc/flow/task_proxy.py::master

[file:cylc/async_util.py]
source=git:https://github.com/cylc/cylc-flow.git::cylc/flow/async_util.py::master

I get this output showing up to three subprocesses running simultaneously:

# git --git-dir=/var/tmp/tmpqqkj_tvk/.git init
# git --git-dir=/var/tmp/tmpd4c71k8x/.git init
# git --git-dir=/var/tmp/tmp50f2vgwd/.git init
$ git --git-dir=/var/tmp/tmpqqkj_tvk/.git init
# git --git-dir=/var/tmp/tmpqqkj_tvk/.git remote add origin https://github.com/cylc/cylc-flow.git
$ git --git-dir=/var/tmp/tmp50f2vgwd/.git init
# git --git-dir=/var/tmp/tmp50f2vgwd/.git remote add origin https://github.com/cylc/cylc-flow.git
$ git --git-dir=/var/tmp/tmpd4c71k8x/.git init
# git --git-dir=/var/tmp/tmpd4c71k8x/.git remote add origin git@github.com:cylc/cylc-flow.git
$ git --git-dir=/var/tmp/tmpqqkj_tvk/.git remote add origin https://github.com/cylc/cylc-flow.git
# git --git-dir=/var/tmp/tmpqqkj_tvk/.git fetch --depth=1 origin 4b70574c9100e1cf830a226c254cb3159a558873
$ git --git-dir=/var/tmp/tmp50f2vgwd/.git remote add origin https://github.com/cylc/cylc-flow.git
# git --git-dir=/var/tmp/tmp50f2vgwd/.git fetch --depth=1 origin 4b70574c9100e1cf830a226c254cb3159a558873
$ git --git-dir=/var/tmp/tmpd4c71k8x/.git remote add origin git@github.com:cylc/cylc-flow.git
# git --git-dir=/var/tmp/tmpd4c71k8x/.git fetch --depth=1 origin 4b70574c9100e1cf830a226c254cb3159a558873
$ git --git-dir=/var/tmp/tmpqqkj_tvk/.git fetch --depth=1 origin 4b70574c9100e1cf830a226c254cb3159a558873
# git --git-dir=/var/tmp/tmpqqkj_tvk/.git --work-tree=/var/tmp/tmpqqkj_tvk checkout 4b70574c9100e1cf830a226c254cb3159a558873
$ git --git-dir=/var/tmp/tmp50f2vgwd/.git fetch --depth=1 origin 4b70574c9100e1cf830a226c254cb3159a558873
# git --git-dir=/var/tmp/tmp50f2vgwd/.git --work-tree=/var/tmp/tmp50f2vgwd checkout 4b70574c9100e1cf830a226c254cb3159a558873
$ git --git-dir=/var/tmp/tmp50f2vgwd/.git --work-tree=/var/tmp/tmp50f2vgwd checkout 4b70574c9100e1cf830a226c254cb3159a558873
# rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmp50f2vgwd/cylc/flow/async_util.py /var/tmp/tmph9lc16hm/f13f5dacdb0f632826821bf9278a10fc
$ git --git-dir=/var/tmp/tmpqqkj_tvk/.git --work-tree=/var/tmp/tmpqqkj_tvk checkout 4b70574c9100e1cf830a226c254cb3159a558873
# rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmpqqkj_tvk/cylc/flow/task_proxy.py /var/tmp/tmph9lc16hm/759c658589e37a9af232bfa8e03d01b0
$ rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmpqqkj_tvk/cylc/flow/task_proxy.py /var/tmp/tmph9lc16hm/759c658589e37a9af232bfa8e03d01b0
$ rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmp50f2vgwd/cylc/flow/async_util.py /var/tmp/tmph9lc16hm/f13f5dacdb0f632826821bf9278a10fc
$ git --git-dir=/var/tmp/tmpd4c71k8x/.git fetch --depth=1 origin 4b70574c9100e1cf830a226c254cb3159a558873
# git --git-dir=/var/tmp/tmpd4c71k8x/.git --work-tree=/var/tmp/tmpd4c71k8x checkout 4b70574c9100e1cf830a226c254cb3159a558873
$ git --git-dir=/var/tmp/tmpd4c71k8x/.git --work-tree=/var/tmp/tmpd4c71k8x checkout 4b70574c9100e1cf830a226c254cb3159a558873
# rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmpd4c71k8x/cylc/flow/scheduler.py /var/tmp/tmph9lc16hm/8a367e85378d4334d5133fa065d71116
$ rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmpd4c71k8x/cylc/flow/scheduler.py /var/tmp/tmph9lc16hm/8a367e85378d4334d5133fa065d71116

You can also test it like this:

diff --git a/metomi/rose/popen.py b/metomi/rose/popen.py
index 8993fb93..35bdbe27 100644
--- a/metomi/rose/popen.py
+++ b/metomi/rose/popen.py
@@ -371,7 +371,8 @@ class RosePopener:
             if kwargs.get("shell"):
                 proc = Popen(args[0], **kwargs)
             else:
-                command = ' '.join(map(shlex.quote, args))
+                # command = ' '.join(map(shlex.quote, args))
+                command = 'sleep 5'
                 proc = await asyncio.create_subprocess_shell(command, **kwargs)
         except OSError as exc:
             if exc.filename is None and args:

The time taken should increase linearly as the sleep is increased, but only see a small increase (parsing overheads) when more files are configured for installation.

benfitzpatrick commented 6 months ago

Curious, I'll check that's because GitHub is serving the requests sequentially rather than Rose failing to run them in parallel.

After a bit of digging I'm happy that these operations are being run concurrently, resulting in their subprocesses being run in parallel.

Ah, I know why I thought they were sequential - the parsing is serial, the actual pulling can be concurrent. Sorry-thanks!

wxtim commented 5 months ago

I've opened #2784 - it looks like the bug I found is unrelated to this work.

oliver-sanders commented 4 months ago

Whilst working on #2785, I've noticed that "cannot connect to" type errors appear to surface with the following message:

ConfigProcessError: file:README.md=source=git:https@github.com:metomi/rose::README.md::master: ls-remote: could not find ref 'master' in 'git@github.com:metomi/rose'

Where the underlying cause here is that GitHub SSH access has not been configured.

benfitzpatrick commented 4 months ago

I could do a "or remote not contactable?" on the end of the error message?

oliver-sanders commented 4 months ago

If it's possible to differentiate between these cases, great, but if not that'll be fine.

oliver-sanders commented 4 months ago

Waiting for tests to pass again, then ready to merge.

oliver-sanders commented 4 months ago

Good news, the tests pass.

Bad news, if some of the tests skip, then the test fails, e.g:

diff --git a/t/rose-app-run/28-git.t b/t/rose-app-run/28-git.t
index 1b4eaaa90..92a411611 100644
--- a/t/rose-app-run/28-git.t
+++ b/t/rose-app-run/28-git.t
@@ -72,12 +72,12 @@ remote_locations=("$HOSTNAME:$TEST_DIR/hellorepo/" "http://localhost:$GIT_WS_POR
for i in 0 1 2; do
     remote_mode="${remote_test_modes[$i]}"
     remote="${remote_locations[$i]}"
-    if [[ "$remote_mode" == "ssh" ]] && ! ssh -n -q -oBatchMode=yes $HOSTNAME true 1>'/dev/null' 2>/dev/null; then
+    if true; then
         skip 14 "cannot ssh to localhost $HOSTNAME"
         echo "Skip $remote" >/dev/tty        
         continue
     fi
-    if [[ "$remote_mode" == "http" ]] && ! curl --head --silent --fail $remote >/dev/null 2>&1; then
+    if true; then
         skip 14 "failed to launch http on localhost"
         echo "Skip $remote" >/dev/tty        
         continue

:(

[FAIL] ../config: rose-app.conf not found.
Dubious, test returned 1 (wstat 256, 0x100)
Failed 15/57 subtests 
    (less 42 skipped subtests: 0 okay)
Test Summary Report
-------------------
t/rose-app-run/28-git.t (Wstat: 256 Tests: 42 Failed: 0)
  Non-zero exit status: 1
  Parse errors: Bad plan.  You planned 57 tests but ran 42.

Not a release blocker. Suggest bumping this to future work.

~Opened: https://github.com/metomi/rose/issues/2789~

oliver-sanders commented 4 months ago

@MetRonnie, could you check the last couple of commits and merge if you're happy with the tests being pushed to a bugfix release milestone.

oliver-sanders commented 4 months ago

Brill, that's fixed it (will close issue). Tested with all possible skip combinations, test passed each time.

oliver-sanders commented 4 months ago

(Mac OS tests failing due to gh-actions DNS issues, safe to ignore)

oliver-sanders commented 4 months ago

(the test fixes would have been on a bugfix release milestone, however, there is no need for a post-fix, Ben has sorted it on this PR)