pyinvoke / invoke

Pythonic task management & command execution.
http://pyinvoke.org
BSD 2-Clause "Simplified" License

Easy execution of tasks from within Python itself #170

Open bitprophet opened 10 years ago

bitprophet commented 10 years ago

Until now the focus has been mostly on CLI execution, but calling programmatically has always been on the table - couldn't find a good ticket so this must be it :) (#112 is sort of a feature request for this but it's more of a support thread, I'll roll it into here instead.)

Currently there are no great options for calling a task from another task:

Rambling thoughts about how to smooth over the above problems:

Brainstorm of use cases/scenarios:

sophacles commented 10 years ago

Not exactly sure if this is 100% related, but - I have a handful of task collections bundled by functionality. I prefer not to do inv -c ${long_path_here} --args.... So I created a setup.py script, with a bunch of entry point script directives: build, prod, dev, etc. which all point to the various function names in this file https://gist.github.com/sophacles/c17ce33a14d582dfa268.

It would be nice to have a way of doing this without the hackery.

bitprophet commented 10 years ago

@sophacles Is there a reason the namespacing features don't cut it for that use case? The entire point of that API is to make it pretty easy to throw around Python modules/Collection objects, import them from wherever, etc; then make them into a task tree.

There's more exploration to do in that space yet (eg selecting a specific subtree at runtime - which would also help your use case if your problem is you don't want to expose the whole tree at once) but it works reasonably well already.

sophacles commented 10 years ago

Really it comes down to environment building. The exact gist I linked is slightly truncated. I have like 10 utilities built out of invoke, each of which has something like 5-20 targets (some overlapping at the deepest layers of the tree, but each utility has a specific cognitive task focus -- build, test, deploy generic, deploy specific environments, debug tools, demo setup, and so on).

I could probably train myself always to do:

inv -c /path/to/source/tree/utils/$TASK ....

But I kind of like having them:

  1. as a script available in my $PATH
  2. as tools named cognitively, just like when I'm using busybox... I just link (e.g.) ls and cp rather than calling busybox ls ... or busybox cp file1 file2
  3. in a consistent place (depending on what I'm doing, the source tree where the task files live can vary. It's in one place on my dev machine, and different places in deployment depending on what OS, nfs details, and so on. We're still working out our "one true deployment strategy" so this changes as stuff comes up).

Also noteworthy, several of my targets are most commonly called from places in the system that are not the directory where invoke lives. For example, it's nice having the build scripts called from the cwd when I'm debugging. Another example is having the deploy scripts callable from wherever I happen to be in the fs when I figure out my error. (In the latter case, I'm often developing my tasks in an editor on my Mac, with the system I'm testing deploys on in a VirtualBox VM that has the source tree mounted, or on a remote VM with the source tree mounted in both places via NFS).

bitprophet commented 10 years ago

Somehow I neglected to click your gist link before. Durr. I see what you're doing there.

I think that's closely related to but not 100% the same as this ticket, insofar as I was intending this to be "I have some stuff at the Python level and want to call a task as a function or similar" and yours is "I want to hijack or duplicate the normal parser -> load collection -> execute task(s) process".

Definitely agree that your use case should be supported (making the CLI guts reusable has been a huge weak spot in Fab 1 and I want to avoid that mistake here). Reckon we could make a spinoff ticket, link your gist, and then I can shit out some thoughts based on existing API and what needed changes might be. For now I'll just make sure it's a separate use case up top.

sophacles commented 10 years ago

OK - made a ticket for the functionality I was describing at #173

sophacles commented 10 years ago

As for the rest of this ticket - I know invoke is intended to be part of fab2, so perhaps (unless they exist and I missed them) an explanation of how it's intended for fab2 to utilize invoke would be a good starting point. That way there are practical considerations and actual use cases to generalize from. It certainly helps avoid "architecture astronauting". It also could help avoid some bad cases of fab2 relying on invoke relying on fab2.

bitprophet commented 10 years ago

Yes, one of the reasons I have been poking Invoke lately is I'm attempting to get an actual Fab 2 ready for people to hammer on. And as you implied, every time I need to take one step forwards with Fab 2 work, it results in me needing to come back here and add or change stuff :) so it's a very good way to force this sort of reusability!

bitprophet commented 10 years ago

Been having another discussion with @sophacles on IRC about this (or at least, strongly related things).

His core use case in that discussion is (sanitized a tiny bit):

Originally, I was assuming the best approach here is the "pure" approach: if task A calls task B, task B gets a context 100% identical to if it had been called as a top level task. If task A wants to change B's behavior it should just use regular function arguments.

Erich's use case seems to be that he wants to call task B with a "hybrid" context containing elements from task A's namespace, so that the A namespace can change some of those default settings affecting B's behavior. I.e. say the B module defines uwsgi.root as "/", and the A module wants any calls it makes to tasks in B, to act like uwsgi.root was "/otherthing". In this setup, A would have {'uwsgi': {'root': '/otherthing'}} in its configuration dict, and its calls to B would merge that with B's defaults, overriding them.

I'm not entirely for or against this at the moment but it is not what I'd originally assumed to be the sensible default.
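
For illustration only, the "hybrid" context described above could be sketched as a recursive dict merge in plain Python. This is not Invoke's actual config machinery; `merge_config` is a hypothetical helper, and the `workers` key is an invented setting added just to show that untouched defaults survive the merge.

```python
import copy


def merge_config(defaults, overrides):
    """Return a new dict: `defaults` with `overrides` merged on top (hypothetical helper)."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested dicts: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Otherwise the caller's value simply wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Defaults defined by the B module ("workers" is an invented extra setting).
b_defaults = {"uwsgi": {"root": "/", "workers": 4}}
# Configuration carried by the A module.
a_config = {"uwsgi": {"root": "/otherthing"}}

# What a task in B would see when called from A under the hybrid approach.
hybrid = merge_config(b_defaults, a_config)
print(hybrid)
```

Under the "pure" approach, by contrast, B would simply see `b_defaults` unchanged and A would have to pass `root` as a regular function argument.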

sophacles commented 10 years ago

I deleted my previous two comments as I was still not explaining it sufficiently for my brain to shut up about it. I think this actually explains it correctly:

First off, I view the context as similar to the OS's environment[1]. In an OS, it is up to the spawning process (the caller) to set up the environment in which the child process (callee) runs. In the OS, if the callee manipulates the environment, and in turn spawns another child (callee_1), then callee_1 is beholden to whatever manipulations callee set up, even if they differ from caller's settings.

Each child process should do something like this:

if [ -z "$VARNAME" ] ; then
   export VARNAME="defaultval"
fi

By analogy, the context resulting from a task doing manipulation is the context a called task (a callee) should hold. It will need to do this (or its logical equivalent):

if "varname" not in ctx:
    ctx.update({'varname':'defaultval'})

Second, is the role of Collection in all this. There is all the stuff about collections building containers, including setting container defaults. Using the OS analogy, I view it as similar to creating a wrapper script[2]:

# wrap all the things
# (export is needed so the wrapped command actually sees the defaults)
if [ -z "$VAR1" ] ; then
    export VAR1="default_val1"
fi

if [ -z "$VAR2" ] ; then
    export VAR2="default_val2"
fi

exec "$@"

When called from the command line, inv collection_foo.task_bar works (conceptually) the same as sh wrapper.sh wrappee. That is to say, if there wasn't already context setting the relevant variables (by say a config file loaded by inv), then the defaults are put into the context.

None of this is of course the "controversial" stuff... wrappers in wrappers, or collections in collections, are the same, and everyone agrees that is good :).
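
A rough Python rendering of the wrapper-script analogy, assuming the context behaves like a plain dict (a real Invoke context is richer than that). `with_defaults` is a hypothetical decorator, not something Invoke provides; it fills in defaults only for keys the caller has not already set, the same way wrapper.sh only sets variables that are empty.

```python
import functools


def with_defaults(defaults):
    """Hypothetical decorator: apply defaults to the context before the task runs."""
    def decorator(task_func):
        @functools.wraps(task_func)
        def wrapper(ctx, *args, **kwargs):
            for key, value in defaults.items():
                # Like `if [ -z "$VAR" ]`: only fill in what is missing.
                ctx.setdefault(key, value)
            return task_func(ctx, *args, **kwargs)
        return wrapper
    return decorator


@with_defaults({"var1": "default_val1", "var2": "default_val2"})
def task_bar(ctx):
    # Stand-in for the wrapped command: report what it ended up seeing.
    return "{} > {}".format(ctx["var1"], ctx["var2"])


print(task_bar({}))                       # nothing preset, defaults apply
print(task_bar({"var1": "caller_text"}))  # caller's value is kept
```

Because the wrapper mutates the context it was handed, anything the task calls afterwards inherits the same values, which mirrors how a child process inherits its parent's environment.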

So, going on to the thing discussed in IRC, and outlined quite well by @bitprophet above, I need to set up a couple definitions:

In code that is:

from some_other_taskfile import a_task, a_collection

# in this case, a_task is being called as  a bare task
@task
def call_bare(ctx):
    a_task(ctx)

# in this case a_task is being called as a collection task
@task
def call_collection(ctx):
    a_collection['a_task'](ctx)

In the OS/shell analogy, let's say we have wrapper.sh as defined above, and wrappee.sh defined as:

# This is trivial I know, but I didn't want to get lost in a complicated shell script
echo $VAR1 > $VAR2

If I just call sh wrapper.sh sh wrappee.sh, I end up with a file called default_val2 and its contents are default_val1\n.

Now say I choose to build another tool, taking advantage of wrappee.sh. If I consider wrappee.sh the same as a bare task, I need to make a caller.sh like so:

export VAR1='caller_text'

# the default has to be repeated here
if [ -z "$VAR2" ] ; then
    export VAR2="default_val2"
fi

exec sh wrappee.sh

If I don't define VAR2 in caller.sh, wrappee.sh fails with a shell error, because the variable used as the redirect target is undefined. This is expected, but it is generally bad programming: the default declaration is repeated, and it now needs to be updated in two different places, for different use cases of some core component. I'd much prefer to do this for caller.sh:

export VAR1='caller_text'

exec sh wrapper.sh sh wrappee.sh

and make use of the tools and defaults I've already built. This second caller.sh is analogous to a collection task in my mind.

N.B.: although my examples above are in shell, they are really about operating system semantics.

Finally: there is an argument that the stuff above should just be kwargs to the task. However, I tend to look at my tasks as useful independent units (possibly unified in a collection via a combination of grouping by concept, a la directories, and sane defaults, a la a wrapper script). Just like with programs, there are some things that are useful as command line arguments, and some things that make sense as part of the environment (usually a decision about what child processes or groups of processes will do).

The tl;dr of this, is that:

collection_foo['task_bar'](context)

should act like:

$ VAR=DEFINITION sh wrapper_foo.sh sh script_bar.sh
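
To make the distinction between a bare task and a collection task concrete, here is a minimal sketch under heavy assumptions: contexts are plain dicts, and `DefaultingCollection` is a hypothetical stand-in, not Invoke's `Collection` class. Looking a task up through the collection returns a callable that layers the collection's defaults underneath whatever the caller's context already holds.

```python
class DefaultingCollection:
    """Hypothetical stand-in for a collection that carries default settings."""

    def __init__(self, defaults, **tasks):
        self.defaults = dict(defaults)
        self.tasks = dict(tasks)

    def __getitem__(self, name):
        task_func = self.tasks[name]

        def collection_task(ctx, *args, **kwargs):
            # Start from the collection defaults, then let the caller's
            # context override them (like VAR=DEFINITION before the wrapper).
            merged = dict(self.defaults)
            merged.update(ctx)
            return task_func(merged, *args, **kwargs)

        return collection_task


def task_bar(ctx):
    # Stand-in for script_bar.sh: just report the value it sees.
    return ctx.get("var")


collection_foo = DefaultingCollection({"var": "default"}, task_bar=task_bar)

bare = task_bar({})                                              # no defaults applied
via_default = collection_foo["task_bar"]({})                     # collection default
via_caller = collection_foo["task_bar"]({"var": "DEFINITION"})   # caller wins
print(bare, via_default, via_caller)
```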

[1] I am deriving this influence from the fact that invoke is fundamentally a way to write better shell scripts. Better being subjective of course, but the goal is to use a real programming language to handle all the stuff that is icky in shell - such as building parameter lists, path manipulation, having command arguments, and making decisions (if, loop, et al) - before actually spawning some other command.

[2] or using run-parts, or using functions or sourcing other scripts, but they work out very similarly

sophacles commented 10 years ago

In a similar vein, using terminology from above:

bitprophet commented 10 years ago

Some semi disjointed replies after rereading most of the above:

jhermann commented 9 years ago

NOT having read the long thread, I'm still throwing #223 in the ring. :smile:

frol commented 9 years ago

Just in case somebody needs a workaround to execute tasks from other tasks right now, I implemented a really simple way to do so in my tasks by passing a root namespace and a helper function into context:

from invoke import ctask, Collection
from invoke.executor import Executor

# Define tasks
# ============

@ctask
def test_task1(context, arg1):
    print("TASK1")

@ctask
def test_task2(context):
    print("TASK2 BEGIN")
    context.invoke_execute(
         context,
         'test_task1',
         arg1='value1'
    )
    print("TASK2 END")

# Define and configure root namespace
# ===================================

# NOTE: `namespace` or `ns` name is required!
namespace = Collection(
    test_task1,
    test_task2,
    # put tasks or nested collections here
)

def invoke_execute(context, command_name, **kwargs):
    """
    Helper function to make invoke-tasks execution easier.
    """
    results = Executor(namespace, config=context.config).execute((command_name, kwargs))
    target_task = context.root_namespace[command_name]
    return results[target_task]

namespace.configure({
    'root_namespace': namespace,
    'invoke_execute': invoke_execute,
})

pdonorio commented 7 years ago

Sorry for bothering. This issue is 3 years old, I was wondering if we've reached better ways now to call tasks from other tasks. Thanks!

bitprophet commented 7 years ago

No, but it remains at or near the top of the priority list; it's definitely something that needs solving before 1.0.

TimotheeJeannin commented 6 years ago

Any updates on this ?

pdonorio commented 6 years ago

it's definitely something that needs solving before 1.0.

@bitprophet I see that 1.0 was released :)

muhammedabad commented 6 years ago

With the release of 1.0, is the ability to call tasks from within tasks (similar to v1's execute) available ?

EDIT: If the answer to the original question is yes, please also advise if the current task context object can be passed through to other called tasks.

bitprophet commented 6 years ago

Re: @muhammedabad's 2nd question, see #261 - it needs work still!

Re: this ticket and 1.0: hey, running these projects isn't a cakewalk! ;) I judged it was better to get this & related projects above-board and on semver, than to continually punt until it was perfect.

My gut says we can definitely implement this in a backwards compatible manner, since it's likely to end up implemented as a new method on Context (plus related changes to things like Executor). So it should appear in some 1.x feature release!

SamuelMarks commented 6 years ago

Any updates?

I've written a set of configuration management tools around Fabric (and Apache Libcloud).

But can't upgrade it to be Python 2.7-3+ compatible until I upgrade Fabric. But that requires execute at a minimum.

Roadmap?

jmsuzuki commented 6 years ago

Any updates?

I've written a set of configuration management tools around Fabric (and Apache Libcloud).

But can't upgrade it to be Python 2.7-3+ compatible until I upgrade Fabric. But that requires execute at a minimum.

Roadmap?

I'm in the same boat. I need execute to migrate from python 2.7 to 3+

rectalogic commented 5 years ago

Just in case somebody needs a workaround to execute tasks from another tasks right now, I implemented a really simple way to do so in my tasks by passing a root namespace and a helper function into context:

To get this to work with fabric, and honor the executed task's host list, I had to import program from fabric.main and pass program.core to the Executor:

from invoke import Collection
from fabric import task
from fabric.executor import Executor
from fabric.main import program

# Define tasks
# ============

@task(hosts=["localhost"])
def test_task1(context, arg1):
    print("TASK1", context)

@task
def test_task2(context):
    print("TASK2 BEGIN", context)
    context.invoke_execute(
         context,
         'test_task1',
         arg1='value1'
    )
    print("TASK2 END")

# Define and configure root namespace
# ===================================

# NOTE: `namespace` or `ns` name is required!
namespace = Collection(
    test_task1,
    test_task2,
    # put tasks or nested collections here
)

def invoke_execute(context, command_name, **kwargs):
    """
    Helper function to make invoke-tasks execution easier.
    """
    results = Executor(namespace, config=context.config, core=program.core).execute((command_name, kwargs))
    target_task = context.root_namespace[command_name]
    return results[target_task]

namespace.configure({
    'root_namespace': namespace,
    'invoke_execute': invoke_execute,
})

$ fab test-task2
('TASK2 BEGIN', <Context: <Config: {'root_namespace': <Collection None: test-task1, test-task2>, 'tasks': {'search_root': None, 'collection_name': 'fabfile', 'dedupe': True, 'auto_dash_names': True}, 'run': {'shell': '/bin/bash', 'hide': None, 'pty': False, 'encoding': None, 'in_stream': None, 'replace_env': True, 'echo': False, 'warn': False, 'echo_stdin': None, 'watchers': [], 'env': {}, 'out_stream': None, 'err_stream': None, 'fallback': True}, 'timeouts': {'connect': None}, 'sudo': {'password': None, 'prompt': '[sudo] password: ', 'user': None}, 'inline_ssh_env': False, 'port': 22, 'load_ssh_configs': True, 'user': 'cureatr', 'ssh_config_path': None, 'invoke_execute': <function invoke_execute at 0x7f1993b41f50>, 'connect_kwargs': {}, 'forward_agent': False, 'gateway': None, 'runners': {'remote': <class 'fabric.runners.Remote'>, 'local': <class 'invoke.runners.Local'>}}>>)
('TASK1', <Connection host=localhost>)
TASK2 END

breisig commented 5 years ago

Any update on this?

SamuelMarks commented 5 years ago

I've switched to the Python 3 compatible fab-classic in the meantime.

SamuelMarks commented 4 years ago

Another year has gone by… any update?

christian-intra2net commented 3 years ago

Also greatly missing this feature. I have some subtasks that I would like to either call directly from the command line or from other tasks.

jcw commented 3 years ago

Would this feature also help with the following scenario I'm after?

So the question is whether invoke can be made to repeatedly re-launch a task. Or is there some other way? (with apologies if I'm mis-reading the gist of this issue)

SamuelMarks commented 3 years ago

@jcw - I think you want https://github.com/gorakhargosh/watchdog

SamuelMarks commented 2 years ago

Another year has gone by… any update?