pydoit / doit

CLI task management & automation tool
http://pydoit.org
MIT License
1.84k stars 175 forks source link

CmdAction calls callable more than once #437

Open oliverdain opened 2 years ago

oliverdain commented 2 years ago

Describe the bug

Given a simple dodo.py like this:

from doit.action import CmdAction

def action_thing():
    print('I was called.')
    return 'echo hello world!'

def task_testing():
    return {
        'actions': [
            CmdAction(action_thing)
        ],
        'verbosity': 2,
    }

You can see that action_thing gets called twice (the print line is run twice). The root cause appears to be that CmdAction.action is defined as:

    @property
    def action(self):
        if isinstance(self._action, (str, list)):
            return self._action
        else:
            # action can be a callable that returns a string command
            ref, args, kw = normalize_callable(self._action)
            kwargs = self._prepare_kwargs(self.task, ref, args, kw)
            return ref(*args, **kwargs)

so a self.action call actually runs the callable. And self.action is called in several places. For example, line 283 is:

if isinstance(self.action, list):

This can be problematic in some circumstances. For example, I have a callable that makes a REST call to get a version number and then uses that to do some other things. Obviously I don't want to make that call twice.

Environment

  1. OS: Linux (Debian 11)
  2. python version: 3.9
  3. doit version: 0.36.0

suggestion: call the callable once and cache the result. A @functools.cache might do the trick.

Fund with Polar

oliverdain commented 2 years ago

I just tried to add a @functools.cache to the callable I was passing to CmdAction but that doesn't work. That's because doit forks - you can see that it's still called twice but with each call the PID is different. That's going make this harder to fix.

jgrey4296 commented 1 year ago

For my own action subclass to mitigate that, I just call self.action first, and then handle that result as a named variable. It won't stop multiple processes calling the function, but it does remove the basic double calling.

    def expand_action(self):
        """Expand action using task meta informations if action is a string.
        Convert `Path` elements to `str` if action is a list.
        @returns: string -> expanded string if action is a string
                    list - string -> expanded list of command elements
        """
        action_prepped = self.action

        if not self.task:
            return action_prepped

        if isinstance(action_prepped, list):
            # cant expand keywords if action is a list of strings
            action = []
            for element in action_prepped:
                if isinstance(element, str):
                    action.append(element)
                elif isinstance(element, PurePath):
                    action.append(str(element))
                else:
                    msg = ("%s. CmdAction element must be a str "
                            "or Path from pathlib. Got '%r' (%s)")
                    raise InvalidTask(msg % (self.task.name, element, type(element)))
            return action

        subs_dict = {
            'targets'      : " ".join(self.task.targets),
            'dependencies' : " ".join(self.task.file_dep),
        }

        # dep_changed is set on get_status()
        # Some commands (like `clean` also uses expand_args but do not
        # uses get_status, so `changed` is not available.
        if self.task.dep_changed is not None:
            subs_dict['changed'] = " ".join(self.task.dep_changed)

        # task option parameters
        subs_dict.update(self.task.options)
        # convert positional parameters from list space-separated string
        if self.task.pos_arg:
            if self.task.pos_arg_val:
                pos_val = ' '.join(self.task.pos_arg_val)
            else:
                pos_val = ''
            subs_dict[self.task.pos_arg] = pos_val

        if self.STRING_FORMAT == 'old':
            return action_prepped % subs_dict
        elif self.STRING_FORMAT == 'new':
            return action_prepped.format(**subs_dict)
        else:
            assert self.STRING_FORMAT == 'both'
            return action_prepped.format(**subs_dict) % subs_dict

https://github.com/jgrey4296/doot/blob/master/doot/utils/TaskExt.py

frozax commented 1 year ago

Just stumbled onto this issue as well. I cached the call myself but it would be better and more logical if the callable wasn't called multiple times in my opinion.

    def _callable():
        # doing many stuff that should not be called twice
        return the_computed_cmd_line

    # used to avoid running twice in _callable
    cmd_line = None
    def _get_cmd_line():
        nonlocal cmd_line
        if not cmd_line:
            cmd_line = _callable()
        return cmd_line

    task["actions"] = [CmdAction(_get_cmd_line)] # instead of using _callable directly
pg42819 commented 2 months ago

Definitely a bug, though, right? 2 years later - is there any plan to fix this?