google / wireit

Wireit upgrades your npm/pnpm/yarn scripts to make them smarter and more efficient.
Apache License 2.0
6.04k stars 106 forks source link

Clarification of execution order for pre,main,post workflows #1188

Open TyIsI opened 1 month ago

TyIsI commented 1 month ago

As I understand from sources, comments and issues, if I wanted to replace a pre,main,post workflow through a simple placeholder script, I would need to set WIREIT_PARALLEL=1 and have each scripting dependency declare the previous step as a dependency?

Concrete example:

{
  ...
  "scripts": {
    ...
    "predb:update": "pnpm run db:pull",
    "db:update": "pnpm run db:generate",
    "postdb:update": "pnpm run db:post-processing"
    ...
  }
  ...
}

By themselves I could migrate them and I think the appropriate way to convert those would look something like this. But it feels a little bit counter-intuitive?

{
    ...
    "wireit": {
        "db:update:step1": {
            "command": "db:pull"
        },
        "db:update:step2": {
            "dependencies": [
                "db:update:step1"
            ],
            "command": "db:generate"
        },
        "db:update:step3": {
            "dependencies": [
                "db:update:step2"
            ],
            "command": "db:dump"
        },
        "db:update": {
            "dependencies": [
                "db:update:step1",
                "db:update:step2",
                "db:update:step3"
            ]
        }
    }
    ...
}

Is there a better/preferred way to build "meta" scripts like this? Or would a simple entry with just the linked tasks as dependencies work and would they be executed in the correct order based on WIREIT_PARALLEL=1?

Thanks!

EDITED optimized example workflow

aomarks commented 1 month ago

I think you want something like this:

{
  "scripts": {
    "db:update": "wireit",
    "db:generate": "wireit",
    "db:pull": "wireit"
  },
  "wireit": {
    "db:update": {
      "command": "..."
      "dependencies": [
        "db:generate"
      ]
    },
    "db:generate": {
      "command": "..."
      "dependencies": [
        "db:pull"
      ]
    },
    "db:pull": {
      "command": "..."
    }
  }
}

You don't need to modify WIREIT_PARALLEL, because wireit will ensure correct execution order based on your dependencies. In this case, when you run npm run db:update, the order that the commands run will always be db:pull, db:generate, db:update no matter what WIREIT_PARALLEL is set to.

TyIsI commented 1 month ago

Sorry, my bad. I forgot to clarify that I don't always want db:generate to run db:pull. I may want to run db:generate on its own without db:pull. Or as part of a multi step workflow.

Is this something that can be isolated with wireit?

aomarks commented 1 month ago

Ah I see, yeah I missed that detail.

I think for this use-case I would actually just use shell chaining for your macro. Something like this:

{
  "scripts": {
    "db:update": "wireit",
    "db:pull": "wireit",
    "db:generate": "wireit",
    "db:dump": "wireit"
  },
  "wireit": {
    "db:update": {
      "command": "npm run db:pull && npm run db:generate && npm run db:dump"
    },
    "db:pull": {
      "command": "..."
    },
    "db:generate": {
      "command": "..."
    },
    "db:dump": {
      "command": "..."
    }
  }
}
aomarks commented 1 month ago

We could maybe add a feature like that has the same behavior as above, but more concise and without the overhead of those extra processes that each npm run brings. Maybe something like this?:

{
  "scripts": {
    "db:update": "wireit",
    "db:pull": "wireit",
    "db:generate": "wireit",
    "db:dump": "wireit"
  },
  "wireit": {
    "db:update": {
      "sequence": ["db:pull", "db:generate", "db:dump"]
    },
    "db:pull": {
      "command": "..."
    },
    "db:generate": {
      "command": "..."
    },
    "db:dump": {
      "command": "..."
    }
  }
}

Doesn't seem crazy, though I'd like to think a bit more about whether there is some better way to express that.

TyIsI commented 1 month ago

I was kinda thinking about that. Ideally, for something like this, I think it would be amazing to express "downstream" execution dependencies.

Could a sequence flag be a more efficient solution to cut through it? As in, randomization is default unless disabled explicitly? Which would imply sequence mode.

TyIsI commented 1 month ago

Though, now that I've had a chance to look at the code, that would also require having to deal with not just circumventing the shuffle functionality, but also to use a different strategy for iterating over the dependencies, to prevent race conditions from Promise.all.

TyIsI commented 1 month ago

Thanks so far!

TyIsI commented 3 weeks ago

@aomarks I think I may have a rough idea/solution. I'll look into this and if it's viable, I'm willing to commit implementing it.

In short, the current workflow for dependencies stays the same, but if a dependency entry is a string array, each element is then executed sequentially.

Example:

{
  "scripts": {
    "db:update": "wireit",
    "db:pull": "wireit",
    "db:generate": "wireit",
    "db:dump": "wireit"
  },
  "wireit": {
    "db:update": {
      "dependencies": [["db:pull", "db:generate", "db:dump"]]
    },
    "db:pull": {
      "command": "..."
    },
    "db:generate": {
      "command": "..."
    },
    "db:dump": {
      "command": "..."
    }
  }
}

Reflection: not sure how/if that could would work with the current dependency graphs, but I'm imagining that coercing every "dependencies" config entry into an string[][] object and using uniform sequential execution for the second level would alleviate most of the concerns. And I think a bonus would be that this could allow specific dependency workflows that could be executed in parallel. (Think parallel builds with specific execution order requirements.)

Let me know if this sounds like it might be a viable solution and I'll look into this.

(I'm not sure how wireit does dependency lookup resolution, in terms of whether these hard linked dependencies or if these are vectors that can be looked up. But I'll take a look and see.)