Paradigm to update the scaffold files created by `init` or `initialize`: command for `init --update` or similar

meltano / edk

Meltano extension development kit

https://edk.meltano.com

Apache License 2.0

Paradigm to update the scaffold files created by `init` or `initialize`: command for `init --update` or similar #65

Open aaronsteers opened 1 year ago

aaronsteers commented 1 year ago

Two considerations here we'd want to solve for in a generic way if possible:

Decision 1. What logic to use when choosing whether to replace, ignore, or merge changes for files that already exist in their destinations.

Parity with files bundles would simply be the ability to specify an array of files to update. Those files specified would overwrite any existing files.

No smart merge logic exists as of today, but something like copier is interesting in how it attempts to apply a mergeable diff patch. (Probably more scope than is feasible in a first iteration.)
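As a rough illustration of the "parity with files bundles" option above, here is a minimal sketch using only the Python standard library. The function name `sync_files`, its parameters, and the action labels are hypothetical and not part of the EDK; the sketch assumes the bundled files are available as a mapping of relative path to text content. Missing files are created, existing files are overwritten only when they appear in the `update` list, and everything else is left alone.

```python
from pathlib import Path
from typing import Dict, Iterable


def sync_files(
    bundled: Dict[str, str],
    dest: Path,
    update: Iterable[str] = (),
) -> Dict[str, str]:
    """Write bundled files under dest and report the action taken per file.

    Hypothetical helper, not an EDK API. Existing files are overwritten
    only when their relative path is listed in `update`.
    """
    to_update = set(update)
    actions: Dict[str, str] = {}
    for rel_path, content in bundled.items():
        target = dest / rel_path
        if not target.exists():
            action = "created"
        elif rel_path in to_update:
            action = "overwritten"
        else:
            # File exists and was not explicitly listed: leave it untouched.
            actions[rel_path] = "skipped"
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
        actions[rel_path] = action
    return actions
```

No merge is attempted here; a copier-style diff patch would need considerably more machinery than this.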

Decision 2: What to call the command or subcommand used to update the files or project scaffold

A starting proposal would be to add a flag after the init or initialize command which is already being used to install the files the first time. I think I like --update best personally, but I'll list a few ideas / thought experiments below.

# Init project files (first time)
my-plugin init                # installs the extension's files

# Init again with the 'update' flag
my-plugin init --update       # updates the extension's files 

# Alternatively:
my-plugin init --replace      # re-initialize the extension's files
my-plugin init --force        # initialize the extension's files (force override)
my-plugin init --interactive  # ask for each file that would be overridden
my-plugin init --prompt       # ask for each file that would be overridden

https://github.com/meltano/edk/issues/66

tayloramurphy commented 1 year ago

@aaronsteers is there a generic my-plugin update command that we could leverage too? Seems like it'd be nice to just plug into that. Nothing about the init interface indicates just files to me. Something like my-plugin update --files --force seems reasonable, right?

aaronsteers commented 1 year ago

> @aaronsteers is there a generic my-plugin update command that we could leverage too?

There is not. And I would not want to have a generic update command because it can't perform the actions one might expect from it. Specifically, meltano invoke airflow update should do what a command like meltano plugin update airflow would do - but the extension is not able to update itself, and shouldn't do so because it would very likely break the venv that Meltano is maintaining.

But please don't read from the above that each individual dev could not provide something like cron-ext update which means something specific and explicit to that plugin. But in that sense, update probably applies to artifacts generated by the plugin, and not to the plugin itself - and not necessarily the file scaffold of the plugin. (In the cron example, update probably means to update the output of the plugin, not the plugin's scaffold or the plugin itself.)

What about adding a command group like project or files?:

# Init or attempt to update the project scaffold for this plugin:
my-ext project init
my-ext project update

# Init or attempt to update the project files for this plugin:
my-ext files init
my-ext files update

Then, for backwards compatibility and convenience, my-ext init or my-ext initialize would be mapped as shorthand for my-ext project init or my-ext files init.
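To make the alias idea concrete, here is a minimal sketch of a `files` command group with `init` and `initialize` kept as top-level shorthand. It uses the standard-library `argparse` rather than the typer setup an EDK extension would normally have, purely so the example is self-contained; the handler names and return values are hypothetical placeholders.

```python
import argparse


def files_init(args: argparse.Namespace) -> str:
    # Placeholder for the real "install the scaffold" logic.
    return "files init"


def files_update(args: argparse.Namespace) -> str:
    # Placeholder for the real "update the scaffold" logic.
    return "files update"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="my-ext")
    commands = parser.add_subparsers(dest="command", required=True)

    # The explicit command group: `my-ext files init` / `my-ext files update`.
    files = commands.add_parser("files", help="Project file operations.")
    files_commands = files.add_subparsers(dest="files_command", required=True)
    files_commands.add_parser("init").set_defaults(handler=files_init)
    files_commands.add_parser("update").set_defaults(handler=files_update)

    # Backwards-compatible shorthand: `my-ext init` and `my-ext initialize`
    # dispatch to the same handler as `my-ext files init`.
    shorthand = commands.add_parser("init", aliases=["initialize"])
    shorthand.set_defaults(handler=files_init)
    return parser


def run(argv: list) -> str:
    args = build_parser().parse_args(argv)
    return args.handler(args)
```

With this layout, `run(["init"])`, `run(["initialize"])` and `run(["files", "init"])` all reach the same handler, so existing users of the shorthand are unaffected.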

tayloramurphy commented 1 year ago

@aaronsteers I like the explicitness of the command group proposal.

ReubenFrankel commented 1 year ago

my-ext files init
my-ext files update

Is this not just the same as something like

my-ext files copy
my-ext files copy --force

? Not sure initing files makes sense to me as a concept...

Given that the copy command may call something like ExtensionBase.copy by default, it would be pretty easy to just call ExtensionBase.copy from ExtensionBase.initialize if you wanted to run the copy on extension initialisation - could be something for the EDK docs, an optional template TODO, or provided as out-of-the-box behaviour.

I'm working on an implementation of this at the moment, so curious what your thoughts are. :)

copier_template/{{library_name}}/files.py.jinja

from pathlib import Path

import typer

import {{ library_name }}.main as main

FILENAME_BLACKLIST = (
    "__init__.py",
    # for editable install only, otherwise covered by pyproject.toml exclude
    "__pycache__",
)

app = typer.Typer(
    name=Path(__file__).stem,
    help="{{ extension_name }} plugin file operations.",
    no_args_is_help=True,
)

@app.command()
def copy(
    force: bool = typer.Option(False),
):
    """Copy the plugin files."""
    main.ext.copy(
        "{{ files_package_name }}",
        force=force,
        filename_blacklist=FILENAME_BLACKLIST,
    )

aaronsteers commented 1 year ago

> my-ext files init
> my-ext files update
>
> Is this not just the same as something like
>
> my-ext files copy
> my-ext files copy --force
>
> ? Not sure initing files makes sense to me as a concept...

The init command makes the most sense in the context of a very common trend: tools having an init command that creates the initial project scaffold the tool needs. Examples are meltano init, dbt init, airflow db init, great_expectations init, etc.

We're retroactively suggesting a files command group so that update can have a valid context if the created scaffold is updatable - but you are correct in calling out that meltano dbt-duckdb files init is probably less intuitive than something like meltano dbt-duckdb init.

To me at least, copy seems to ask for a source and target location, but maybe there's another option here if we let go of files init and if we're okay using files as the command group.

While I would avoid install as the verb for a top-level command, because it implies something that isn't relevant or even possible from that context, we could consider my-ext files install - since that is (to me at least) perfectly clear from context about what is and is not occurring.

So, on thinking more about the paradigm here, I kind of like this approach:

my-ext files install   # first-time install of the project scaffold.
my-ext files update    # if the extension supports it, update the files scaffold.

And for extensions which already support my-ext init or my-ext initialize, these can become aliases of my-ext files install. Importantly, this is not a top-level install command, and doesn't imply that we are doing anything to the plugin's venv or to the tool's installed versions.

Wdyt?

ReubenFrankel commented 1 year ago

Fully on board with the files command group concept. I agree it makes it clear to a user what context they are operating in.

> To me at least, copy seems to ask for a source and target location,

Agreed on this in retrospect too - I prefer install as you suggested, but some more discussion on that below.

> And for extensions which already support my-ext init or my-ext initialize, these can become aliases of my-ext files install.

I think you are saying here that if an extension has files to install, my-ext initialize should follow the inner implementation of my-ext files install, along with any other initialisation logic. If that's correct, then I also agree again!

> my-ext files install   # first-time install of the project scaffold.
> my-ext files update    # if the extension supports it, update the files scaffold.

What are the implications of running my-ext files install after the files have already been 'installed' into the project? Should it ignore or overwrite files that already exist at the destination when attempting to 'install' again? Regardless of the behaviour, this feels like more of an application of update, not install (in my opinion) - in which case, why would I need to/should I be able to ever run install for that instance of the extension again?

Maybe I have misunderstood, but having a command that is designed to only ever be run once (manually) per instance of a tool does not feel like a great user experience to me. I don't think it's the same as something like meltano init, dbt init or git init, since you generally install these once and then run init where you want to create a project instance (i.e. more than once per install).

Following an internal example, I think this could work a lot like meltano lock does:

my-ext files install   # install files to the project, skipping any that already exist (could preemptively check for conflicts and fail if there are any before installing anything)
my-ext files install --update    # install files to the project, overwriting any that already exist
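A minimal sketch of the "check for conflicts and fail before installing anything" variant, with an `update` switch standing in for `--update`. This is an assumption about how such a command could behave, not an existing EDK API: `install_files` and `FileConflictError` are hypothetical names, and the bundled files are assumed to be a mapping of relative path to text content.

```python
from pathlib import Path
from typing import Dict, List


class FileConflictError(Exception):
    """Raised when files already exist and overwriting was not requested."""


def install_files(
    bundled: Dict[str, str],
    dest: Path,
    update: bool = False,
) -> List[str]:
    """Install bundled files under dest and return the paths written.

    Hypothetical helper. Without `update`, any pre-existing file aborts
    the whole install before a single file is written.
    """
    # Preflight: collect every conflict up front so nothing is half-installed.
    conflicts = sorted(p for p in bundled if (dest / p).exists())
    if conflicts and not update:
        raise FileConflictError(
            "already exist (re-run with update to overwrite): "
            + ", ".join(conflicts)
        )

    written: List[str] = []
    for rel_path, content in bundled.items():
        target = dest / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
        written.append(rel_path)
    return written
```

The alternative described above, silently skipping files that already exist, would replace the raise with a filter over `bundled`; failing fast has the advantage that the project is never left in a partially installed state.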