copier-org / copier

Library and command-line utility for rendering projects templates.
https://readthedocs.org/projects/copier/
MIT License
1.92k stars 173 forks source link

Files no longer update if user deletes them. #1735

Open msharp9 opened 3 weeks ago

msharp9 commented 3 weeks ago

Describe the problem

If an end user deletes a file, the file will not update.

Here's a common scenario. My template uses tool A. Tool A has a config file to customize feature A. Feature A is optional, so user deletes config file. Tool A upgrades to include feature B. Feature B is critical for future scope of template. I update config file to use feature B. But now because the user deleted the config file, they don't get the updates on update and their repo is now broken and they don't why. They now have to recopy which adds a lot of pain.

Template

n/a

To Reproduce

No response

Logs

No response

Expected behavior

Whether or not a file should be updated should be determined by the template, not the end user. The template should determine if a file should be excluded, skipped, or otherwise ignored. Otherwise the end user will likely run into situations where their repo breaks and they won't know why. Unless of course, they specifically and intentionally exclude it in the update cli command

Screenshots/screencasts/logs

No response

Operating system

macOS

Operating system distribution and version

n/a

Copier version

n/a

Python version

n/a

Installation method

pipx+pypi

Additional context

This PR https://github.com/copier-org/copier/pull/1719 seems to be merged simply because it was unexpected to one person, but it seems very expected behavior to me. However, with it's introduction it is now impossible for the owner of the repo to force a file to be updated (even when those updates include breaking changes) that the end user deleted which seems incorrect. This will cause lots of headache for the template owner as users break their repos and submit tickets when they don't understand why.

Control should flow downstream.

pawamoy commented 3 weeks ago

That's an expectation mismatch between the template writer and the template user. As a user, I want all my committed modifications to be re-applied when updating my project. That applies to deleted files too. As a template writer, you want to force a particular thing onto the user (for their own good), but that goes against what the user wants.

In this particular case, I think you can write a migration? The migration would simply write this file if it doesn't exist? If it still exists then the update will do its job, and no further action is required.

But then, yeah, how to anticipate which files the user could have deleted... I don't think template writers can anticipate that.

If feature A is optional, maybe you should add a question to the template whether the user wants to use it, to avoid creating it if they don't want it, so that they don't have to delete the file themselves afterwise, committing the deletion to their history.

msharp9 commented 3 weeks ago

I understand your point on expectations, but as a template writer I'm going to use a tool that gives me the most control. As a user, I'm going to choose a template that doesn't break. These will come before any other expectations.

Your over complicating things. Just revert to previous behavior, and introduce the exclude_on_update feature. The only reason we added this was because a template owner doesn't have a way to control what is optional right now. A question to the template doesn't work in situations where you want it generated on copy, but deleted after generation. AKA, feature A isn't optional on start up, but is optional once a project is up and running.

pawamoy commented 3 weeks ago

Then I'm going to quote you:

Do we now need a feature to ensure we always regenerate all the files when we make an update? This all seems silly.

I don't think this is silly. If that's what you need, I could see a "regenerate deleted files" opt-in mode, configurable on the template side, per file. That initially leaves control to the user, and lets the template writer regain control once a file that was initially optional becomes required.

The previous behavior was correct. The files should exist or not based purely on the template, not the actions of the end user.

I strongly disagree, but that's simply because we have different use-cases.

pawamoy commented 3 weeks ago

Or maybe this is solvable with resolution conflicts? I'm not too active on this project these days so I'll let other more active contributors/maintainers comment, they will have more reasonable insights :sweat_smile:

lkubb commented 3 weeks ago

because it was unexpected to one person

The issue was triaged and found to be valid.

Here's a common scenario. [...] They now have to recopy which adds a lot of pain.

Is this specific scenario really more common than the file just receiving minor updates from time to time? Because the irregular regeneration of unwanted files during updates adds friction as well. On the other hand, when you notice something is broken, it is obvious and you can just revert or recopy.

even when those updates include breaking changes

I can see the need for some way to declare that a file must be present after update because its significance has changed.

Otherwise the end user will likely run into situations where their repo breaks and they won't know why.

They can revert the update and investigate why it breaks. Maybe there should be an equivalent command to cruft diff, which yields a diff between what the project would look like without customizations and its current state to aid in debugging.

Whether or not a file should be updated should be determined by the template, not the end user. The template should determine if a file should be excluded, skipped, or otherwise ignored.

We are likely coming at this from different angles, but I don't see the need for patronizing users at all. They are responsible for the final project.

Control should flow downstream.

If you want control to strictly flow downstream, don't support updates and instruct your users to use (re)copy for updates. The intention of update is to reapply a user's changes on top of an updated template.

msharp9 commented 3 weeks ago

You guys bring up great points, and alternative solutions.

Is this specific scenario really more common than the file just receiving minor updates from time to time?

For the templates I manage yes. Generally updates to the template reflect updated best practices, new workflows, or updates to tooling. Minor updates are pretty non-existent. Why are the template owners creating a bunch of files the users are just going to delete?

Also, I'm not patronizing the users. As a user I want to see all the updates when I run update so I can accept or reject. I want to be in full control. Even if a previous decision led me to delete a file. I would never expect to not see an update. I would never want any update to be automatically rejected. I would be confused. I wouldn't expect any extra steps other than copier update then review the updates. I wouldn't want to have to run copier update, then copier diff, then go to the template repo and copy down the file, or have to copy the repo and have to start over from scratch.

lkubb commented 3 weeks ago

Is this specific scenario really more common than the file just receiving minor updates from time to time?

For the templates I manage yes.

This might partly explain our different points of view.

Let me provide a concrete example:

A template I'm maintaining, which generates a project structure for a plugin, uses pylint in a pre-commit hook. There's a .pylintrc as well.

A child project decides it does not want to use pylint for some reason (maybe it wants to use flake8), thus removes the pre-commit hook and the file. In a future template iteration, a Pylint config value is changed for some reason. The child project updates.

What happened before the patch you're objecting to:

Why are the template owners creating a bunch of files the users are just going to delete?

I don't see the relevance for the expected update behavior. If the user decides to delete a file, it's their choice. They can also break generated projects by introducing bugs into templated files, which are also kept during updates.

As a user I want to see all the updates when I run update so I can accept or reject.

This might be another reason for our difference of opinion. In general, please see the above example. I wouldn't expect to keep denying updates for a tool's configuration file I'm not using. This behavior introduces fatigue.

I'm also coming from an automation pov. I can't automate copier update reliably this way. When an update breaks CI, I can take a closer look though.

I wouldn't want to have to run copier update, then copier diff, then go to the template repo and copy down the file, or have to copy the repo and have to start over from scratch.

We're talking about a very specific scenario here.

  1. A user deletes a file in a generated project, probably having a good reason to do so.
  2. A template's new release now requires this specific file that was optional before.

This should happen very seldomly, even if (2) is a common occurrence for your template. Only in this case would one have to debug the update. You don't have to copy down the file from the template repo or restart from scratch, just copier recopy, git add the missing file and reset all unstaged changes (or cherry-pick the output of copier diff - if it were an available command - and git apply -R). Again, this is likely an edge case.

msharp9 commented 3 weeks ago

I'm assuming if you are using one linter and they are using another, it's going to cause a lot of conflicts throughout all of the template files and cause a lot more update issues than just this one file. Unless none of those files generated are using a linter, in which case why not just give them an option to pick a linter? Add a copier answer for linter: pylint, flake8, none. Then you can handle the creation of this file and updates there.

Is there a reason the users need to delete the .pylintrc at all? It isn't going to harm anything since it's specific to a tool they aren't using.

Is there a reason they don't just fork your template and start their own using tooling they prefer? Picking your tooling is half the reason to choose a template, so if they don't like the tools used, just create another template.

If they know they are not using certain aspects of your tool, why don't they just exclude the file on update. copier update --exclude .pylintrc. They could always create an alias to make this easy. I mean, generally in a copier repo, there's a contributing section in the README or docs that explains how to run copier update and any extra parameters expected.

Sorry, ignore all that, I know you guys said you triaged the issue. I don't need to know the details, even if I feel like there were several reasonable issues.

What I'm worried about is not just cases where updating a config fails obviously, but when it also fails silently. For examples, there are configs like datadog.service.yaml where scanners pick them up, deleting the file would cripple the repos monitoring set up. This is already pretty bad, but imagine CI/CD related files, .github action files, security files, etc. Checks, security scans, and other tooling may be broken but because they are no longer running you wouldn't know they are failing. I could see how you wouldn't care about a linter's config file being removed, but these are different.

The point is, I have need of deleted files to be updated, and now that feature has been removed.

lkubb commented 3 weeks ago

What I'm worried about is not just cases where updating a config fails obviously, but when it also fails silently.

These examples are not specific to updates though. Even with the previous behavior, a user could still delete the file you're deeming important and not be reminded until the file changes in the template.

The point is, I have need of deleted files to be updated, and now that feature has been removed.

I would suggest creating a feature request in this case. If the above example is a concern for you, maybe a recreate_always setting (list of path patterns) could address your worries more generally and would also address the case where a previously optional file becomes required.

yajo commented 2 weeks ago

Let me just put another use case on the table. This template for an Odoo module generates a lot of subfolders (controllers, demo, examples, templates, wizards...). Most times, the module is simple enough to not need any of those. Thus, if I wanted to support updates (which I don't), then if the user deleted those files and they were being recreated some times, that'd be a big UX problem.

IIUC, the main problem is that the user deleted a file they shouldn't. Have you tried CODEOWNERS? It is IMHO a better way to control that.

maybe a recreate_always setting (list of path patterns) could address your worries more generally and would also address the case where a previously optional file becomes required.

That sounds like a valid idea. Another possibility would be a flag such as copier update --recopy-deleted. It depends if the control is on the template or on the user.

They now have to recopy which adds a lot of pain.

FWIW you can recopy only the files you deleted:

copier recopy --exclude '**' --exclude '!the-file-deleted' .
msharp9 commented 2 weeks ago

I've thoroughly been convinced that not wanting files to be recreated on update is desirable, even as the default behavior. Seems like a more common problem. There are still times we want files to be recreated on update. There have been several good suggestions.

The CODEOWNERS solution doesn't work b/c we are assuming a file that made sense to be deleted early in the project, but is required later on in the project.

FWIW you can recopy only the files you deleted:

Handy. Not exactly a solution though since it is controlled from the user's side and requires the user to realize the reason their repo broke is because of a deleted file.