conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License
8.22k stars 979 forks source link

[feature] Patch scripts #14481

Closed iskunk closed 1 year ago

iskunk commented 1 year ago

What is your suggestion?

Sometimes, the easiest way to patch source code prior to compilation is not with a patch file, nor a patch string, but with a script that has the effect of modifying the source in the desired way.

For example, imagine a C source tree that uses class as a variable name, and thus cannot be compiled by a C++ compiler. You need to change every instance of class into e.g. klass. There are many instances of this variable. A patch would be large and difficult to review. But if you could run a shell command...

find . -name '*.[ch]' -exec perl -pi -e 's/\bclass\b/klass/g' {} +

...you could potentially solve the problem in one line. (Perhaps some additional logic is needed to handle exceptions to this replacement, but overall it's still going to weigh in significantly smaller than the patch.)

Being able to specify such a patch script in the conandata.yml file would be quite useful. I already have patches with header text like "The modifications in this patch were generated by the following shell command: ...".

As I see it, a patch script can take one of three forms:

  1. A Bourne shell one-liner, like the example above;
  2. A multi-line script, with a shebang line that can invoke any interpreter;
  3. A path to an external script in the package's exports, possibly with arguments, that can invoke any interpreter. (More or less a special case of # 1.)

Security may be a concern, but I don't see this as much different than the Turing-complete logic that is invoked in the course of a build anyway.

Have you read the CONTRIBUTING guide?

memsharded commented 1 year ago

Hi @iskunk

Thanks for your suggestion. I guess that you mean adding these for the apply_conandata_patches() tool?

That tool is mostly for ConanCenter patches, so this should be also checked with the team.

iskunk commented 1 year ago

I guess that you mean adding these for the apply_conandata_patches() tool?

Yes, I should have clarified; I am following the "private Conan Center" workflow.

Hope the CC folks go for this. I've long used Perl one-liners as a pre-build "patch" mechanism, and it works very well. Patch files, in my view, are better suited for recording and applying hand-crafted edits.

thorsten-klein commented 1 year ago

Alternatively you could solve this by a self.run() during source()

e.g.

def source(self):
    zip_name = f"poco-{self.version}-release.zip"
    download(self, f"https://github.com/pocoproject/poco/archive/poco-{self.version}-release.zip", zip_name)
    unzip(self, zip_name)
    self.run("find . -name '*.[ch]' -exec perl -pi -e 's/\bclass\b/klass/g' {} +")

or

def source(self):
    ...
    self.run("./my-patch-script.sh")
iskunk commented 1 year ago

The point of this is to have patch scripts organized and applied as peers to regular patch files in the existing framework. It would be poor form to have patch files in one place, and patch scripts in another, when they ultimately serve the same purpose.

memsharded commented 1 year ago

Hi @iskunk

I have been discussing this proposal with the team, and the main problem is that this is something that we would need to actively forbid in ConanCenter, as patches applied this way could be quite problematic and challenging to review, the team is quite concerned about this, so not allowing this at the moment in ConanCenter.

While we understand the use case, it seems this would still be a niche request for adding to apply_conadata_patches(), and as it is already necessary to add the thing to conandata.yml, adding a for in the recipe that iterate the patches and conditionally run self.run() as @thorsten-klein is suggesting would be relatively simple. While adding this as a built-in feature could be more work than it seems on the surface, for example, if using bash-isms but the recipe wants to be used in Windows and users are using msys2 for that, the problem is that tool_requires like msys2 are not applied in the source() method, so it will not be injected and it will not be possible to make it work.

So many thanks for your suggestion, but at the moment it will not be considered for addition to the apply_conandata_patches(). I suggest closing the ticket as not planned, but if there is more evidence in the future of more and more users with similar requests, we can re-open and re-consider it. Thanks a lot for the feedback!

iskunk commented 1 year ago

Thank you for your consideration @memsharded, I will work on a private implementation of this idea.

One thing to note, however:

patches applied this way could be quite problematic and challenging to review

For certain types of changes, it is entirely possible to have the edits resulting from a find(1)+Perl one-liner yield a multi-megabyte patch. And that patch might need to be regenerated with each new version of the upstream source, as things shift around.

Reviewing such a "patch script" is going to be different from reviewing a normal patch, but there is clearly a point where it wins out on time and tedium. Keep that in mind, and I'll be happy to revisit this topic if it comes up again.

memsharded commented 1 year ago

For certain types of changes, it is entirely possible to have the edits resulting from a find(1)+Perl one-liner yield a multi-megabyte patch. And that patch might need to be regenerated with each new version of the upstream source, as things shift around.

That is exactly the point. If there is a huge patch file, the team want it very evident, and it will be probably rejected. Source code changes are a red flag to be reviewed carefully in ConanCenter, and most other build system patches should be as minimal as possible. It is possible to do large changes in codebases with very small scripts that can be skipped/missed because the reviewers are usually overwhelmed, and the idea is that diffs are more visually impacting than small scripts

iskunk commented 1 year ago

That is a site-policy matter, however. Whether or not ConanCenter would use this feature is distinct from whether it is generally useful enough (for other sites with different policies) to be in Conan proper. In my org's use case, new/updated patch scripts will not be a frequent occurrence, and of course they will be carefully reviewed.

(BTW, I've found that bolting on this functionality externally is awkward due to apply_conandata_patches() requiring every entry to be a patch_file or patch_string, with no allowance for something it doesn't recognize. If I could get in something like a patch_other keyword that allows the entry to be silently ignored, that would be helpful.)

Wanting a large visual impact for a large change is reasonable, but that doesn't necessarily have to come from a physically large patch file. For example, Conan could list all files with updated timestamps after the script is run, or even do a recursive diff against a copy of the original tree. (It'll be slow, for sure---but if a user is really worried about the script not behaving as expected, it could still be a time-saver over maintaining patches.) There are other ways to scratch that itch without having to record the changes explicitly.

valgur commented 1 year ago

You can already accomplish the same task perfectly well in a conanfile.py, no? The possible new mechanism would not have much benefit besides a relatively minor convenience for users preferring shell scripts over Python. But the latter is generally easier to read and definitely more portable anyway.

    def _patch_sources(self):
        apply_conandata_patches(self)
        for path in self.source_path.rglob("*.[ch]"):
            content = path.read_text()
            content = re.sub(r"\bclass\b", "klass", content)
            path.write_text(content)

    # or

    def _patch_sources(self):
        apply_conandata_patches(self)
        count = 0
        for path in self.source_path.rglob("*.[ch]"):
            content = path.read_text()
            content, n = re.subn(r"\bclass\b", "klass", content)
            if n > 0:
                path.write_text(content)
                count += 1
        self.output.info(f"Patched 'class' to 'klass' in {count} source files")

    def build(self):
        self._patch_sources()
        ...
memsharded commented 1 year ago

(BTW, I've found that bolting on this functionality externally is awkward due to apply_conandata_patches() requiring every entry to be a patch_file or patch_string, with no allowance for something it doesn't recognize. If I could get in something like a patch_other keyword that allows the entry to be silently ignored, that would be helpful.)

Maybe this is something that could make sense, I'll re-open to propose to the team and see what they think.

iskunk commented 1 year ago

Hi @valgur,

You can already accomplish the same task perfectly well in a conanfile.py, no?

The point is not "modify sources programmatically in Conan in some/any way"; the point is "modify sources programmatically in Conan in the same workflow as patch files." This will eventually form part of a process in a larger organization, and in that context, handling patch scripts as peers to patch files/strings (i.e. that they are both recorded in the same place, and can be inventoried/maintained/audited in the same place) is critically important.

(Also, note that your sample code doesn't account for different package versions needing different edits. Once you add support for that, you can see that you're starting to recreate the patches section of conandata.yml, but in a non-standard, procedural manner.)

Maybe this is something that could make sense, I'll re-open to propose to the team and see what they think.

I am much obliged, @memsharded. A few notes for consideration:

memsharded commented 1 year ago

I am proposing https://github.com/conan-io/conan/pull/14576 to define patch_user in conandata.yml and allow apply_conandata_patches() to not crash, and letting users iterate the patches and apply themselves in the recipe those special patches.

memsharded commented 1 year ago

https://github.com/conan-io/conan/pull/14576 closed this issue, to be released next 2.0.10

iskunk commented 1 year ago

Thanks @memsharded. Now I can make my function a supplement to apply_conandata_patches(), rather than a replacement. I've moved my keywords around a bit to have patch_user entries, with additional keys indicating different variations of same.

For reference, here are my supplements to {apply,export}_conandata_patches(), to be called after the corresponding function. (Most of the error reporting is removed, as it is redundant at the point when these run. The patch_user() function is left as an exercise for the reader, but I can provide my implementation of the patch-scripts concept if desired.)

from conans.util.files import mkdir

# Supplement to apply_conandata_patches() that supports "patch_user" entries
#
def apply_conandata_patches_supp(conanfile):
    patches = conanfile.conan_data.get('patches')
    if patches is None:
        return
    if isinstance(patches, dict):
        entries = patches.get(str(conanfile.version), [])
    elif isinstance(patches, list):
        entries = patches
    for it in entries:
        if "patch_user" in it:
            patch_user(conanfile, **it)

# Supplement to export_conandata_patches() that supports "patch_user" entries
#
def export_conandata_patches_supp(conanfile):
    patches = conanfile.conan_data.get('patches')
    if patches is None:
        return
    if isinstance(patches, dict):
        entries = patches.get(conanfile.version, [])
    elif isinstance(patches, list):
        entries = patches
    for it in entries:
        if "patch_user" in it:
            script = it.get("script")
            if script and not script.startswith("#!"):
                src = os.path.join(conanfile.recipe_folder, script)
                dst = os.path.join(conanfile.export_sources_folder, script)
                mkdir(os.path.dirname(dst))
                shutil.copy2(src, dst)