wp-cli / dist-archive-command

Create a distribution .zip or .tar.gz based on a plugin or theme's .distignore file
https://developer.wordpress.org/cli/commands/dist-archive/
MIT License
48 stars 24 forks source link

Excluding folders from zip file is broken in main since #61 #67

Closed kraftner closed 2 years ago

kraftner commented 2 years ago

61 seems to have broken excluding folders. https://github.com/wp-cli/dist-archive-command/commit/2d265e4b41f11cd82f62ee7c4e8455e6abead3e6 still works but https://github.com/wp-cli/dist-archive-command/commit/f5b532ced1f86571f7bf73a9ce3d99b601fde949 doesn't. This also isn't covered by the tests which might explain why it went unnoticed.

When we have a .distignore file that contains just .git the .git folder is not excluded from the archive.

Before #61 this would have resulted in the following command being actually run:

 zip -r '/var/www/html/wp-content/themes/myplugin.zip' myplugin  --exclude \*/.git/\*

Now what is run for .git is

 zip -r '/var/www/html/wp-content/themes/myplugin.0.0.1.zip' myplugin  --exclude */.git --exclude */.git/* --exclude */.git/

Also when trying the newly anchored version /.git we get

zip -r '/var/www/html/wp-content/themes/myplugin.0.0.1.zip' myplugin  --exclude myplugin/.git --exclude myplugin/.git/* --exclude myplugin/.git/

The problem seems to be this line: https://github.com/wp-cli/dist-archive-command/blob/55f14e816ba500ecea1b275d5b09bbb202da12bf/src/Dist_Archive_Command.php#L246-L247

When I turn it back into what it was before

https://github.com/wp-cli/dist-archive-command/blob/2d265e4b41f11cd82f62ee7c4e8455e6abead3e6/src/Dist_Archive_Command.php#L208

we get this again which works:

zip -r '/var/www/html/wp-content/themes/myplugin.0.0.1.zip' myplugin  --exclude \*/.git --exclude \*/.git/\* --exclude \*/.git/

So undoing the shell escaping seems to break things. Also I am not sure why this is needed anyway. Also feels risky from a security perspective. Maybe @BrianHenryIE or @schlessera can add some context why this was done in the first place in #61.


While we're at it: There are actually 3 excludes added now. While a single seems to also do the job:

zip -r '/var/www/html/wp-content/themes/romanesco.0.4.1.zip' romanesco --exclude \*/.git/\*

Final note: Since I already spent way more time on it today than I currently have I only tested this on Linux. I also can't find time for a PR right now, but I thought I'd at least leave this here as a brain dump in case someone else hits this as well and/or finds time for a PR before me.

BrianHenryIE commented 2 years ago

This test DOES NOT fail for me on MacOS or Ubuntu:

Scenario Outline: Ignores .git folder
    Given an empty directory
    And a foo/.distignore file:
        """
        .git
        """
    And a foo/.git/version.control file:
        """
        history
        """
    And a foo/plugin.php file:
        """
        <?php
        echo 'Hello world';
        """

    Then the foo/.git directory should exist

    When I run `wp dist-archive foo --format=<format>`
    Then STDOUT should be:
        """
        Success: Created foo.<extension>
        """
    And the foo.<extension> file should exist

    When I run `rm -rf foo`
    Then the foo directory should not exist

    When I try `<extract> foo.<extension>`
    Then the foo directory should exist
    And the foo/plugin.php file should exist
    And the foo/.git directory should not exist

    Examples:
    | format | extension | extract   |
    | zip    | zip       | unzip     |
    | targz  | tar.gz    | tar -zxvf |

Can you modify that to get a failing test please?

Ironically, excluding * from the escaped characters list was introduced for the "Correctly ignores hidden files when specified in distignore" test, for tar to work properly.

At the very least, this can be fixed by conditionally setting the escaped characters for zip vs tar. I haven't properly looked into the duplicate exclusions yet.

kraftner commented 2 years ago

@BrianHenryIE Thanks for reacting so quickly! I am working inside a WordPress Docker Container so maybe something special about the shell/zip in there. Sorry for keeping the issue a bit too vague, but like I've already mentioned I don't have time right now for a more thorough investigation but will try to produce a proper testcase, but probably will take me 1 or 2 weeks.

kraftner commented 2 years ago

Okay, so I finally managed to make this fail @BrianHenryIE . The problem is that having .git in the .distignore only excludes files that are direct children of .git, not subfolders and files in those.

So if you modify the test to create a file inside a subfolder of .git to this And a foo/.git/subfolder/version.control file: it fails.

So, this is a complete test that fails for me, but only for zip:

Feature: Generate a distribution archive of a project

  Scenario Outline: Ignores .git folder
    Given an empty directory
    And a foo/.distignore file:
        """
        .git
        """
    And a foo/.git/version.control file:
        """
        history
        """
    And a foo/.git/subfolder/version.control file:
        """
        history
        """
    And a foo/plugin.php file:
        """
        <?php
        echo 'Hello world';
        """

    Then the foo/.git directory should exist

    Then the foo/.git/subfolder/version.control file should exist

    When I run `wp dist-archive foo --format=<format>`
    Then STDOUT should be:
        """
        Success: Created foo.<extension>
        """
    And the foo.<extension> file should exist

    When I run `rm -rf foo`
    Then the foo directory should not exist

    When I try `<extract> foo.<extension>`
    Then the foo directory should exist
    And the foo/plugin.php file should exist
    And the foo/.git directory should not exist
    And the foo/.git/subfolder directory should not exist
    And the foo/.git/version.control file should not exist
    And the foo/.git/subfolder/version.control file should not exist

    Examples:
      | format | extension | extract   |
      | zip    | zip       | unzip     |
      | targz  | tar.gz    | tar -zxvf |
> run-behat-tests 'features/bug.feature'
...............F---...................

--- Failed steps:

001 Example: | zip    | zip       | unzip     |   # features/bug.feature:47
      And the foo/.git directory should not exist # features/bug.feature:40
        /tmp/wp-cli-test-run--633c3cb37a0374.89179953/foo/.git exists. (Exception)

2 scenarios (1 passed, 1 failed)
38 steps (34 passed, 1 failed, 3 skipped)
0m0.51s (11.14Mb)

Escaping the * fixes it for zip, but breaks it for tar, as expected.


Finally my gut still feels bad running unescaped glob patterns on the shell. I wasn't able to find time to read deeper into how the patterns work for zip and tar, but either quoting or escaping should probably be used for the exclude patterns.

danielbachhuber commented 2 years ago

@kraftner Thanks for the additional discovery! Want to submit a PR with your work?

kraftner commented 2 years ago

Well. We now have a reproducible error, but it is still unclear how to properly solve this in a way that consistently works across tar/zip and Win/Mac/Linux.

Since I am not sure if/when I might find time to work on this I thought I'd report it just in case anybody wants to pick this up before me. In any case I feel like there is nothing yet that is ready for a PR.