micromatch / picomatch

Blazing fast and accurate glob matcher written JavaScript, with no dependencies and full support for standard and extended Bash glob features, including braces, extglobs, POSIX brackets, and regular expressions. Used by GraphQL, Jest, Astro, Snowpack, Storybook, bulma, Serverless, fdir, Netlify, AWS Amplify, Revogrid, rollup, routify, open-wc, imba, ava, docusaurus, fast-glob, globby, chokidar, anymatch, cloudflare/miniflare, pts, and more than 5 million projects! Please follow picomatch's author: https://github.com/jonschlinkert
https://github.com/micromatch
MIT License
972 stars 56 forks source link

different results picomatch vs minimatch #136

Open ghiscoding opened 1 month ago

ghiscoding commented 1 month ago

hello, I'm simply trying to migrate from minimatch to picomatch in Lerna-Lite, I didn't write the code myself using these globbing (it came directly from Lerna's author), I only maintain the project as much as I could while trying to modernize it. However trying to switch to picomatch causes some unit tests to fail, with some of the patterns shown below. Relevant code usage can be found here, which again came from the original Lerna's repo

const options = {
   matchBase: true,
   // dotfiles inside ignored directories should also match
   dot: true,
};
const pattern = '!**/examples/**';

const p1 = 'packages/pkg-2/examples/.eslintrc.yaml';
picomatch(pattern, options)(p1); => true
minimatch(p1, pattern, options); => false

const p2 = 'packages/pkg-2/examples/do-a-thing/index.js';
picomatch(pattern, options)(p2); => true
minimatch(p2, pattern, options); => false

const p3 = 'packages/pkg-2/examples/and-another-thing/package.json';
picomatch(pattern, options)(p3); => true
minimatch(p3, pattern, options); => false

const pattern2 = '!*.md';
const p4 = 'packages/pkg-2/examples/do-a-thing/index.js';
picomatch(pattern2, options)(p4); => true
minimatch(p4, pattern2, options); => true

I'm not sure if it's related to #89 but any help would be appreciated to understand why the result are different since I don't use globs that often but I would really like to migrate to a much smaller lib

jonschlinkert commented 1 month ago

Hi @ghiscoding, thanks for creating the issue. It looks like picomatch is correct in those examples. If you set matchBase, you're saying you want to match the file's base name, including file extension, which doesn't include any other path segment.

Let me know your thoughts.

Edit: Also, regarding dot=true, that just means the matcher will recognize dot files. However, I also just noticed that you're using a negation pattern (!), which might mean that those results are wrong. I honestly don't remember what the bash convention is for that, or if there is a bash convention. Either way, I'm interested in your thoughts.

ghiscoding commented 1 month ago

Thanks for the reply, I'm not sure what to reply on this because like I said I'm not super familiar person with globs (when I need it, it's typically super small globs) and that code was implemented by the original author more than 5 years ago and they actually use the minimatch.filter function, but nonetheless, the equivalent code performed is being shown above. I understand what the dot option does, but I don't really know why they used a negative pattern.

This was the original code (more than 5 years ago, which you can find here)

function makeDiffPredicate(committish, execOpts, ignorePatterns = []) {
  const ignoreFilters = new Set(
    ignorePatterns.map(p =>
      minimatch.filter(`!${p}`, {
        matchBase: true,
        // dotfiles inside ignored directories should also match
        dot: true,
      })
    )
  );

  //...

  return changedFiles.length > 0;
}

and if I remember correctly (because I'm on a different computer), here's the 2 tests that fails. With picomatch, the tests below returns true instead of false because the pattern returns some matches with picomatch but nothing with minimatch.

test("ignore changes (globstars)", () => {
  setup([
    "packages/pkg-2/examples/.eslintrc.yaml",
    "packages/pkg-2/examples/do-a-thing/index.js",
    "packages/pkg-2/examples/and-another-thing/package.json",
  ]);

  const hasDiff = makeDiffPredicate("v1.0.0", { cwd: "/test" }, ["**/examples/**", "*.md"]);
  const result = hasDiff({
    location: "/test/packages/pkg-2",
  });

  expect(result).toBe(false);
});

test("ignore changes (match base)", () => {
  setup("packages/pkg-3/README.md");

  const hasDiff = makeDiffPredicate("v1.0.0", { cwd: "/test" }, ["*.md"]);
  const result = hasDiff({
    location: "/test/packages/pkg-3",
  });

  expect(result).toBe(false);
});

Why did the Lerna's author code it that way with the negative, I don't know and I can't find much PR or documentation of why this code is the way it is.

I tend to agree with your on the results but the problem is really that there's a lot of users now moving to Lerna-Lite (a lighter version of Lerna without Nx with 470 stars now) and I'm a little scared to migrate to picomatch if the results are different and I don't understand why they are different and why the code was implemented the way it is. The globbing portion in Lerna/Lerna-Lite is mostly related to ignoring files from being detected as changes and/or versioning/publishing.

Worst case, I just keep on using minimatch, but I'd really like to move to picomatch if possible because I'm just about to merge my other migration from globby to tinyglobby and since tinyglobby uses only 2 deps (picomatch being 1 of them), then I was super interested to move to picomatch....

So if you think that picomatch is correct and minimatch is wrong, then I think that I'll have to stick with minimatch (simply to be on the safer side) and you can close the issue. Sticking with minimatch would be a little sad but it seems safer since I don't have a full understanding of this implementation and I certainly wouldn't like to see users complaining about the new version returning more changes than before and I'm scared to migrate if the tests have different results. Perhaps I could also delay the migration until my next major planned in April/May of next year since I follow NodeJS roadmap.

I actually wonder if the suffix /** is interpreted differently in minimatch vs picomatch?

ghiscoding commented 1 month ago

I also created a Stackblitz to make it easier to see it live https://stackblitz.com/edit/stackblitz-starters-yjniqx?file=index.mjs&view=editor

image

Also now that I have the live code, it's easy to test different code and if I remove the !, I still see picomatch returning the inversed result of minimatch, so the difference is not necessarily related to the negate prefix.

I still wonder about the /** suffix though, is it really the same in both libs? Meaning does ** return folders and/or folders+files in both libs or not? If I remove one * and use this pattern instead !**/examples/*, then there's only 1 that is off (versus 3 before with double ** suffix)

image

jonschlinkert commented 1 month ago

I’ll help you figure it out either way. I’m not at my computer, but I’ll try to look at this tomorrow. Sent from my iPhoneOn Oct 1, 2024, at 5:46 PM, Ghislain B. @.***> wrote: I also created a Stackblitz to make it easier to see it live https://stackblitz.com/edit/stackblitz-starters-tzhyqj?file=index.mjs

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you commented.Message ID: @.***>

ghiscoding commented 1 month ago

@jonschlinkert just a friendly ping, to see if we can advance this so that I can eventually migrate to your lib. Thanks