SuperchupuDev / tinyglobby

A fast and minimal alternative to globby and fast-glob
https://npmjs.com/package/tinyglobby
MIT License

Potentially the biggest performance improvement that can be done #46

Open SuperchupuDev opened 1 month ago

SuperchupuDev commented 1 month ago

I'm not sure how to approach this yet, but implementing it would mean that the unusual patterns that currently bypass every optimization get optimized, along with everything else.

By default, fdir crawls all subdirectories and files of a root, which can mean a lot of unnecessary work and hurts performance. tinyglobby tries to reduce this by inferring a common root from the patterns.
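To make the common-root idea concrete, here is a minimal sketch. `inferCommonRoot` is a hypothetical helper written for this illustration, not tinyglobby's actual code, and it assumes forward-slash patterns whose last segment names a file.

```typescript
// Hypothetical helper (not tinyglobby's real implementation): returns the
// longest leading run of directory segments that is identical in every
// pattern and contains no glob characters.
function inferCommonRoot(patterns: string[]): string {
  const isStatic = (segment: string): boolean => !/[*?[\]{}()!]/.test(segment);
  // Keep directory segments only; the last segment is the file (or file pattern).
  const dirs = patterns.map(pattern => pattern.split('/').slice(0, -1));
  const first = dirs[0] ?? [];
  const root: string[] = [];
  for (let i = 0; i < first.length; i++) {
    const segment = first[i];
    // Stop at the first dynamic segment or the first disagreement between patterns.
    if (segment === undefined || !isStatic(segment)) break;
    if (dirs.some(dir => dir[i] !== segment)) break;
    root.push(segment);
  }
  return root.join('/');
}

console.log(inferCommonRoot(['src/files/index.ts', 'src/other/*.ts']));
console.log(inferCommonRoot(['src/files/index.ts', 'scripts/*.ts']));
```

The second call returns an empty string because the two patterns share no leading directory, so a common root alone cannot narrow the crawl in that case.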

fdir exposes an exclude function that can be used to skip directories while crawling. It's currently applied to the ignore patterns so that ignored directories aren't crawled.

What if we took the matching patterns (the ones that aren't meant to be ignored), transformed them somehow, and used the result in the exclude matcher?

For example, let's say we have the following usage:

import { glob } from 'tinyglobby';

await glob(['src/files/index.ts', 'scripts/*.ts']);

with the following file structure:

- node_modules
  ^big
- plugins
  - myPlugins
    | plugin.ts
- scripts
  - utils
    | index.ts
  | deploy.ts
  | run.ts
- src
  - files
    | index.ts
  - other
    | index.ts

Basically, we need a picomatch matcher that returns true for every directory that we don't want to crawl, in this case node_modules, plugins, scripts/utils, and src/other. This could be implemented with the following picomatch usage:

import picomatch from 'picomatch';

const exclude = picomatch('**/*', {
  ignore: ['src/files', 'scripts/*/**']
});

Great! It now only crawls the directories it needs to (hopefully, I haven't checked). The question now is how to implement something that converts ['src/files/index.ts', 'scripts/*.ts'] into ['src/files', 'scripts/*/**'], which is the whole point of this issue. If we figure that out, tinyglobby should be nearly as fast as it can be.
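One possible direction, sketched under heavy assumptions: instead of generating picomatch ignore patterns, compare each directory against the patterns segment by segment and skip it as soon as no pattern could still match inside it. `segmentToRegExp` and `buildExclude` are hypothetical names made up for this sketch; only `*` and `**` are handled (no braces, `?`, or character classes), and directory paths are assumed to be relative to the root with forward slashes, which may not be the shape fdir passes to its exclude callback.

```typescript
// Hypothetical helper: converts a single glob segment (no slashes) into a RegExp.
// Only `*` is treated as a wildcard in this sketch; everything else is literal.
function segmentToRegExp(segment: string): RegExp {
  const escaped = segment
    .replace(/[.+?^${}()|[\]\\]/g, '\\$&')
    .replace(/\*/g, '[^/]*');
  return new RegExp(`^${escaped}$`);
}

// Hypothetical helper: returns a predicate that is true for directories that
// cannot contain a match for any pattern, so the crawler may skip them.
function buildExclude(patterns: string[]): (dirPath: string) => boolean {
  // Keep only the directory part of each pattern (drop the final file segment).
  const dirPatterns = patterns.map(pattern => pattern.split('/').slice(0, -1));

  const canContainMatch = (dirSegments: string[], patternSegments: string[]): boolean => {
    for (let i = 0; i < dirSegments.length; i++) {
      const dirSegment = dirSegments[i];
      const patternSegment = patternSegments[i];
      // The directory is deeper than this pattern can reach.
      if (dirSegment === undefined || patternSegment === undefined) return false;
      // A globstar may span any number of directories, so stop pruning here.
      if (patternSegment === '**') return true;
      if (!segmentToRegExp(patternSegment).test(dirSegment)) return false;
    }
    return true;
  };

  return dirPath => {
    const dirSegments = dirPath.split('/').filter(Boolean);
    return !dirPatterns.some(dirPattern => canContainMatch(dirSegments, dirPattern));
  };
}

const exclude = buildExclude(['src/files/index.ts', 'scripts/*.ts']);
for (const dir of ['node_modules', 'plugins', 'scripts', 'scripts/utils', 'src', 'src/files', 'src/other']) {
  console.log(dir, exclude(dir));
}
```

For the file structure above, this predicate returns true for node_modules, plugins, scripts/utils, and src/other, and false for src, src/files, and scripts, so the parent directories of wanted files remain crawlable. Whether this approach is cheaper than a compiled picomatch matcher in practice is untested.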

Some notes:

webpro commented 4 weeks ago

Knip does something similar, but only for .gitignore files. Pointers:

The key is to have functionality like this deepFilter. tinyglobby uses fdir, which has filter; maybe that's what we need here?

webpro commented 4 weeks ago

Scratch that, I mixed up filter and exclude.

Still, the deepFilter example might be useful, as I think it does the same thing as fdir#exclude.