siddharthkp / bundlesize

Keep your bundle size in check
MIT License

Allow glob pattern filename mapping to master #208

Open kanzelm3 opened 6 years ago

kanzelm3 commented 6 years ago

Do you want to request a feature or report a bug?

I want to request a feature.

What is the current behavior?

If you use a glob pattern in a path in your configuration, bundlesize is unable to perform a comparison to master if the filename has changed.

If the current behavior is a bug, please provide the steps to reproduce.

Use webpack to bundle your application and include [chunkhash] in the output file name (e.g. 'main.[chunkhash].js'). Use the following bundlesize configuration to analyze the main chunk:

  "bundlesize": [
    {
      "path": "./build/main.*.js",
      "maxSize": "400 kB"
    }
  ]

Now change your JavaScript and build again (to trigger a [chunkhash] change) and create a PR.

What is the expected behavior?

I would expect that if I use a glob pattern in my path, bundlesize would attempt to dynamically match main.fc12ht2.js with main.32oksadp.js (contrived example hashes haha) using glob pattern match groups. To implement this, I propose replacing node-glob with micromatch, because it can parse filenames into an AST. It is also faster and has a more robust feature set.

Parsing the glob match into an AST will allow me to read the text for each of the match groups and attempt to fuzzy-map the filenames to the master filenames where one or more of the match groups are equal. If there is no partial filename match in the master values, or there is more than one, I will move on to the next filename and log a debug message stating that I was unable to map the filename to master.

Examples

Let's say on master my build directory looks like this:

build
- main.a7584eae.js
- 0.22025039.chunk.js
- 1.17d36c31.chunk.js

And on my PR branch it looks like this:

build
- main.e18d7f43.js
- 0.e2984a0e.chunk.js
- 1.936ae43c.chunk.js

For my main bundle, I would have the following glob pattern:

'./build/main.*.js'

Which would have (simplified) match groups of ['./build/main.', '*', '.js'].

When I do a glob match on master I would get ['./build/main.', 'a7584eae', '.js'] in my AST as matches. (Again, that is a simplified value; I would have to pull those match group results out of the AST.)

When I do a glob match on my PR branch, I get ['./build/main.', 'e18d7f43', '.js'] as matches.

When I compare the two, I get a partial match on ['./build/main.', '.js']! And since the glob does not match any other filenames, I will assume that this is the same file on master, just with a different hash.

'./build/+([0-9]).*.chunk.js'

For this example, my (simplified) match groups would be ['./build/', '+([0-9])', '.', '*', '.chunk.js'].

When I do a glob match on master I would get ['./build/', '0', '.', '22025039', '.chunk.js'] as matches in my first filename AST.

When I do a glob match on my PR branch, I get ['./build/', '0', '.', 'e2984a0e', '.chunk.js'] as matches.

When I compare the two, I get a partial match on ['./build/', '0', '.', '.chunk.js']! The glob also matches 1.17d36c31.chunk.js on master, but that file differs in the numeric group, so no other filename produces this partial match and I will assume that this is the same file on master, just with a different hash.
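The match groups in the two examples above can be reproduced with a small sketch. It does not use micromatch or its AST. Instead it is a hand-rolled stand-in that turns a glob into a regular expression with one capture group per part, and it only understands `*` and `+([...])`. The helper names splitGlob, partToRegExp and matchGroups are hypothetical.

```javascript
// Split a glob into literal and wildcard parts.
// Assumption: only `*` and `+([...])` are recognised; real globs have far more syntax.
function splitGlob(glob) {
  return glob.split(/(\+\(\[[^\]]+\]\)|\*)/).filter((part) => part !== '');
}

// Escape characters that are special inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Turn one glob part into a capturing regular expression fragment.
function partToRegExp(part) {
  if (part === '*') return '([^/]*)';
  // '+([0-9])' becomes '([0-9]+)': one or more characters from the class.
  if (part.startsWith('+(')) return '(' + part.slice(2, -1) + '+)';
  return '(' + escapeRegExp(part) + ')';
}

// Return the text matched by each glob part, or null if the filename does not match.
function matchGroups(glob, filename) {
  const source = splitGlob(glob).map(partToRegExp).join('');
  const match = new RegExp('^' + source + '$').exec(filename);
  return match ? match.slice(1) : null;
}

console.log(splitGlob('./build/main.*.js'));
// [ './build/main.', '*', '.js' ]
console.log(matchGroups('./build/main.*.js', './build/main.a7584eae.js'));
// [ './build/main.', 'a7584eae', '.js' ]
console.log(matchGroups('./build/+([0-9]).*.chunk.js', './build/0.e2984a0e.chunk.js'));
// [ './build/', '0', '.', 'e2984a0e', '.chunk.js' ]
```

A real implementation would take these groups from the glob library rather than from a regular expression, but the shape of the data is the same.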

'./build/*.chunk.js'

For this example I would match ['./build/', '0.22025039', '.chunk.js'] and ['./build/', '0.e2984a0e', '.chunk.js'] on master and my PR branch respectively.

This would result in a partial match of ['./build/', '.chunk.js'] and because there are two filenames on master (0.22025039.chunk.js and 1.17d36c31.chunk.js) that partially match these groups, the mapping to master fails and we print the debug message 'Unable to map filename "./build/0.e2984a0e.chunk.js" to master with glob "./build/*.chunk.js"!'.
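The mapping step itself could be sketched as follows. This is one possible reading of the proposal, not a settled design: it assumes the match groups have already been extracted, compares them position by position, and picks the master file that shares the most groups, giving up on a tie. The names sharedGroupCount and mapToMaster are hypothetical.

```javascript
// Count the positions at which two match-group arrays hold the same text.
function sharedGroupCount(a, b) {
  if (a.length !== b.length) return 0;
  return a.filter((text, i) => text === b[i]).length;
}

// Map a PR file to a master file, or return null when there is no unique best candidate.
// masterFiles is an object of { filename: matchGroups }.
function mapToMaster(prFile, glob, prGroups, masterFiles, log = console.debug) {
  let best = [];
  let bestScore = 0;
  for (const [filename, groups] of Object.entries(masterFiles)) {
    const score = sharedGroupCount(prGroups, groups);
    if (score > bestScore) {
      best = [filename];
      bestScore = score;
    } else if (score === bestScore && score > 0) {
      best.push(filename); // tie with the current best candidate
    }
  }
  if (best.length !== 1) {
    log(`Unable to map filename "${prFile}" to master with glob "${glob}"!`);
    return null;
  }
  return best[0];
}

// './build/*.chunk.js': both master files share only the literal groups, so mapping fails.
const looseMaster = {
  './build/0.22025039.chunk.js': ['./build/', '0.22025039', '.chunk.js'],
  './build/1.17d36c31.chunk.js': ['./build/', '1.17d36c31', '.chunk.js'],
};
console.log(mapToMaster('./build/0.e2984a0e.chunk.js', './build/*.chunk.js',
  ['./build/', '0.e2984a0e', '.chunk.js'], looseMaster));
// null (after logging the debug message)

// './build/+([0-9]).*.chunk.js': the numeric group breaks the tie.
const strictMaster = {
  './build/0.22025039.chunk.js': ['./build/', '0', '.', '22025039', '.chunk.js'],
  './build/1.17d36c31.chunk.js': ['./build/', '1', '.', '17d36c31', '.chunk.js'],
};
console.log(mapToMaster('./build/0.e2984a0e.chunk.js', './build/+([0-9]).*.chunk.js',
  ['./build/', '0', '.', 'e2984a0e', '.chunk.js'], strictMaster));
// ./build/0.22025039.chunk.js
```

Under this reading, the more specific glob is what makes the chunk files mappable: the extra numeric group gives each file something stable to match on besides the hash.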

If this is a feature request, what is the motivation or use case for changing the behavior?

The bundle size analysis to ensure that a chunk is not greater than a given size is nice, but the feature I really love is the comparison to master to see how the bundle size has changed, and it stinks that since I use the [chunkhash] in the filename I am unable to utilize this killer feature.

I realize that this is a fairly big change, but I would be happy to implement it and put the glob parsing/mapping behavior behind a configuration flag if necessary. I just think this will make this tool that much more flexible and usable in more projects.

Please mention other relevant information.

TimonVS commented 5 years ago

Any update on this? I'd love to be able to use this with my create-react-app based application :)