OverZealous / cdnizer

Node module for replacing local links with CDN links, includes fallbacks and customization
MIT License
52 stars 24 forks source link

Strict regexes #40

Closed jgonggrijp closed 2 years ago

jgonggrijp commented 2 years ago

This fixes #29 by making the regular expressions in lib/util.js mutually exclusive and more accurate. By "more accurate", I mean that the patterns represent the HTML syntax better, allowing more in some places and less in other places.

I have double-checked that all tests still pass. Nevertheless, the changes to the regular expressions should be scrutinized closely. For this purpose, it is easiest to look at https://github.com/OverZealous/cdnizer/commit/338868bcc13160fb9cca80c3b46c11d3368664f4 in isolation.

The changes in the package-lock.json can be ignored, I only upgraded it to version 2.

Optionally I could add another commit to enforce that the closing quote of the URL is identical to the opening quote (i.e., both double or both single). This would require a named capturing group: (?<quote>["']) at opening and \k<quote> elsewhere in the pattern.

I have not looked into how this change might interact with #20.

OverZealous commented 2 years ago

Thank you for your PR!

I'll be honest, I don't use this library, and don't really have any intention of maintaining it. I'm not really sure I'm comfortable accepting these changes, because I don't know for sure what the side effects will be for existing users.

I think this would actually break previous fixes. [\s\S] was chosen specifically because it matches linebreaks, while matchers like [^'"] won't.

I'm also not sure I like adding all the extra whitespace into the regular expressions. This library is really intended to be run after a minifier, so there should be no reason to have whitespace around the attributes or closing element. Adding that opens the door to overmatching, and I'd prefer the library to remain simpler.

As a final note, before I would accept the PR there are a couple things would need to be done:

  1. Combine the various exploration commits into a single commit.
  2. Remove package.json from the commits, since they are not part of the changes.
  3. Instead of changing the tests, add new ones that also pass. Otherwise, this could be introducing incompatibilities.

(To be totally honest, this library should use an in-memory DOM parser and walk the tree, using regular expressions is a hack that only really works because this is intended for a fairly simplified use case.)

jgonggrijp commented 2 years ago

I'll be honest, I don't use this library, and don't really have any intention of maintaining it. I'm not really sure I'm comfortable accepting these changes, because I don't know for sure what the side effects will be for existing users.

In that case, let's discuss the matter before I put more work into the branch.

I think this would actually break previous fixes. [\s\S] was chosen specifically because it matches linebreaks, while matchers like [^'"] won't.

[^"'] matches everything, including linebreaks, except for quotes. So the only change here is that the URL part is no longer matching quotes, which is the reason why it can now be eager instead of lazy.

I'm also not sure I like adding all the extra whitespace into the regular expressions. This library is really intended to be run after a minifier, so there should be no reason to have whitespace around the attributes or closing element.

Well, I'm using it without a minifier.

Adding that opens the door to overmatching, and I'd prefer the library to remain simpler.

Please explain to me how this could lead to overmatching.

As a final note, before I would accept the PR there are a couple things would need to be done:

  1. Combine the various exploration commits into a single commit.

By my count, there is only one exploration commit: "Experimentally change order in #29 test fixture" (https://github.com/OverZealous/cdnizer/pull/40/commits/c8636880a7f9b33c3b11ad7cd130f11ffb5f0dd7). I don't think squashing that commit into any of the other commits would be a sensible combination, but if you insist I will comply with whatever rebase you have in mind. I can also drop the commit, because its change is not essential. I just left it in because it is informative.

  1. Remove package.json from the commits, since they are not part of the changes.

That's package-lock.json, which I'm just upgrading to a newer version (the version that comes with npm v7 and later). I can drop that commit as well, but the first next contributor who runs npm install will upgrade it as a side effect again.

  1. Instead of changing the tests, add new ones that also pass. Otherwise, this could be introducing incompatibilities.

May I remind you that this is a test that you wrote because of #29, based on my report of the bug, and that the version originally written by you did not actually reproduce the bug, which you were aware of at the time? I only removed the leading slashes in order to reproduce the bug, which I could afford because I was going to fix it.

That said, if you want to retain a copy of the original test with leading slashes, I will comply.

(To be totally honest, this library should use an in-memory DOM parser and walk the tree, using regular expressions is a hack that only really works because this is intended for a fairly simplified use case.)

Granted. We already discussed this in #29.

OverZealous commented 2 years ago

It's been 4 years since #29, so please forgive me if I don't remember the details—there's been a few things that have happened in since then. I didn't remember discussing this at all.

I can just accept your PR as-is, I guess I don't really care that much.

As I said, this isn't a library I care to maintain too much, since I don't use it at all. Please understand that accepting a PR means that I'm implicitly taking responsibility for those code changes. If they break something, then it comes back on me to fix it or reverse the changes. So I tend to be hesitant to accept PRs in general.

OverZealous commented 2 years ago

Published as v3.3.0

jgonggrijp commented 2 years ago

It's been 4 years since #29, so please forgive me if I don't remember the details—there's been a few things that have happened in since then. I didn't remember discussing this at all.

I realized that, that's why I reminded you of it.

I can just accept your PR as-is, I guess I don't really care that much.

As I said, this isn't a library I care to maintain too much, since I don't use it at all.

Out of curiosity: why don't you use it anymore? What has changed in your workflow?

If you no longer wish to maintain cdnizer, perhaps consider transferring the burden to somebody else.

Please understand that accepting a PR means that I'm implicitly taking responsibility for those code changes. If they break something, then it comes back on me to fix it or reverse the changes. So I tend to be hesitant to accept PRs in general.

Being the maintainer of a couple of libraries and tools myself, I understand the responsibility. Thanks for merging and releasing the changes anyway.

If complaints come in because of this change, I am willing to contribute a fix again. Please feel free to mention me.

OverZealous commented 2 years ago

Out of curiosity: why don't you use it anymore? What has changed in your workflow?

I don't like relying on externally served files in production. It's too easy to have something outside of your control go wrong, breaking the app. As long as you can get to my server, everything you need will load. I prefer to include all necessary scripts in my own bundles hosted on my own server. They are hashed and served with a very long cache, so in general the penalty is very low.

Plus with JS modules these days, it's less often I want to serve a full library—better to only import the exact code necessary.

If you no longer wish to maintain cdnizer, perhaps consider transferring the burden to somebody else.

I thought of this. Or at least adding additional maintainers. Maybe at some point.