httptoolkit / mockttp

Powerful friendly HTTP mock server & proxy library
https://httptoolkit.com
Apache License 2.0
774 stars 87 forks source link

Url RegExp match works not as expected with negative RegExp (when I want to proxy all except something) which contains comparison for hostname and for end of path #150

Closed stas-pavlov closed 1 year ago

stas-pavlov commented 1 year ago

Hello,

Your library is extremely useful, thank you!

It looks like RegexPathMatcher works not as expected with negative RegExp (when I want to proxy all except something) which contains comparison for hostname and for end of path

In the file src\rules\matchers.ts in function RegexPathMatcher which implements matches for url RegExp there are following code to match:

const absoluteUrl = normalizeUrl(request.url);
const urlPath = getPathFromAbsoluteUrl(absoluteUrl);

// Test the matcher against both the path alone & the full URL
const urlMatcher = new RegExp(this.regexSource, this.regexFlags);
return urlMatcher.test(absoluteUrl) ||
    urlMatcher.test(urlPath);

Which is Ok if your RegExp expression are positive, like you want to have something in the URL. However, it looks like, if you want to not have something in urls to catch it doesn't work as expected logically.

Example: Here is a negative RegExp ^(?!(.*test.examples.com|.*local.tests.examples2.com))|.*.(js|jpg|png|svg|ico|mp3|wav|json|css)$ so I don't want to proxy static content from specific sites, so I have both sites and extensions conditions in the regexp. As result for that url: http://test.examples.com/not_important_at_all.png?very-important-param=true if I use the regexp above in forGet(myRegExp) or any other methods where regexp can be provided and then check against URL to proceed.

I will have absoluteUrl="http://test.examples.com/not_important_at_all.png" and urlPath="not_important_at_all.png" for the first urlMatcher returns false and for the second it returns ture as the host path from RegExp is missed and we have OR between them:

return urlMatcher.test(absoluteUrl) ||
    urlMatcher.test(urlPath);
stas-pavlov commented 1 year ago

So, I'm curious why we need this second matching check if we have the first one? Isn't the first test enough?

pimterry commented 1 year ago

So, I'm curious why we need this second matching check if we have the first one? Isn't the first test enough?

It's partly UX and partly historical reasons. There are different use cases for Mockttp, and in some you're interested in some URLs but for others you're effectively always handling traffic for a single host, and so matching against the path is what makes sense. It's convenient to be able to cleanly support both, and for 99% of cases it works, but I see how that's problematic here. Unfortunately removing that is obviously backward compatible, and quite inconvenient for those path-centric use cases.

For the non-regex case this is easier to handle - we just check whether the passed URL to match is relative or absolute, and match it accordingly. Unfortunately we can't reliably do that with a regex though.

I would be open to a separate matching method to explicitly match against only the URL though, which should solve your problem. That would give you something like:

mockServer.forGet().withUrlMatching(/^(?!...)/).thenReply(200, ...);

The withUrlMatching method of course would just be tested against the full URL, not the path, and we could possibly later add a withPathMatching() method later if there were demand for that too, and we'd still have the forGet(matcher) for quick easy matching for the common case where it doesn't matter.

Would that work for you? PRs welcome if so.

stas-pavlov commented 1 year ago

Oh, I see.

Yes, this fix with separate matching method to explicitly match against only the URL absolutely works for me! I'm on to create a PR for that.