elazarl / goproxy

An HTTP proxy library for Go
BSD 3-Clause "New" or "Revised" License
5.89k stars 1.07k forks source link

Avoid creating bad URLs when a mitm request has the wrong scheme #528

Open mctofu opened 3 months ago

mctofu commented 3 months ago

If a mitm request came in like http://host/path this was getting mangled to https://hosthttp://host/path. Instead, we should just swap the scheme to https if the scheme is already defined in the url.

elazarl commented 3 months ago

Can you describe the bug that this fixes?

mctofu commented 3 months ago

We're still investigating this & working on a test but have been observing that the npm cli is not always working with the proxy in mitm mode.

When running npm install npm@10.5.0 on a fresh project we've seen these requests to the proxy:

CONNECT registry.npmjs.org:443 HTTP/1.1
GET http://registry.npmjs.org:443/npm HTTP/1.1

This proxy then detects the url does not match the https regex and transforms it to https://registry.npmjs.org:443http://registry.npmjs.org:443/npm. This eventually fails dns resolution later on and closes the connection without returning an error.

Other requests from the npm cli use https instead of http so this doesn't seem to be consistent and we haven't looked to see what the cause is yet.

I'm not sure what the right behavior is in this case. I've observed that mitmproxy v8.1.1 seems to be allowing this by converting the url to https.

elazarl commented 3 months ago

but it shouldn't use GET http://... should it? If it really uses goproxy as a standard HTTP proxy

mctofu commented 3 months ago

but it shouldn't use GET http://... should it? If it really uses goproxy as a standard HTTP proxy

It does seem strange but I'm not positive it's not allowed. I found this request does work successfully when I configure npm to use a squid proxy without mitm handling so registry.npmjs.org is not rejecting the request.

jeffwidman commented 2 months ago

@mctofu any suggestions on next steps here?

mctofu commented 2 months ago

@jeffwidman I think we need to add a test case for this before undrafting.

We could also followup with the npm team on this behavior. I have a vague memory of running into this before.