Open jcarlsonautomatiq opened 1 year ago
Hey, @jcarlsonautomatiq. Thanks for reporting this.
Not sure what's going off here. The path
option in ClientRequest
must always be a string starting with /
. It appears that when using hpagent
, it's not the case. This may as well be one of the many real-world use cases that are tricky and have to be handled in getUrlByRequestOptions
.
We need to add a test for this.
@jcarlsonautomatiq can you please share how you intend to mock those requests?
Your original example also has a mistake: got
cannot accept that proxy agent as the value of agents.https
. Doing so produces both TypeScript and runtime errors.
TypeScript:
No overload matches this call.
The last overload gave the following error.
Property 'options' is missing in type 'HttpProxyAgent' but required in type 'Agent'.ts(2769)
https.d.ts(31, 9): 'options' is declared here.
Runtime:
RequestError: Protocol "https:" not supported. Expected "http:"
❯ Request._makeRequest ../node_modules/.pnpm/got@11.8.6/node_modules/got/dist/source/core/index.js:1191:19
❯ new ClientRequest ../node:_http_client:189:11
I can help with fixing this but I need an exact example that is complete and working (well, failing, in our case).
This code is currently being migrated from raw working but untested JS to Typescript which was one of the reasons I was actually adding the unit tests. I wonder if there is an update in got as well we are on "got-cjs": "^12.0.1" and it doesn't cause a protocol error. I will see about stripping this down to remove all the company stuff.
As for testing it I was hoping I would be able to intercept the proxy url and then look at the original request to determine what to return (or just grab a ?url out of the call). A little hacky but it would let me use the proxy calls and ensure I didn't break anything in the migration. I will hack up a proper working snippet this morning.
The issue with the path originates from here: https://github.com/delvedor/hpagent/blob/96f45f1d40bfbdfd0fcc84cdba056be6e0fb8f4c/index.js#L23
I'm not sure what's the intention here by providing the following request options to create a request:
[
{
method: 'CONNECT',
host: 'fake.proxy',
port: '443',
path: 'fake.ping:80',
setHost: false,
headers: {
connection: 'keep-alive',
host: 'fake.ping:80',
'proxy-authorization': 'Basic cHJveHlVc2VybmFtZTpwcm94eVBhc3N3b3Jk'
},
agent: false,
timeout: 0
}
]
What's the intended end request URL here, I wonder?
I can assume it's https://fake.proxy/fake.ping:80
. Once I do that, we get the next error:
RequestError: 'CONNECT' HTTP method is unsupported.
And this one is far more interesting. See, the Fetch API specification forbids creating requests with CONNECT
method:
new Request('/resource', { method: 'CONNECT' })
This fails standalone.
The code is using some long lived proxies from Nimble. I expect that might be some of the strange connection issue but that is definitely odd. And apologies I should have had the fully working example in place initially.
import { HttpsProxyAgent } from 'hpagent'; // version: <edit>. Updated to 1.2.0 and it still will repro.
import { got } from 'got-cjs'; // version: 12.5.4
const host = 'fake.proxy';
const username = "proxyUsername";
const password = "proxyPassword";
const port = '443';
let proxyUrl = `http://${username}:${password}@${host}:${port}`;
let proxy = new HttpsProxyAgent({
keepAlive: true,
keepAliveMsecs: 1000,
maxSockets: 256,
maxFreeSockets: 256,
scheduling: 'lifo',
proxy: proxyUrl,
});
// Then used in a got call like this:
const PING_URL = "https://fake.ping/pinging";
await got.get({
url: PING_URL,
throwHttpErrors: false, // No idea the code thought it should do this?
agent: { https: proxy},
https: { rejectUnauthorized: false },
});
It seems like if the invalid URL error is fixed this code test will probably have to be moved to the browser version of the MSW lib and only tested e2e which should be fine. Plus eventually I will run into the same agent and type conflicts :)
I will look more at the hpagent code that it might need a note about not being used in Node directly. Or forcing a custom http agent when used with a keepAlive connection outside a browser.
https://stackoverflow.com/questions/62500011/reuse-tcp-connection-with-node-fetch-in-node-js
Okay, so learned a lot from the Node.js docs, Got, and hpagent
here. Sharing below.
Turns out, when performing the CONNECT
method in ClientRequest
, the path
option is expected to point to the target host:
const options = {
port: 1337,
host: '127.0.0.1',
method: 'CONNECT',
path: 'www.google.com:80',
}
const req = http.request(options)
Excerpt from the example in Node.js docs
Moreover, when the server receives that request, the req.url
already equals to www.google.com:80
. This is some Node.js server magic I presume, but Interceptors must do the same thing (req.method
remains CONNECT
).
This is what happens to the request in a simplified proxy setup (following the same Node.js example):
CONNECT
method and <target>
path
.connect
event where req.method
is CONNECT
and req.url
is <target>
.req.url
.200 Connection <target>
HTTP message (direct request.write()
). head
from the connect
event to the request also. connect
event with three arguments:
res
, IncomingMessage
. This is the `200 Connection socket
, net.Socket
, the socket of the new connection to the <target>
. head
, Buffer
, whatever head
server has written in its connect
event.In the connect
request callback, you can then issue new requests by writing directly onto the target socket
and receiving the response from the target via socket.on('data')
.
I assume that last part is handled by hpagent
here because it's rather low-level.
Basically, we don't support this message exchange right now. This is a new feature request and I will prioritize it accordingly. Will push the WIP pull request I have right now with the failing test and some findings. Thanks for making this apparent, @jcarlsonautomatiq!
I'm not sure as of now how to properly support this. Since the connect
event on ClientRequest
exposes an extraneous Socket
, it means we have to support the socket-level messaging in Interceptors (see #399). This isn't that easy and I'd rather avoid it if we can.
Thanks for putting all the information in here it makes it a lot easier to think it through! If I might recommend a possible test framework solution I think just about everyone would accept it might go like this (I certainly would):
MSW is a great library and I have used it successfully in other "got" calls but in one part of my unit testing I was actually checking that a proxy call would actually try and use the proxy. Unfortunately I think the way that the path and proxy information is generated results in an invalid URL which bails before you can get a matcher to handle it. The proxy host is correctly assigned but then the path(the actual url we call) is just directly concatenated as a path.
Code that pretty much reproduces it.