mswjs / interceptors

Low-level network interception library.
https://npm.im/@mswjs/interceptors
MIT License

Docs: Socket-based interceptor #399

Closed kettanaito closed 5 months ago

kettanaito commented 1 year ago

Socket Interception

ideas

Over the years, I have sat down to write a Socket-based interceptor at least half a dozen times. It never ended fruitfully. This time, I decided to at least document my findings and struggles so I don’t spend so much time finding my way through the same waters once more.

Intention

The intention here is very simple: the network in Node.js goes through net.Socket, no matter who initiates the request or how. http.ClientRequest constructs a Socket instance, fetch (Undici) constructs a Socket instance, and even bi-directional communication like WebSockets is implemented on top of Socket. Tapping into net.Socket directly would mean even less request-specific code.

Obstacles

net.Socket is omitted sometimes

Contrary to my statement in the Intention section, a request made in Node.js does not always go through a call to net.Socket, net.connect(), or net.createConnection(). For example, a plain http.get() call will not result in any of those functions being called. There’s quite a chunk of internal logic in Node.js around creating and managing sockets (hence the Agent), and I can only speculate that it doesn’t create a new socket but somehow relies on an already created one.

Socket is too low-level

Socket handles Buffers sent and received over the network. We are talking about raw HTTP messages and raw Readable streams. Reading those isn’t a problem; re-creating them is.

Re-creating them eventually becomes a necessity: if you extend net.Socket and emulate a successful connection to a host (the actual connection would error because the host is mocked), you end up managing the entire socket’s life cycle by yourself.

const net = require('node:net')

class Socket extends net.Socket {
  connect(...args) {
    // This is enough to pretend that the Socket instance
    // has looked up, resolved, and connected to a host.
    this.connecting = true
    // 'lookup' listeners receive (error, address, family, host).
    this.emit('lookup', null, '0.0.0.0', 4, 'hostname')
    // 'connect' listeners receive no arguments.
    this.emit('connect')
    this.connecting = false
    return this
  }
}
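To see what that override buys you, here is a minimal sketch that attaches listeners and then calls connect(). The MockSocket name, the localhost values, and the events array are mine, for illustration only. The events are emitted synchronously inside connect(), so any listener added afterwards would miss them.

```javascript
const net = require('node:net')

// Illustrative subclass (hypothetical name): pretends to connect
// without ever touching the network.
class MockSocket extends net.Socket {
  connect() {
    this.connecting = true
    // 'lookup' listeners receive (error, address, family, host).
    this.emit('lookup', null, '127.0.0.1', 4, 'localhost')
    this.emit('connect')
    this.connecting = false
    return this
  }
}

const events = []
const socket = new MockSocket()

// Listeners must be attached before connect() is called.
socket.on('lookup', (error, address, family, host) => {
  events.push(`lookup ${host} -> ${address}`)
})
socket.on('connect', () => events.push('connect'))

socket.connect(80, 'localhost')

console.log(events)
// [ 'lookup localhost -> 127.0.0.1', 'connect' ]
```

This only covers the events. Nothing here makes the socket writable or readable, which is where the rest of the state comes in.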

There are more parts of the socket that depend on its state. For instance, you have to have this._handle set, which is a reference to the TCP or Pipe wrap responsible for piping chunks through the socket.

To make things worse, those handles are internal in Node.js and you cannot construct them by yourself. Reference.

You end up implementing things like handle.writeLatin1String, handle.readStart(), handle.close(), and so forth. Honestly, this is too much.

Good things

On the bright side, or, rather, the side I’ve never got to in order to prove it’s even worse, reading/writing chunks is a matter of manipulating the socket’s stream. The worst part about it is having to translate the HTTP request message to some consumable format, like Request, and translate the mocked Response to the HTTP response message. Neither of those is a deal-breaker.

Node.js doesn’t expose its HTTP message parser. It’s present on ClientRequest and the socket instances but you cannot operate with it easily.
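Since the built-in parser is out of reach, the translation has to be done by hand. Below is a rough sketch of both directions, assuming Node.js 18+ for the global Request, Response, and Headers. The helper names toFetchRequest and toHttpMessageHead are hypothetical, and the parsing is deliberately naive: it assumes the entire message head arrives in a single chunk and ignores bodies, chunked encoding, and anything beyond HTTP/1.1.

```javascript
// Hypothetical helper: raw HTTP/1.1 request head -> Fetch API Request.
function toFetchRequest(rawMessage, protocol = 'http:') {
  const [head] = rawMessage.toString('latin1').split('\r\n\r\n')
  const [requestLine, ...headerLines] = head.split('\r\n')
  const [method, path] = requestLine.split(' ')

  const headers = new Headers()
  for (const line of headerLines) {
    const separatorIndex = line.indexOf(':')
    headers.append(
      line.slice(0, separatorIndex),
      line.slice(separatorIndex + 1).trim()
    )
  }

  // The request line only carries the path, so the origin
  // is rebuilt from the Host header.
  const url = new URL(path, `${protocol}//${headers.get('host')}`)
  return new Request(url, { method, headers })
}

// Hypothetical helper: Fetch API Response -> raw HTTP/1.1 response head.
function toHttpMessageHead(response) {
  const statusLine = `HTTP/1.1 ${response.status} ${response.statusText}`
  const headerLines = Array.from(
    response.headers,
    ([name, value]) => `${name}: ${value}`
  )
  return [statusLine, ...headerLines, '', ''].join('\r\n')
}

const request = toFetchRequest(
  Buffer.from('GET /resource?id=1 HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n')
)
console.log(request.method, request.url)
// GET http://example.com/resource?id=1

const head = toHttpMessageHead(
  new Response(null, {
    status: 204,
    statusText: 'No Content',
    headers: { 'X-Mocked': 'yes' },
  })
)
console.log(JSON.stringify(head))
// "HTTP/1.1 204 No Content\r\nx-mocked: yes\r\n\r\n"
```

A real implementation would also have to stream request and response bodies and handle messages split across several chunks, which is where most of the effort would go.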

Another good thing is that a socket observer is trivial. It can be achieved with revocable Proxies spying on the .write and .on('data') methods/events. But an observer alone is not enough to fulfill the interception requirements.
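As an illustration of the observer idea, here is a minimal sketch built on Proxy.revocable(). The observeSocket helper and its callback shape are assumptions of mine, not part of the library. A PassThrough stream stands in for a real socket so the example runs without a network.

```javascript
const { PassThrough } = require('node:stream')

// Hypothetical helper: wraps a socket-like Duplex in a revocable Proxy
// that reports outgoing writes and incoming 'data' chunks.
function observeSocket(socket, onChunk) {
  return Proxy.revocable(socket, {
    get(target, property, receiver) {
      if (property === 'write') {
        return (chunk, ...rest) => {
          onChunk('outgoing', chunk)
          return target.write(chunk, ...rest)
        }
      }

      if (property === 'on') {
        return (event, listener) => {
          // Only 'data' listeners are wrapped; others pass through as-is.
          const wrapped =
            event === 'data'
              ? (chunk) => {
                  onChunk('incoming', chunk)
                  listener(chunk)
                }
              : listener
          target.on(event, wrapped)
          return receiver
        }
      }

      const value = Reflect.get(target, property, target)
      // Bind methods to the real socket so internal state keeps working.
      return typeof value === 'function' ? value.bind(target) : value
    },
  })
}

const seen = []
const fakeSocket = new PassThrough() // stand-in for net.Socket
const { proxy, revoke } = observeSocket(fakeSocket, (direction, chunk) => {
  seen.push([direction, chunk.toString()])
})

proxy.on('data', () => {})
proxy.write('GET / HTTP/1.1\r\n\r\n') // recorded as 'outgoing' right away

// After revoking, any access through the proxy throws a TypeError,
// while the underlying socket stays untouched.
revoke()
```

Note that this sketch only observes: it cannot prevent the write from reaching the underlying socket or substitute a mocked response. Removing a wrapped 'data' listener through the proxy would also need extra bookkeeping.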

Where to go from here

Socket-based interception is, technically, possible. But it’s a big question whether it’s worth the price. Maintaining such an interceptor would require deep knowledge of Node.js internals and keeping track of how they change across Node.js releases.

We may need to revisit this in light of WebSocket support, but given that there’s no standard protocol for WebSockets, and the most popular solutions implement it over HTTP/XHR polling anyway, perhaps Socket-based interception won’t be needed at all.

kettanaito commented 1 year ago

Socket is also sensitive to import order. Since most developers don't import it directly, the Interceptor must be imported before any third-party module that does. This is sometimes hard to achieve.
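The ordering problem can be reproduced in a few lines. This is a simplified sketch, assuming the interceptor works by swapping the net.Socket export. The thirdParty and lateImport objects are stand-ins for modules that read net.Socket before and after the patch is applied.

```javascript
const net = require('node:net')

// Stand-in for a third-party module that was imported first and
// kept its own reference to the Socket class.
const thirdParty = { Socket: net.Socket }

// The interceptor is applied afterwards and swaps the export.
const OriginalSocket = net.Socket
class InterceptedSocket extends OriginalSocket {}
net.Socket = InterceptedSocket

// Stand-in for a module imported after the patch.
const lateImport = { Socket: require('node:net').Socket }

const earlySeesPatch = thirdParty.Socket === InterceptedSocket
const lateSeesPatch = lateImport.Socket === InterceptedSocket

console.log({ earlySeesPatch, lateSeesPatch })
// { earlySeesPatch: false, lateSeesPatch: true }

// Restore the original export.
net.Socket = OriginalSocket
```

The early module keeps constructing the original class, so its traffic never reaches the interceptor.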

kettanaito commented 5 months ago

Released: v0.32.0 🎉

This has been released in v0.32.0!

Make sure to always update to the latest version (npm i @mswjs/interceptors@latest) to get the newest features and bug fixes.

