Open aronson opened 1 year ago
I am still somewhat new to the codebase, so this is a general question and not feedback on your PR (which looks solid btw), but how does this prevent the code from sending messages over the standard socket connection? Judging by the naming and the hookCall source code it seems that hookCall doesn't replace the method, just allows executing additional code before/after it.
Also, hostname might not be the clearest name for the path of the WebSocket connection, since it could be a route on a server (for example, wss://example.com/websocket
).
how does this prevent the code from sending messages
The hookCall
method passes fn, ...args
to its handler(s) which are a function reference and the arguments that would be passed to the function. The method does not call the original fn
itself, it just passes it as a value.
The first argument in the handler is ignored in my implementation. I simply don't call the original method.
However, working on tests I've discovered this approach has one key issue. The original client provides a disconnect()
method that is a void
. It would make the most sense to make disconnect
a Promise<void>
as the websocket.close()
method is a promise. I'm not sure we want to change the public API surface. I wonder if this suggests an entirely new WebSocketClient
is the right approach.
hostname might not be the clearest name for the path of the WebSocket connection
Good catch, I'll make the adjustment.
Awesome PR π
I would not have thought of doing this with a mix of plugin/hooks but I find this way to go very interesting.
I initially thought of a common "transport layer" used by the core client which could be seen as an "adapter" in order to abstract TCP and WebSockets implementations. For instance, the existing code related to TCP would be moved to core/transport/tcp.ts
and the new WebSocket implementation to core/transport/websocket.ts
. Each implementation should implement send / connect (and receive) / disconnect
. The core/client.ts
would be in charge of using the appropriate implementation (whether tcp
or websocket
) when it instantiates its transport layer.
But now I wonder if it would not be more relevant/easier of doing it with plugin. Maybe the WebSocket support should be seen as an extension and not a core feature. π€
how does this prevent the code from sending messages over the standard socket connection? Judging by the naming and the hookCall source code it seems that hookCall doesn't replace the method, just allows executing additional code before/after it
Currently it works like following:
beforeCall
allows to execute some code before the call of a given function (the original function will be called)afterCall
allows to execute some code after the call of a given function (the original function will be called)hookCall
is the base hook function (used by beforeCall
and afterCall
) and gives you the ability to execute some code and lets you call (or not) the original function yourselfJust an update with some thoughts. Working on this again this evening, plan to push some tests, not sure it'll be complete but we'll see. Still need to figure out how to handle the websocket close on disconnect or else the library will leak promises when torn down.
I do think websocket support is best as a pluggable feature given the nature of it being pluggable on the remote side and in the spec in general. It is in an interesting place where it is a transport and not a protocol level feature as mentioned before, but I think architecting it this way makes the most sense (the disconnect: Promise<void>
issue aside). In the IRCv3 spec it's an extension AFAICT, so perhaps best to treat it as one on our side as well from a domain perspective.
I'm going to remove the use of the WebSocketClient/Server library I imported. A colleague showed me how to do it with just Deno standard libraries and the third-party library is proving difficult to mock. I also plan to bring in npm:mock-socket for the testing effort.
It appears removing the 3rd party library has solved the promise issue. websocket.close()
is synchronous in the standard library. How odd...
I think we're ready for review. I added a test scenario covering initial connection and registration with a mock websocket server.
Also snuck in a fix to lower the wait times on the antiflood test to make it execute in a quarter of a second instead of one.
A few questions: 1) Where does port 8067 come from? I think it might be better to just try connecting over 80/443 instead, since WebSockets relies on HTTP(s) to get off the ground. 2) What is client.state.remoteAddr used for? We're kind of put in an awkward position where if the remote is a path instead of a bare hostname and the remoteAddr gets passed to something expecting a hostname, stuff breaks. On the other hand parsing the hostname out of the path is also pretty wrong, since that's not what we're connected to. 3) Also, how come we aren't negotiating a particular protocol (either text.ircv3.net or binary.ircv3.net)?
Thanks both of you for the information on hooks by the way, very helpful!
It appears removing the 3rd party library has solved the promise issue. websocket.close() is synchronous in the standard library. How odd...
Makes sense, the WebSocket API is event-based and not async-await (just like our library actually). I did wonder why you pulled in the third-party dep but figured the reasoning was just that websockets are just awkward to use (full duplex and all that).
I do think websocket support is best as a pluggable feature given the nature of it being pluggable on the remote side and in the spec in general.
Agreed, and this particular one is quite a clever implementation as well.
Where does port 8067 come from?
I saw it in some example configurations for Ergo when researching the protocol. It comes from the typical convention of prepending the HTTP port 80
in front of the last two digits of the original protocol port when wrapping a TCP protocol in a web server transport. Ergo ships websocket support with TLS on 8097
by default.
Reading the protocol doc again, I see the argument to default to 80/443. I'll do that.
What is client.state.remoteAddr used for?
It's only used within the original codebase inside the original TCP client, its test, and for printing the "server hostname" (in this case now "server URL"). This state is exposed on the public API surface so renaming it may be undesirable. We could remove its use entirely in the websocket client, or we could indeed parse out the hostname and expose a new property path?: string
to fill the gap. Would that approach work for you?
I think we can rely on consumers of this library who have a need to programmatically access the specific state that they passed themselves to the client to consume a new property when upgrading the transport protocol of their codebase.
how come we aren't negotiating a particular protocol
I found with the existing server implementation I tested with we didn't have to -- Ergo at least assumes the protocol from our first payload and we're sending it in the MUST implement binary format. The spec guides servers to support clients that don't implement the negotiation.
However, let me add this negotiation in with the option to support text
as well. That's pretty important for maximum compat.
EDIT: I'm also going to test against all three server implementations once that's complete.
I'll be able to make these changes later today.
Seems I was mistaken, I did not implement the protocol correctly. The binary protocol is now implemented as well, and negotiated first.
I tested against all three server implementations through discord-irc confirming messages are passed and parsed.
I found this excellent IRC websocket tester site that includes minimal code examples as well as test URLs to register with implementations of these IRC servers using websockets.
The minimal example fails as well...
function delay(ms = 0) {
return new Promise((resolve) => setTimeout(resolve, ms));
}
let socket = new WebSocket('wss://testnet.inspircd.org:8097', ['binary.ircv3.net', 'text.ircv3.net']);
socket.addEventListener('error', error => {
console.log(error);
});
socket.addEventListener('open', () => {
try {
socket.send('NICK Example9999');
socket.send('USER 0 0 0 0');
} catch (e) {
console.log(e);
}
});
socket.addEventListener('message', async e => {
var line = e.data;
if (line instanceof Blob) {
line = await e.data.text();
}
console.log(line);
})
await delay(10000);
deno run -A test.ts
ErrorEvent {
bubbles: false,
cancelable: false,
composed: false,
currentTarget: WebSocket {
url: "wss://testnet.inspircd.org:8097/",
readyState: 3,
extensions: "",
protocol: "",
binaryType: "blob",
bufferedAmount: 0
},
defaultPrevented: false,
eventPhase: 2,
srcElement: null,
target: WebSocket {
url: "wss://testnet.inspircd.org:8097/",
readyState: 3,
extensions: "",
protocol: "",
binaryType: "blob",
bufferedAmount: 0
},
returnValue: true,
timeStamp: 0,
type: "error",
message: "NetworkError: failed to connect to WebSocket: Invalid status code",
filename: "",
lineno: 0,
colno: 0,
error: DOMException: failed to connect to WebSocket: Invalid status code
}
Not sure what to make of this. Perhaps we open an issue upstream?
It looks like an issue with the WebSocket implementation in Deno?
I'll have to wireshark the HTTP interaction during the websocket upgrade, but I'm inclined to think this is a Deno correctness feature where invalid upgrade attempts are marked as invalid, whereas Node would just allow them anyways.
This might be an InspIRCd bug if the upgrade exchange proves to indeed send the wrong status code. Don't have enough evidence yet to know for sure.
The InspIRCd server is returning a 400 on the HTTP GET request. This log is my failed connection to an InspIRCd test server, and then a successful connection to my local Ergo server.
This is the details of the request Deno sends (node sends a similar request):
Hypertext Transfer Protocol
GET / HTTP/1.1\r\n
[Expert Info (Chat/Sequence): GET / HTTP/1.1\r\n]
Request Method: GET
Request URI: /
Request Version: HTTP/1.1
user-agent: Deno/1.37.2\r\n
host: testnet.inspircd.org:8067\r\n
upgrade: websocket\r\n
connection: Upgrade\r\n
sec-websocket-key: hzrVmFpj9t85Wv8/sNcGeg==\r\n
sec-websocket-version: 13\r\n
sec-websocket-protocol: binary.ircv3.net, text.ircv3.net\r\n
\r\n
[Full request URI: http://testnet.inspircd.org:8067/]
[HTTP request 1/1]
[Response in frame: 63044]
I tested with a minimal node.js example and captured the traffic.... and it's the same error with my insecure InspIRCd test server.
const WebSocket = require('ws');
function delay(ms = 0) {
return new Promise((resolve) => setTimeout(resolve, ms));
}
const socket = new WebSocket('ws://[REDACTED]:7002', ['text.ircv3.net', 'binary.ircv3.net']);
socket.addEventListener('error', error => {
console.log(error);
});
socket.addEventListener('open', () => {
try {
socket.send('NICK Example2000');
socket.send('USER 0 0 0 0');
} catch (e) {
console.log(e);
}
});
socket.addEventListener('message', async e => {
var line = e.data;
if (line instanceof Blob) {
line = await e.data.text();
}
console.log(line);
});
(async () => {
await delay(10000);
})();
Looks like I'm opening a bug upstream with InspIRCd unless the development version I'm compiling now fixes it. For our purposes I think it's OK to say we're in the clear when even the minimal example under node fails. Looks like this specific implementation only works from browsers right now.
EDIT: Issue present on master branch of InspIRCd as well.
Thank you for taking a look π We'll merge this PR soon if it seems OK for you.
The fact that the same example with Node.js fails tends to confirm that the cause comes from the ircd. π
I'm OK with merging this now after your review. I've addressed some issues in the original implementation and we have solid tests.
I added some unit tests for the text-protocol-only case as well as testing some of the error handlers and internal state cleanup code. I got it to 81.818% (81/99 lines) coverage. All that's left are catch blocks I'm having trouble getting to trip.
One last-minute update before review. I had an oversight in testing websocket connections that have a path. I've added an optional parameter to connect
to handle paths and updated the documentation, as well as updated the tests to ensure the path logic is covered.
InspIRCd issue is tracked here
Update after talking with the maintainer of InspIRCd:
This "issue" in InspIRCd is not really an issue per se, it's a logical misuse of the websocket feature. Websocket support is intended for browser environments. We're (Deno, really) not sending an Origin: ${someUrl}
header because it doesn't make sense to do so in a desktop/server application. Websocket support is intended for webapps that IRC server operators control. InspIRCd is doing the "correct" thing and rejecting the request as it can't validate the origin of the request for anti-spam protection.
So, if we ran this library in a web browser, the connection to InspIRCd would work (or be rejected by the server configuration) as the Origin
header would be set by the framework. Nothing to be concerned about.
I added some unit tests for the text-protocol-only case as well as testing some of the error handlers and internal state cleanup code. I got it to 81.818% (81/99 lines) coverage. All that's left are catch blocks I'm having trouble getting to trip.
Some parts are hard to test and don't add much value. Nominal case is the most important for now π
Update after talking with the maintainer of InspIRCd:
This "issue" in InspIRCd is not really an issue per se, it's a logical misuse of the websocket feature. Websocket support is intended for browser environments. We're (Deno, really) not sending an
Origin: ${someUrl}
header because it doesn't make sense to do so in a desktop/server application. Websocket support is intended for webapps that IRC server operators control. InspIRCd is doing the "correct" thing and rejecting the request as it can't validate the origin of the request for anti-spam protection.So, if we ran this library in a web browser, the connection to InspIRCd would work (or be rejected by the server configuration) as the
Origin
header would be set by the framework. Nothing to be concerned about.
I read the issue so the good place for WebSocket support is indeed in a plugin.
Although the final purpose of it is to work in a browser (not in Node.js or Deno), it's great to have it for our lib.
I have a little API suggestion about connect()
arguments:
async connect(
hostname: string,
port = PORT,
tls = false,
_path?: string,
): Promise<Deno.Conn | null>
client.connect("irc.example.org", 8097, true, "pathTo/Irc");
I should have change it before, but now that we add extra argument, I wonder if all optional arguments could not be passed in an object option:
type ConnectOptions = {
port?: number
tls?: boolean;
path?: string;
}
async connect(
hostname: string,
options: ConnectOptions = {}
): Promise<Deno.Conn | null> {
options.port ??= PORT
options.tls ??= false
...
}
// usages
client.connect("irc.example.org", { port: 8097, tls: true });
client.connect("irc.example.org", { path: "pathTo/Irc" });
WDYT? It would be more API conform to Deno best practices according to https://docs.deno.com/runtime/manual/references/contributing/style_guide#exported-functions-max-2-args-put-the-rest-into-an-options-object
Not sure about including optional port since we often change it (especially when SSL is enabled).
I think that's a solid change. If we're going to change the public API surface we should use this ConnectOptions
pattern.
Reading that section of your link, I agree with the guideline.
Not sure about including optional port
I think it's best for the port to be required. The port's default state would otherwise also rely on the potentially undefined state of tls
and it gets even more complicated as we get into websocket support where there are two sets of generally accepted defaults and we'd have to pick a side.
This might suggest port goes outside of ConnectOptions
as the second (required) parameter to the connect
function. What're your thoughts on that? I could see it either way there.
I think that's a solid change. If we're going to change the public API surface we should use this
ConnectOptions
pattern.Reading that section of your link, I agree with the guideline.
Not sure about including optional port
I think it's best for the port to be required. The port's default state would otherwise also rely on the potentially undefined state of
tls
and it gets even more complicated as we get into websocket support where there are two sets of generally accepted defaults and we'd have to pick a side. This might suggest port goes outside ofConnectOptions
as the second (required) parameter to theconnect
function. What're your thoughts on that? I could see it either way there.
I think you are right. Due to the multiple choices we have now (TCP, secure TCP, WebSockets, ...), maybe having the port mandatory would be a good thing in order to make it more explicit.
Otherwise I initially thought of having these usages with some predefined ports:
const client = new Client({ ... }); // websocket: false
client.connect("irc.example.org"); // use default port 6667
client.connect("irc.example.org", { tls: true }); // use default port 6697
client.connect("irc.example.org", { tls: true, port: 7000 }); // use custom port
const client = new Client({ websocket: true });
client.connect("irc.example.org", { path: "pathTo/Irc" }); // use default port 80
client.connect("irc.example.org", { path: "pathTo/Irc", tls: true }); // use default port 443
client.connect("irc.example.org", { path: "pathTo/Irc", tls: true, port: 8097 }); // use custom port
But, especially for WebSocket cases, making the port optional could probably cause unexpected errors.
Sorry for the delay in response and my lack of involvement with this feature in general, been kinda slammed this week with life stuff.
This "issue" in InspIRCd is not really an issue per se, it's a logical misuse of the websocket feature. Websocket support is intended for browser environments. We're (Deno, really) not sending an Origin: ${someUrl} header because it doesn't make sense to do so in a desktop/server application. Websocket support is intended for webapps that IRC server operators control. InspIRCd is doing the "correct" thing and rejecting the request as it can't validate the origin of the request for anti-spam protection.
Honestly it sounds like a bug for even farther upstream. If Deno promises browser environment compat (screenshot from their website), then they should either allow configuring Origin or (worse way imo) inferring and sending a value.
Re: the WebSocket port number, I'd love if we could infer it, but the jankiness with 80/443/8067/8097/ something else entirely sways me toward just making it a required param. Maybe we could explain it in the docs or print a nice error with some values for ports (although I think anyone mucking about this deep on the IRC iceberg already has an idea what port number to use).
I added some unit tests for the text-protocol-only case as well as testing some of the error handlers and internal state cleanup code. I got it to 81.818% (81/99 lines) coverage.
Offtopic, but how do you measure codecov? Do you just calculate it manually?
how do you measure codecov
make test-coverage
π
how do you measure codecov
make test-coverage
π
Okay, now I feel stupid.
Re: the WebSocket port number, I'd love if we could infer it, but the jankiness with 80/443/8067/8097/ something else entirely sways me toward just making it a required param. Maybe we could explain it in the docs or print a nice error with some values for ports (although I think anyone mucking about this deep on the IRC iceberg already has an idea what port number to use).
On second thought, from an ergonomic standpoint, I don't know if it'd be the best thing. We would create an API difference between WebSockets and TCP transports which is not strictly necessary (when I say strictly necessary, I mean the websocket: true
arg to Client().
To be fair this is a pretty contrived usecase, but if I were working on TCP and then realised I needed to be using WebSockets to connect, I'd expect to add websocket: true
to the constructor and just have it magically work.
But then again, there's no sane default port number for IRC-over-WebSockets. Is this a bug in the spec? :3
Don't worry about any delay, we're all volunteers just trying to make this better. I don't think there's any rush!
it sounds like a bug for even farther upstream
This may be true. I was somewhat astonished we couldn't set the Origin ourselves without re-implementing websockets from the ground up. I will draft up something to put on their tracker. We'll see what upstream thinks.
anyone mucking about this deep on the IRC iceberg already has an idea what port number to use
My thoughts exactly. Either they've set up the server themselves, or they're likely already active on the server they want to connect to and therefore have already figured out the details. I don't think anyone is using this library in a vacuum.
When I'm free later today I'll make the port required as the second parameter and implement ConnectionOptions
.
I'd expect to add websocket: true to the constructor and just have it magically work
I don't think this is a reasonable expectation. Users may carry this expectation, but it's not aligned with the reality that there is no solid, standard port for websocket connections for IRC. There is the HTTP(s) standard and the 8000+(67/97) standard. Which standard do they expect would work? Whichever we decide on, we're technically wrong, or may mislead a user. I think it's better to be agnostic of port and make it required. This also helps guide the user towards the path
parameter as it's adjacent in the ConnectionOptions
, which may be relevant for the "I want to switch from TCP to websockets" usecase.
Keep in mind, as the InspIRCd maintainer mentioned in the issue I linked,
Websocket support is intended for web applications (which aways send an origin header) controlled by the IRC network admins.
IRC network admins should know exactly which port they need.
... there is no solid, standard port for websocket connections for IRC. There is the HTTP(s) standard and the 8000+(67/97) standard. Which standard do they expect would work? Whichever we decide on, we're technically wrong, or may mislead a user.
I'm gonna ask in the ircv3 channel their thoughts on adding a "suggested" port number to the spec. The core IRC spec mentions the standard TCP port number, so why shouldn't the WebSocket extension?
This may be true. I was somewhat astonished we couldn't set the Origin ourselves without re-implementing websockets from the ground up. I will draft up something to put on their tracker. We'll see what upstream thinks.
I could handle that, if you'd like.
Sorry for the comment spam, but I have more thoughts to add and I'm not sure what the better etiquette is here, add a new comment or edit an existing one.
But anyway, I see why Deno maintainers may have done this - Origin (along with many other headers) is marked as forbidden, and the browser WebSocket API doesn't allow setting it. So I see why they don't let us set it, but at the same time, it breaks web applications that expect an Origin header in the WS upgrade request. They're kind of stuck between a rock and a hard place.
Similarly, clients may wish to support legacy servers that are unaware of subprotocols. The standard JavaScript API for WebSockets will fail the WebSocket connection if none of the clientβs desired subprotocols could be negotiated. In this case, the client could attempt another connection without subprotocol negotiation.
Also, do we want to bother with this portion of the spec? This is from the non-normative section.
I'm gonna ask in the ircv3 channel their thoughts on adding a "suggested" port number to the spec. The core IRC spec mentions the standard TCP port number, so why shouldn't the WebSocket extension?
<shan> Hi all, had some questions on the WebSocket spec
<shan> (i understand that it's incomplete, but was wondering if there are any plans to edit it with these things)
<shan> chiefly, i just have one question - why no standard-defined port number?
<Valerie> websocket standard port number is 8000 I think
at least, that's what it seems to default to in javascript
<emersion> nah, it's same as HTTP(S)
<Valerie> oh, strange :D
<emersion> WebSocket works by hijacking a regular HTTP request
with an Upgrade header field
β¦ hence the name
<Valerie> :D
anyway, /should/ the spec give a defined port?
or is it entirely up to the admin who implements it?
<spb> half the point of websockets is that they go over http connections
<emersion> i don't think it makes sense for a spec using WebSocket to re-assert what the default HTTP port is
<shan> Valerie + emersion: yep, it's 80/443 depending on ws or wss. But there's the tradition of 80 (the http port) + whatever the tcp port is for that thing, so some IRC server defaults use 8067 or 8097 for irc-over-ws (Ergo)
I'm just saying, since the core IRC spec specifies a standard port number, it makes sense that the WS extension for IRC also specify a port number
<Noisytoot> 80 + 6667 is 6747, not 8067
<shan> (JavaScript dev here, implicit string concatenation :P)
<spb> thank you for that captain obvious
<shan> in my mind it makes more sense to use 80/443 since 8067/8097 seem kinda arbitrary
so if the spec could clarify that it'd be cool
<emersion> the core IRC spec is over TCP, so makes sense to define a port there
but not for WebSockets
<Sadie> It doesnβt really matter what port is used for websockets because networks are going to have to manually configure their clients anyway
<shan> gotcha, fair enough
<Sadie> you probably want to have stuff on standard http ports so stuff doesn't get blocked by corporate firewalls though
<shan> But that's not a _spec_ problem, that's a _humans_ problem
<emersion> the non-80/443 ports are mainly used for dev and for setups with reverse proxies
just like any other HTTP server really
<shan> k, context here is that i'm contributing to an irc library and we're having trouble deciding if we should ship a "default" irc port
when connecting over WS
<emersion> afaik, you don't need to take such decision
<Sadie> probably not
<emersion> just use "wss://example.org/" and the browser will decide for you
K, so as far as the spec goes, it's pretty much a WONTFIX as far as I can tell, which is fair on their part. However, a lot of people do seem to agree that 80/443 are the "expected" behaviour.
I say we do what emersion says and let the WS implementation in deno handle the port number based on what protocol we pass, and only include the port in the connection URL if it's supplied. I guess that presents an issue for the remoteAddr side of things though.
it sounds like a bug for even farther upstream
This may be true. I was somewhat astonished we couldn't set the Origin ourselves without re-implementing websockets from the ground up. I will draft up something to put on their tracker. We'll see what upstream thinks.
Apparently there's a headers
option in the unstable WebSocketStream API which allows configuring Origin.
Relevant docs: WebSocketStreamOptions and WebSocketStream.
It seems pretty barebones though.
I think they're approaching this from a different angle than we are.
Typically one provides a URL for a websocket connection. The URL would imply or provide the port. However, in our case, we are not passing/passed the URL. We pass/are passed the components of the URL in the public API surface, which means it must be assembled. Therefore none of the guarantees built into the websocket URL format exist here. There is no "expected behavior" other than what's implied by the API surface and documented IMHO. And we can comment the API surface to make it obvious in developers' IDEs. It's not going to be obvious to everyone that the hostname and path are assembled into a websocket URL internally -- which is where the default port would come from -- nor should they have to care IMO.
While I now think port 80/443 are acceptable defaults, I don't think it should be optional. I think I'd want a developer who is trying to make a connection to IRC in their first project with this library to get an error when they make a call like connect("somehostname")
so they discover the other parameters of the function naturally. Hiding this from them just hides discovery of relevant options. I was that developer a few months ago in discord-irc switching to this library from another that had a slightly different API surface. It would've made it easy to discover where exactly I should put the port I'm given from user configuration.
The alternate situation is a developer who read the documentation, where we can have a section explaining which ports are typical π I like the examples Jerome wrote for that.
Also, do we want to bother with this portion of the spec? This is from the non-normative section.
I don't think I do, unless we can find such a legacy server to test against. The three existing implementations are what will be used by a majority of servers I'm sure. Other servers aren't developed yet or are still in development AFAIK, so when they do get developed they'll be aware of subprotocols. If we did implement this, we'd have to detect if it's binary or text protocol and I'm not sure the best way to do that.
If you want to add a commit implementing this go for it!
Apparently there's a headers option in the unstable WebSocketStream API which allows configuring Origin.
π have all our problems been solved? It's unstable though, I don't think we should include it just yet. Perhaps a future enhancement.
Hello, I bring an update discussed in issue #12 It connects to IRC servers through the IRCv3 WebSocket specification.
This PR is a WIP and is not in a ready state. It still needs unit tests and documentation.
Tested working:
This PR has been tested with a local Ergo IRCv3 server and discord-irc running locally and all features work as expected.