ircv3 / ircv3-specifications

IRCv3 specifications | Roadmap: https://git.io/IRCv3-Roadmap | Code of conduct: http://ircv3.net/conduct.html
http://ircv3.net/
778 stars 79 forks source link

ratify CHATHISTORY #437

Open slingamn opened 3 years ago

slingamn commented 3 years ago

Current implementations:

Servers:

Clients:

Bouncers (as server):

Networks:

  1. https://testnet.oragono.io

Where to: I'd love to see this in a desktop and a mobile client. It would also be useful to have this supported by a server implementation that currently buffers history for autoreplay on channel join: requesting the chathistory cap would suppress the auto-playback, and then CHATHISTORY queries could be answered from the buffer.

emersion commented 3 years ago

It would also be useful to have this supported by a server implementation that currently buffers history for autoreplay on channel join: requesting the chathistory cap would suppress the auto-playback, and then CHATHISTORY queries could be answered from the buffer.

FWIW, soju does exactly this. It's a bouncer though.

slingamn commented 3 years ago

@kylef notes an incompatibility with the chathistory batch spec, which seemingly forbids the use of the * target:

This parameter contains the target of messages within the batch. The target MUST either be the nick of the remote client for private messages or the name of a channel which the local client is in for public messages.

So if possible, we should amend that spec?

jwheare commented 3 years ago

A couple of questions that remain unanswered from the initial draft PR discussion (https://github.com/ircv3/ircv3-specifications/pull/393#issuecomment-721265459)

slingamn commented 3 years ago

I had some skepticism about this rationale:

There's no risk of TAGMSG causing a state related desync, they are by design lossy, so can't communicate important consistent state.

How is it explicit in the design of TAGMSG that they are lossy? For example, naive implementations of +draft/react would cause a local state change.

As for the name --- what would be the rationale for changing it? To clarify that it's a sub-capability of chathistory?

jwheare commented 3 years ago

They’re lossy because only some clients see them. And they carry no more state than a privmsg. It’s not like a quit or join that affects the channel member list.

jwheare commented 3 years ago

The rationale for changing the events is that the original one doesn’t match the base name. It’s not chat-playback. draft stage means we can change stuff to be better before it becomes permanent.

slingamn commented 3 years ago

I don't have very strong feelings about this either way. I might wait for other people to opine?

emersion commented 3 years ago

IMHO it makes sense to add the prefix, it makes it clearer that the cap belongs to chathistory. Maybe we'll have another spec in the future where "event playback" could mean something else.

slingamn commented 3 years ago

re. amending the chathistory batch type, I forgot our actual "solution" to this: for the * target, we send the user's own nickname as the batch parameter. I can still see a case for amending the other spec to allow * there.

slingamn commented 3 years ago

From discussion, it should be OK for the draft name to remain draft/event-playback, and then for the ratified name to be chathistory-events.

slingamn commented 3 years ago

This is additionally blocked on https://github.com/ircv3/ircv3-specifications/pull/450. I plan to take that PR out of draft in a few days, once Oragono 2.6.0 is published.

syzop commented 3 years ago

I have implemented CHATHISTORY in UnrealIRCd according to the current draft specification, it will be in UnrealIRCd 5.2.0. As for the CAPs: it implements draft/chathistory but not draft/event-playback. I could live with pretty much every aspect of the specification so nice work on that! :+1: I think I hit 1 or 2 small issues, maybe I should dig up my notes on that but.. they may not even be worth mentioning.

irc2.unrealircd.org is online with the current implementation (running UnrealIRCd from git) and can be used by clients for testing. Things are undertested atm as I still need to add CHATHISTORY to our test framework. There's 1 small "known issue" with CHATHISTORY TARGETS not returning a sorted list yet, but that will be fixed before release.

k4bek4be commented 3 years ago

I've ran into an issue with unrealircd and gamja cooperation. The server has a setting to enable history on certain channels, not storing any for the rest of them. Gamja requests CHATHISTORY for any joined channel (i have not read the raw log), and if it happens to be the one with history not enabled, unrealircd sends FAIL INVALID_TARGET. In response, the client displays error messages (in two places) and gives up on any further requests (for different channels). I think this case should be defined in documentation, and my suggestion is to make the IRCd act as there's no history stored, without the hard FAIL error. INVALID_TARGET is defined by the current spec as either no permissions for the target, or a non-existing target. None of these is true here.

progval commented 3 years ago

CHATHISTORY TARGETS should solve this issue when Gamja will implement it, right?

emersion commented 3 years ago

gamja already uses CHATHISTORY TARGETS instead of requesting history on JOIN: https://git.sr.ht/~emersion/gamja/commit/91208a6d47d4ca5d60cad21a41018763585a2209

emersion commented 3 years ago

Ah, but if you JOIN a channel then switch to its tab, gamja will try to request history for it to populate the scrollback. CHATHISTORY TARGETS won't help here.

syzop commented 3 years ago

CHATHISTORY TARGETS won't offer (complete) help here. Someone can always join a channel later. It's fine if someone sends CHATHISTORY TARGETS after a (re)connect, before or after the mass re-join. But I hope nobody is suggestion to send CHATHISTORY TARGETS on every and each later JOIN that happens in a session. The TARGETS subcommand requires a lot of IRCd resources, possibly the most resources of any IRC command out there, and on the IRCd side I am likely to throttle it a bit extra because of that.

Personally, if I was a client coder, I would just try to fetch history of every channel that I join, and deal gracefully with any error conditions.

I see CHATHISTORY TARGETS mainly being useful for missed PM's and such, but implemented it anyway in UnrealIRCd.

emersion commented 3 years ago

deal gracefully with any error conditions

Right. But I wonder what "deal gracefully" means with the existing spec. INVALID_TARGET can be a "real" error as well as a "history disabled" error. I'd rather have an explicit error code for "history disabled", so that I can properly inform the user what happened, without making it look like an error.

progval commented 3 years ago

INVALID_TARGET can be a "real" error

This should be covered by MESSAGE_ERROR

slingamn commented 3 years ago

That's my understanding as well; MESSAGE_ERROR is a server error (like an HTTP 50x) and INVALID_TARGET is the server reporting that history is not available (covering the same cases, as, e.g., 404 Not Found, 401 Unauthorized).

INVALID_TARGET should never be straightforwardly retryable (you'd expect to get the same error again) but it's possible that it's recoverable in some other way? If so (and if there are multiple distinct ways of recovering) there might be a case for separate error codes, but right now I don't see the use case.

k4bek4be commented 3 years ago

About retrying, i see one case: the user joined a channel when history was not enabled, left, then someone enabled the history. The user should receive it after re-joining. There's also no reason to stop asking for other targets when one fails (the server should maybe use CAP DEL if the history has somehow failed completely). I don't know why gamja does it, can't find any reason browsing the source, so this may as well be a bug.

emersion commented 2 years ago

Something else that came up again is: how should clients figure out that they've reached the end of the chat history (ie, there are no more messages to fetch)?

Some servers might want to filter out messages if the user doesn't have the permission to see them.

gamja currently assumes that if the server returns less messages than the limit, there are no more messages. It would be possible to change this logic to something like zero messages returned means no more history, but in pathological cases (the full page is filtered out by the server) this still breaks. msgid and timestamp based heuristics are error-prone.

References:

jwheare commented 2 years ago

There was a small discussion on this in the draft PR:

Starts here https://github.com/ircv3/ircv3-specifications/pull/393#issuecomment-721265459

Is there scope for a way to indicate that you've reached the end of history either forward or backwards, or are we content for clients to just keep requesting more until they stop seeing new messages?

slingamn commented 2 years ago

There are two slightly different issues here:

(a) If the client receives fewer messages than the requested limit, can the client assume that there are no more messages available? (b) Assuming the answer to (a) is "yes", the case where the paging window is full is still ambiguous: the paging window could be full even though there are no more messages available. Should we amend the spec to add an explicit indication of whether there are more messages available?

I'm leaning towards yes on (a) and no on (b). If other people feel similarly, I'll amend the spec to make (a) explicit.

slingamn commented 2 years ago

Sorry, I think I neglected two open issues on this thread. The first is the resource cost of CHATHISTORY TARGETS:

The TARGETS subcommand requires a lot of IRCd resources, possibly the most resources of any IRC command out there, and on the IRCd side I am likely to throttle it a bit extra because of that.

It seems to me that CHATHISTORY TARGETS should have a cost comparable to LIST. What makes it so expensive?

The other is the incompatibility between gamja and unrealircd. Is this still an issue? What can be done to resolve it?

slingamn commented 2 years ago

To clarify, the intent of TARGETS was that it would typically list only the channels that the user is currently joined to, which should make it significantly cheaper than LIST.

syzop commented 2 years ago

Sorry, I think I neglected two open issues on this thread. The first is the resource cost of CHATHISTORY TARGETS:

The TARGETS subcommand requires a lot of IRCd resources, possibly the most resources of any IRC command out there, and on the IRCd side I am likely to throttle it a bit extra because of that.

It seems to me that CHATHISTORY TARGETS should have a cost comparable to LIST. What makes it so expensive?

The thing is that, for each channel that the user is in, it needs to do quite a bit of processing (per channel!) due to the two arguments that TARGETS supports (start and end timestamp) and the need to return something based on that. Due to the timestamps I basically have to travel line by line through the chat history of a channel (can stop on the first match though). And the thing is with TARGETS we need to do that processing * xx channels. In UnrealIRCd we only have to deal with channels and not with PM's, so that maximizes it at max per channels which is usually 10. I just benchmarked it with a few 500+ lines history channels and calculated what it would be if we had 10 such channels. It takes about 2,5msec then, so we can handle 400 of those commands per second per server. That's something we can handle, although I will probably impose a bit more penalty/ratelimit than we have now. But yeah, if you have a server implementation that would (also) store dozens of PM's, and thus needing to traverse like 100 of targets, and/or if they would not be in fast memory like in UnrealIRCd but in say.. SQL or something.. then yeah... I think you can see the problem :D

syzop commented 2 years ago

The other is the incompatibility between gamja and unrealircd. Is this still an issue? What can be done to resolve it?

I kinda lost track of this one. What do you want me to send when history is not enabled for a channel (or any target)? Let me know and if this is different than what we send now I will be happy to update UnrealIRCd.

emersion commented 2 years ago

The other is the incompatibility between gamja and unrealircd. Is this still an issue? What can be done to resolve it?

gamja used to completely stop fetching history when a single CHATHISTORY query failed. This was a gamja bug but it's now fixed.

gamja will still print an error message to the user if the server sends FAIL INVALID_TARGET.

slingamn commented 2 years ago

The thing is that, for each channel that the user is in, it needs to do quite a bit of processing (per channel!) due to the two arguments that TARGETS supports (start and end timestamp) and the need to return something based on that. Due to the timestamps I basically have to travel line by line through the chat history of a channel (can stop on the first match though). And the thing is with TARGETS we need to do that processing * xx channels.

The intent of the specification was that the selectors would only match the time of the latest message sent in the channel:

ordered by the time of the latest message in the channel history or direct message conversation

This probably needs to be clarified explicitly in the spec, but this is the intended meaning and the behavior implemented in Ergo --- the selection parameters in TARGETS are merely pagination parameters with respect to the ordering of the targets by the time of the latest message sent.

emersion commented 2 years ago

Yeah, you should be able to just store one timestamp per channel for a naive impl.

awfulcooking commented 2 years ago

From chats on OFTC, there's some agreement that history would be helpful, in particular for support channels, but it really needs to be opt-in per channel.

At the moment the spec doesn't speak to that, but I think it would help to encourage networks to adopt the feature if e.g. a channel mode could be specced.

slingamn commented 2 years ago

I don't think we'd be able to agree on a mode letter, but also, this seems like the kind of question (opt-in versus opt-out) that is more naturally left implementation-defined or operator-configurable?

I think this could be a good fit for an addition to the "implementation considerations" or "security considerations" section, something along the lines of: "Implementations should give server and channel operators flexible and transparent mechanisms to enable or disable history, according to the needs of individual communities on their networks."

awfulcooking commented 2 years ago

It could be left to implementations to define, but fragmentation would be inevitable.

A standard mode would let clients represent in the UI when a channel has history enabled, and give a clear compatibility target for devs.

Currently Unreal and Insp implement channel history, controlled with mode +H n:t, where n is the number of messages to keep, and t is the time period to keep them for.

Resetting the mode will clear the history buffer.

Could there be consensus on standardising +H? I think it would be a great addition.

syzop commented 2 years ago

I think the whole +H stuff and notes on configuration would be a bit out of scope. Some reasons have been mentioned, but also because: 1) There will be servers that always have chat history enabled that don't set any channel mode (no opt-in nor opt-out), i don't think it is for us to judge on that. 2) Chat history could also be very well implemented in a bouncer, providing it to clients, even when servers don't support it (and hence won't support +H either). 3) Finally, chat history can also apply to non-channels (PM's), which is actually much more privacy sensitive, and in such a case no MODE applies.

I think this spec is in the final stages, just needs some polishing / clearing up and then it is done. Or am I too optimistic? :D

Also, perhaps the opening post can be updated a little to include UnrealIRCd under servers, with a checkmark, as it is implemented since 5.2.0 which was released on 14 Jun 2021 and is loaded by default. Makes it clear that the draft implementation is available a lot more than how it looks now if a random new person looks at this PR for the first time. (And of course, if you know of more implementations add them too)

syzop commented 2 years ago

The other is the incompatibility between gamja and unrealircd. Is this still an issue? What can be done to resolve it?

gamja used to completely stop fetching history when a single CHATHISTORY query failed. This was a gamja bug but it's now fixed.

gamja will still print an error message to the user if the server sends FAIL INVALID_TARGET.

So... should gamja be changed in some way, or must we change something on the UnrealIRCd-side?

Just want to know what needs to be done and by who :)

I have no particular opinion on the matter.. we send FAIL INVALID_TARGET now which may be totally fine, or we can send an empty batch instead, or we can send a new error for this particular case. Whatever needs to be done.

emersion commented 2 years ago

I'd slightly prefer it if the server sent an empty batch, but I don't feel strongly about it. @slingamn said he feels the same on IRC.

syzop commented 2 years ago

Ok, done, sending an empty batch now. Should be in next UnrealIRCd release (5.2.3 / 6.0.1).

progval commented 1 year ago

@slingamn I think you can add Goguma to the list of clients

delthas commented 1 year ago

senpai also supports CHATHISTORY FWIW

emersion commented 1 year ago

One thing I'd like to get more feedback on before ratification is CHATHISTORY AROUND. I'm not aware of any client implementing it yet.

progval commented 1 year ago

I feel it only makes sense when linking specific messages, but no one implements IRC URLs either :/

slingamn commented 1 year ago

It also makes sense in the context of text search. #496 says:

It is not a goal of this specification to offer context around matching messages, as this is covered by the chathistory specification.

emersion commented 1 year ago

I feel it only makes sense when linking specific messages, but no one implements IRC URLs either :/

Ref https://github.com/ircv3/ircv3-ideas/issues/8#issuecomment-842986798