Gnuxie / matrix-protection-suite

library for interacting with matrix policy lists for moderation.
2 stars 1 forks source link

Room state concepts need more consideration. #14

Open Gnuxie opened 8 months ago

Gnuxie commented 8 months ago

Something that we've missed for years is that if you manage to find a state event in the timeline, and it's for a brand new type+key combination, then you can always treat it like a state delta. It's not possible for the event to be stale state from your and your server's perspective.

To be able to represent this, we need new state change concepts, changing Added to Introduced and Reintroduced. Where Introduced means that we haven't seen an event of that type+key combination before.

Events that have been Introduced can then be directly inserted by revision issuers into the next revision.

Batching of revisions is something that is inherited from Mjolnir to make sure that server ACL application wouldn't flood a room with m.room.server_acl events if using a script to bulk add policies to a policy room. I'm now of the opinion that batching should be controlled protection side. However, revision issuers should at least perform some batching so that introduced events received within the same sync response end up within the same revision.

Revision issuers should take a ULID before beginning the process of fetching state and revision issuers should then give this ULID to the next revision. This ensures that consumers of revisions that make assumptions and pre-empt the state of the room can invalidate their predictions by comparing the ULID of the revision to the ULID associated with their predictions and whether they came true or not. (For onlookers, we only use ULIDs within the same process) Preemption can only be done in cases where the state is a certainty, e.g. introduced state. We can't use ULID's like this because introduced state can revise a revision while a request is being sent across the network.
Gnuxie commented 7 months ago

. o O ( Make sure that the absence of an event in a /state response does not remove the event from the revision, if this happens, it means that the event was introduced after the response had time to be processed, this is pretty critical and DOES need testing. I feel like the code might do something dumb here )

Gnuxie commented 7 months ago

We need to make sure that introduced state is processed and revises revisions before calling client handles for timelineEvents. This is to avoid a situation where a protection gets a timeline event from a room member that has no membership at all (it's less important if their membership is for instance a stale part).

To make this work we need to ensure that the room state revision issuer is informed about all events from a single transaction or sync response before any of those events are passed onto protections.

This requires rebundling events from the onEvent and room.event handles from matrix-appservice-bridge and the matrix-bot-sdk respectively, giving the entire bundle to the room state revision issuers, and then continuing. It's not obvious to me yet whether we need to bundle on a per room basis or not? for room state revision issuers we would need to extract all events for one room anyways, so we may aswell do that work and stop it being duplicated if we need to have the same pattern elsewhere in future.

The API should therefore be designed around per room event activity and the bundler for per event client backends should just be seen as optional, unrelated glue that can be stuck on to make it work.

Gnuxie commented 7 months ago

IDK Why APIs are insistent on splitting it down per event anyways, i guess they kept having to do that in their code which fair enough yeah is annoying but idk if it should happen this early on meow

Gnuxie commented 7 months ago
export type OnRoomEventActivity = (
  roomID: StringRoomID,
  events: RoomEvent[]
) => void;

export class ConstantPeriodBatch {
  private finished = false;
  constructor(cb: () => void, delayMS = 0) {
    setTimeout(() => {
      this.finished = true;
      cb();
    }, delayMS);
  }

  public isFinished() {
    return this.finished;
  }
}

/**
 * This is a bodge to turn clients that give us a per event handler back into
 * something that bundles events together.
 * If you can avoid doing this by writing your own sync loop or `/transactions`
 * handler, then do so because this is going to suck a little.
 */
export class OnEventToOnRoomEventActivityConverter {
  private readonly eventsByRoom = new Map<StringRoomID, RoomEvent[]>();
  private batcher = new ConstantPeriodBatch(() => undefined);
  private readonly batchHandler = this.handleBatch.bind(this);
  public constructor(private readonly cb: OnRoomEventActivity) {
    // nothing to do.
  }

  private internRoomEvent(event: RoomEvent): void {
    const entry = this.eventsByRoom.get(event.room_id);
    if (entry === undefined) {
      this.eventsByRoom.set(event.room_id, [event]);
    } else {
      entry.push(event);
    }
  }

  public handleTimelineEvent(roomID: StringRoomID, event: RoomEvent): void {
    if (this.batcher.isFinished()) {
      this.batcher = new ConstantPeriodBatch(this.batchHandler);
    }
    this.internRoomEvent(event);
  }

  private handleBatch(): void {
    for (const [roomID, events] of this.eventsByRoom.entries()) {
      this.cb(roomID, events);
    }
    this.eventsByRoom.clear();
  }
}

IDK I have just written the most ridiculous code in the world.

Gnuxie commented 7 months ago

Why do we want them to be batched? Because we can't batch state events together before clients process them without doing this as early on in the process as possible, for all events. Why do we want state events to be batched? We think it's because we don't want to have to do batching per protection for things like policy updates, resyncing multiple times or applying server ACL a lot.

The batching does come a cost that when a protection receives an event, the local state of the room will be in the future, but in reality that's probably fine and can probably happen regardless because of funky latency and network bs.

Gnuxie commented 6 months ago

https://github.com/Gnuxie/matrix-protection-suite/issues/20

We're always gonna have an outdated version so i don't really think we should worry about keeping the write back as up to date as we ere previously worrying about

on top of that, we might need to pull policy lists in in the background anyhow once we have loaded the cache, just to make sure they are up to date.

Gnuxie commented 6 months ago

The entire point of this though is to stop something like the matrix.org instance taking 10minuets to start up and