matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

rejected event used as an auth event #9595

Open richvdh opened 3 years ago

richvdh commented 3 years ago

in my database, $h+BUjV0LuuSRmFh5ZFeSVZkP+oo6v7bhyJ+1wYjpLb4 (a regular membership event) uses $NncVOvrPzkKl0u1Q8FIJ7Pz1kZH1iF2lwWj6pZ6+Kkg (a membership event which was rejected due to missing auth events) as an auth event. This seems wrong: $h+BUjV0LuuSRmFh5ZFeSVZkP+oo6v7bhyJ+1wYjpLb4 should have been rejected too.

Logs from the arrival of that event:

2021-03-08 16:00:19,930 - synapse.handlers.federation - 189 - INFO - PUT-2238636-$h+BUjV0LuuSRmFh5ZFeSVZkP+oo6v7bhyJ+1wYjpLb4 - handling received PDU: <FrozenEventV2 event_id='$h+BUjV0LuuSRmFh5ZFeSVZkP+oo6v7bhyJ+1wYjpLb4', type='m.room.member', state_key='@yan:yetanothernerd.xyz'>
2021-03-08 16:00:20,090 - synapse.handlers.federation - 2406 - INFO - PUT-2238636-$h+BUjV0LuuSRmFh5ZFeSVZkP+oo6v7bhyJ+1wYjpLb4 - auth_events refers to events which are not in our calculated auth chain: {'$NncVOvrPzkKl0u1Q8FIJ7Pz1kZH1iF2lwWj6pZ6+Kkg'}
2021-03-08 16:00:20,090 - synapse.state - 452 - INFO - PUT-2238636-$h+BUjV0LuuSRmFh5ZFeSVZkP+oo6v7bhyJ+1wYjpLb4 - Resolving state for !BAXLHOFjvDKUeLafmO:matrix.org with 2 groups
2021-03-08 16:00:20,091 - synapse.handlers.federation - 2446 - INFO - PUT-2238636-$h+BUjV0LuuSRmFh5ZFeSVZkP+oo6v7bhyJ+1wYjpLb4 - After state res: updating auth_events with new state {}
2021-03-08 16:00:20,501 - synapse.state - 573 - INFO - PUT-2238636-$h+BUjV0LuuSRmFh5ZFeSVZkP+oo6v7bhyJ+1wYjpLb4 - Resolving state for !BAXLHOFjvDKUeLafmO:matrix.org with groups [2956930, 2831560, 2969615, 2956433, 2893843]
2021-03-08 16:00:21,115 - synapse.storage.databases.main.event_federation - 230 - INFO - PUT-2238636-$h+BUjV0LuuSRmFh5ZFeSVZkP+oo6v7bhyJ+1wYjpLb4 - Unexpectedly found that events don't have chain IDs in room !BAXLHOFjvDKUeLafmO:matrix.org: {'$NncVOvrPzkKl0u1Q8FIJ7Pz1kZH1iF2lwWj6pZ6+Kkg'}
2021-03-08 16:01:19,925 - synapse.http.site - 219 - INFO - PUT-2238636 - Connection from client lost before response was sent
2021-03-08 16:03:22,528 - synapse.state.v2 - 531 - WARNING - PUT-2238636-$h+BUjV0LuuSRmFh5ZFeSVZkP+oo6v7bhyJ+1wYjpLb4 - auth_event id $S38_aRG4Qz8eIdrTyzJuQdH3Ieus1Y1dhczoFKnOB7M for event $NncVOvrPzkKl0u1Q8FIJ7Pz1kZH1iF2lwWj6pZ6+Kkg is missing
2021-03-08 16:03:30,846 - synapse.http.server - 636 - WARNING - PUT-2238636 - Not sending response to request <XForwardedForRequest at 0x7fd1acb68860 method='PUT' uri='/_matrix/federation/v1/send/1614863178168' clientproto='HTTP/1.0' site='8008'>, already disconnected.
2021-03-08 16:03:30,846 - synapse.access.http.8008 - 316 - INFO - PUT-2238636 - 2a00:1098:84:1c8::157 - 8008 - {matrix.org} Processed request: 190.923sec/-130.921sec (0.982sec, 0.012sec) (0.080sec/187.732sec/31) 0B 200! "PUT /_matrix/federation/v1/send/1614863178168 HTTP/1.0" "Synapse/1.29.0rc1 (b=matrix-org-hotfixes,61a970e25)" [3 dbevts]
richvdh commented 3 years ago

this room is continuing to cause severe problems on my server. I'm going to leave it to try to make federation work again.

ShadowJonathan commented 3 years ago

(Doing /join on that room, it looks to be the "IRC Matrix Bridges" room, for anyone's reference, #irc:matrix.org)

richvdh commented 3 years ago

The main problem here is that _check_event_auth, and its helper _update_auth_events_and_context_for_auth, make incorrect assumptions about whether the auth events passed in via claimed_auth_event_map have themselves been authed. Unfortunately it's not quite as simple as just checking that they have been authed, because the whole reason we have claimed_auth_event_map (rather than just pulling the events out of the db) is that they haven't been persisted, either.

Really, we need to get rid of the whole premise of claimed_auth_event_map - we should not be persisting events before their auth_events are persisted. Unfortunately, if we just remove it, auth_events does something completely different wherein it tries to work out what the auth events should be based on the current state of the room.

The two methods are also very confused about whether the auth events they are working with are those according to the auth_events of the event being persisted, or what we think the auth events should be given the state of the room at that point (which obviously only makes sense for non-outliers, not that that stops us trying to do it anyway).
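
A minimal sketch of the invariant discussed above, not Synapse's actual code: an event should only be accepted if every event it lists in its auth_events is already known and was itself accepted. The `Event` class, the `known` map, and `check_auth_events` are hypothetical names used only for illustration, and the event IDs are shortened placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Event:
    """Hypothetical, simplified stand-in for an already-processed event."""

    event_id: str
    auth_event_ids: List[str] = field(default_factory=list)
    rejected_reason: Optional[str] = None  # None means the event was accepted


def check_auth_events(event: Event, known: Dict[str, Event]) -> Optional[str]:
    """Return a rejection reason, or None if every auth event is usable.

    `known` maps event IDs to events that have already been processed.
    """
    for auth_id in event.auth_event_ids:
        auth_event = known.get(auth_id)
        if auth_event is None:
            return f"auth event {auth_id} is missing"
        if auth_event.rejected_reason is not None:
            return f"auth event {auth_id} was rejected"
    return None


# Mirrors the situation in this issue: a membership event cites a
# previously rejected membership event as one of its auth events.
known = {
    "$create": Event("$create"),
    "$rejected_member": Event(
        "$rejected_member", ["$missing"], "auth event $missing is missing"
    ),
}
new_member = Event("$new_member", ["$create", "$rejected_member"])
reason = check_auth_events(new_member, known)
```

Under this rule `new_member` is rejected because one of its auth events was rejected, which is the behaviour the original report expected.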

richvdh commented 2 years ago

So I think we now have PRs that should stop this happening again in the future. The next question is whether we can do anything about existing brokenness in people's databases.

ShadowJonathan commented 2 years ago

The next question is whether we can do anything about existing brokenness in people's databases.

I actually had a heated discussion about this in #matrix-spec yesterday. The consensus (and the logical decision) is that, when re-validating, we throw away the entire history of a room from that point on.

Here is a link to the discussion, but the talk about this requirement and me wrestling with it and trying to find an alternative is a bit further up and down from that point.

Thank you for clarifying that the issue has been fixed, though.

callahad commented 2 years ago

@richvdh #11012 described itself as "the final piece of the jigsaw for #9595"

With that merged, can we close this issue?

richvdh commented 2 years ago

I guess so. The problem is that we're still going to see this on an ongoing basis until we clear out existing problematic data.
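
As a rough illustration of what "existing problematic data" could look like, the sketch below finds accepted events that cite a rejected event as an auth event. It runs against an in-memory SQLite database with two simplified tables. The table and column names (`event_auth`, `rejections`) are assumptions made for this example and should be checked against the real schema before running anything similar on a live database. The query is read-only and repairs nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified, assumed table shapes, for illustration only.
    CREATE TABLE event_auth (event_id TEXT, auth_id TEXT, room_id TEXT);
    CREATE TABLE rejections (event_id TEXT, reason TEXT);

    INSERT INTO event_auth VALUES ('$new_member', '$rejected_member', '!room:example.org');
    INSERT INTO event_auth VALUES ('$new_member', '$create', '!room:example.org');
    INSERT INTO rejections VALUES ('$rejected_member', 'missing auth events');
    """
)

# Events with no row in `rejections` (that is, accepted events) whose
# auth events include at least one rejected event.
suspect_rows = conn.execute(
    """
    SELECT ea.event_id, ea.auth_id
    FROM event_auth AS ea
    JOIN rejections AS bad ON bad.event_id = ea.auth_id
    LEFT JOIN rejections AS own ON own.event_id = ea.event_id
    WHERE own.event_id IS NULL
    ORDER BY ea.event_id, ea.auth_id
    """
).fetchall()
```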

worldofgeese commented 2 years ago

Is this related to a batch of rooms I can't join? I saved join links to them, and every link, when pressed, gives "Auth events could not be found".

Can someone help me clear the database entries in my postgres db so I can join these rooms?

I can join other rooms but once I've "poisoned" a room with a join link then I can't enter that room even if I manually find it through the room browser.

Here are the join links (beware, they will probably break the rooms for you too):

https://matrix.to/#/!EoRhMvNpnWxCMTMPeP:libera.chat?via=geese.party&via=libera.chat&via=matrix.org

https://matrix.to/#/!YLTeaulxSDauOOxBoR:matrix.org?via=geese.party&via=gitter.im&via=matrix.org

https://matrix.to/#/!hokCjFXtQcxTAIXSdZ:matrix.org?via=geese.party&via=matrix.org&via=privacytools.io

https://matrix.to/#/!GryYovOTNVgikENmcX:libera.chat?via=geese.party&via=libera.chat&via=matrix.org

https://matrix.to/#/!NicAJNwJawmHrEhqZs:matrix.org?via=geese.party&via=matrix.org&via=nordgedanken.dev

parisni commented 2 years ago

I guess so. The problem is that we're still going to see this on an ongoing basis until we clear out existing problematic data.

Do we have a way to fix the database? I have several channels with this issue. @richvdh

UPDATE: I upgraded the room. People can join the new room. However, the previous room is still not accessible from "Click here to show older message".

pidongqianqian commented 1 year ago

I guess so. The problem is that we're still going to see this on an ongoing basis until we clear out existing problematic data.

Do we have a way to fix the database? I have several channels with this issue. @richvdh

UPDATE: I upgraded the room. People can join the new room. However, the previous room is still not accessible from "Click here to show older message".

@parisni What do you mean by “upgrade room”? Could you tell me how to do that? I have the same issue and I just want to be able to join the new room.

parisni commented 1 year ago

Upgrade a room

This will make the current room read-only and create a new room (see the RFC).

Simply type this as a message (you will be asked to invite every participant):

/upgraderoom 7

Alternatively, run an API request:

curl -H 'Authorization: Bearer <token-access>' -H "Content-Type: application/json"  -X POST https://matrix.interhop.org/_matrix/client/r0/rooms/<room-id-url-encoded>/upgrade -d '{"new_version": "6"}'
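
For scripting, the same request can be built in Python with only the standard library. This is a minimal sketch that mirrors the curl command above. The helper name `build_upgrade_request` is hypothetical, and the homeserver URL, room ID, and token are placeholders. The request is constructed but not sent.

```python
import json
import urllib.parse
import urllib.request


def build_upgrade_request(
    base_url: str, room_id: str, access_token: str, new_version: str
) -> urllib.request.Request:
    """Build (but do not send) the room-upgrade POST request."""
    # Room IDs contain '!' and ':', so percent-encode every reserved character.
    encoded_room_id = urllib.parse.quote(room_id, safe="")
    url = f"{base_url}/_matrix/client/r0/rooms/{encoded_room_id}/upgrade"
    body = json.dumps({"new_version": new_version}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; substitute your own homeserver, room ID, and token.
request = build_upgrade_request(
    "https://matrix.example.org", "!abcdef:example.org", "<access-token>", "6"
)
# To actually send it: urllib.request.urlopen(request)
```

Percent-encoding the room ID with `safe=""` matters because the `!` and `:` characters would otherwise be left as-is in the URL path.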