matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0
11.79k stars 2.12k forks source link

Error attempting to resolve state at missing prev_events: 403: Forbidden #15204

Open Cyberes opened 1 year ago

Cyberes commented 1 year ago

Description

I use a script to test federation between matrix.org and my homeserver. It uses two bots. The test goes like this:

Bot 1 is on my HS. Bot 2 is on matrix.org.

  1. Bot 1 creates a room
  2. Bot 1 invites bot 2
  3. Bot 2 joins the room.
  4. Bot 1 sends a message to the room
  5. Using data from both bots the time it took to receive the message is calculated
  6. Both bots leave the room and forget it

This process is then repeated for bot 2.

I logged into the bots today to grab new access tokens and saw that there were a bunch of rooms that they had not left so I manually removed them. I was able to leave all the rooms of the matrix.org bot but was not able to do so for the bot on my HS. I tried nuking the rooms using /_synapse/admin/v2/rooms/ ( with force + purge) but that gave me this error:

2023-03-03 16:56:33,209 - synapse.storage.util.partial_state_events_tracker - 82 - INFO - shutdown_and_purge_room-0 - Awaiting un-partial-stating of events ['$goL0dWWdYWwvAeMR2yIepZPXCxlF9eGkF2Lx6P04UXk']
Stack (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/app/homeserver.py", line 389, in <module>
    main()
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/app/homeserver.py", line 385, in main
    run(hs)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/app/homeserver.py", line 364, in run
    _base.start_reactor(
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/app/_base.py", line 191, in start_reactor
    run()
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/app/_base.py", line 173, in run
    run_command()
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/app/_base.py", line 148, in <lambda>
    run_command: Callable[[], None] = lambda: reactor.run(),
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/base.py", line 1318, in run
    self.mainLoop()
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/base.py", line 1328, in mainLoop
    reactorBaseSelf.runUntilCurrent()
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/base.py", line 967, in runUntilCurrent
    f(*a, **kw)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 696, in callback
    self._startRunCallbacks(result)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 798, in _startRunCallbacks
    self._runCallbacks()
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 892, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 1792, in gotResult
    _inlineCallbacks(r, gen, status, context)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 1782, in _inlineCallbacks
    status.deferred.callback(callbackValue)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 696, in callback
    self._startRunCallbacks(result)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 798, in _startRunCallbacks
    self._runCallbacks()
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 892, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 696, in callback
    self._startRunCallbacks(result)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 798, in _startRunCallbacks
    self._runCallbacks()
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 892, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 1792, in gotResult
    _inlineCallbacks(r, gen, status, context)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 1782, in _inlineCallbacks
    status.deferred.callback(callbackValue)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 696, in callback
    self._startRunCallbacks(result)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 798, in _startRunCallbacks
    self._runCallbacks()
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 892, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 1371, in _cbDeferred
    self.callback(cast(_DeferredListResultListT, self.resultList))
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 696, in callback
    self._startRunCallbacks(result)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 798, in _startRunCallbacks
    self._runCallbacks()
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 892, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 696, in callback
    self._startRunCallbacks(result)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 798, in _startRunCallbacks
    self._runCallbacks()
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 892, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 1792, in gotResult
    _inlineCallbacks(r, gen, status, context)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 1697, in _inlineCallbacks
    result = context.run(gen.send, result)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/metrics/background_process_metrics.py", line 242, in run
    return await func(*args, **kwargs)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/handlers/pagination.py", line 659, in _shutdown_and_purge_room
    ].shutdown_room = await self._room_shutdown_handler.shutdown_room(
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/handlers/room.py", line 1869, in shutdown_room
    _, stream_id = await self.room_member_handler.update_membership(
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/handlers/room_member.py", line 597, in update_membership
    result = await self.update_membership_locked(
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/handlers/room_member.py", line 826, in update_membership_locked
    state_before_join = await self.state_handler.compute_state_after_events(
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/state/__init__.py", line 219, in compute_state_after_events
    ret = await self.resolve_state_groups_for_events(
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/util/metrics.py", line 113, in measured_func
    r = await func(self, *args, **kwargs)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/state/__init__.py", line 520, in resolve_state_groups_for_events
    state_groups = await self._state_storage_controller.get_state_group_for_events(
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/logging/opentracing.py", line 901, in _wrapper
    return await func(*args, **kwargs)  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/logging/opentracing.py", line 901, in _wrapper
    return await func(*args, **kwargs)  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/storage/controllers/state.py", line 374, in get_state_group_for_events
    await self._partial_state_events_tracker.await_full_state(event_ids)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/logging/opentracing.py", line 901, in _wrapper
    return await func(*args, **kwargs)  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/storage/util/partial_state_events_tracker.py", line 82, in await_full_state
    logger.info(

I restarted the server and checked the logs and saw that it was failing to re-sync the broken rooms with errors such as:

2023-03-03 18:05:22,514 - synapse.handlers.federation_event - 1073 - WARNING - sync_partial_state_room-9-$utvlr8R3FKFrtoRlZxyTAtAT6ybEQGW0o6kar9N_aNY - Error attempting to resolve state at missing prev_events: 403: Forbidden
2023-03-03 18:05:22,515 - synapse.handlers.federation - 1924 - ERROR - sync_partial_state_room-9 - Failed to get state for !HsKRUpcYhnAgwvbSgH:matrix.org at <FrozenEventV3 event_id=$utvlr8R3FKFrtoRlZxyTAtAT6ybEQGW0o6kar9N_aNY, type=m.room.member, state_key=@evulid-cc-federation-tester-on-evulid-cc:evulid.cc, outlier=False> from matrix.org because ERROR 403: We can't get valid state history., giving up!
2023-03-03 18:05:22,515 - synapse.metrics.background_process_metrics - 244 - ERROR - sync_partial_state_room-4 - Background process 'sync_partial_state_room' threw an exception
Traceback (most recent call last):
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/handlers/federation_event.py", line 1056, in _compute_event_context_with_maybe_missing_prevs
    await self._get_state_ids_after_missing_prev_event(
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/logging/opentracing.py", line 901, in _wrapper
    return await func(*args, **kwargs)  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/logging/opentracing.py", line 901, in _wrapper
    return await func(*args, **kwargs)  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/handlers/federation_event.py", line 1119, in _get_state_ids_after_missing_prev_event
    ) = await self._federation_client.get_room_state_ids(
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/logging/opentracing.py", line 901, in _wrapper
    return await func(*args, **kwargs)  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/logging/opentracing.py", line 901, in _wrapper
    return await func(*args, **kwargs)  # type: ignore[misc]
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/federation/federation_client.py", line 505, in get_room_state_ids
    result = await self.transport_layer.get_room_state_ids(
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/federation/transport/client.py", line 78, in get_room_state_ids
    return await self.client.get_json(
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/http/matrixfederationclient.py", line 1077, in get_json
    response = await self._send_request_with_optional_trailing_slash(
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/http/matrixfederationclient.py", line 406, in _send_request_with_optional_trailing_slash
    response = await self._send_request(request, **send_request_args)
  File "/opt/matrix-synapse/lib/python3.10/site-packages/synapse/http/matrixfederationclient.py", line 668, in _send_request
    raise exc
synapse.api.errors.HttpResponseException: 403: Forbidden

Doing /_synapse/admin/v1/purge_history/ on the room yields this:

{"status":"failed","error":"update or delete on table \"events\" violates foreign key constraint \"partial_state_events_event_id_fkey\" on table \"partial_state_events\"\nDETAIL:  Key (event_id)=($U259e4xDM9mlnGzxtK3kGpToqUX8QypNQQ5Tj8lcIwI) is still referenced from table \"partial_state_events\".\n"}

I've thought about nuking everything (https://github.com/matrix-org/synapse/issues/14300#issuecomment-1294969068) but thought I'd file a bug report first.

Steps to reproduce

???

Homeserver

evulid.cc + matrix.org

Synapse Version

1.77.0

Installation Method

pip (from PyPI)

Database

PostgreSQL, no

Workers

Multiple workers

Platform

Ubuntu 22.04 LXC

Configuration

Shorter msg retention

Relevant log output

See above.

Anything else that would be useful to know?

No response

Cyberes commented 1 year ago

I just updated to v1.78, restarted Synapse and all the workers, and now all the broken rooms are gone from my bot's account.

image

So issue solved, I guess?

Cyberes commented 1 year ago

Re-opened. I would appreciate it if a dev reviewed this to make sure it isn't a deeper issue with Synapse.

DMRobertson commented 1 year ago

Looks like there are two problems:

Cyberes commented 1 year ago

Cool. I checked the bots again today and there were a bunch of rooms left over again. I didn't have any issues this time removing them manually. I also added a function to my script to leave every room (instead of just the ones created for that test) and log any errors from leaving/forgetting rooms. I'll update you if I get any logs.