matrix-org / dendrite

Dendrite is a second-generation Matrix homeserver written in Go!
https://matrix-org.github.io/dendrite/
Apache License 2.0

Dendrite server stuck/not responding #3366

Open hensg opened 5 months ago

hensg commented 5 months ago

Background information

Description

Apr 13 00:21:34 matrix-dendrite dendrite[50475]: time="2024-04-13T00:21:34.916263901Z" level=info msg="Processing registration request" auth.type=m.login.dummy req.id=D07R8ffP3r5r req.method=POST req.path=/_matrix/client/v3/register session_id="" username=<redacted>
Apr 13 00:21:35 matrix-dendrite dendrite[50475]: time="2024-04-13T00:21:35.012944307Z" level=info msg="PerformDeviceCreation" device_id="<nil>" display_name=<redacted> localpart=<redacted> req.id=D07R8ffP3r5r req.method=POST req.path=/_matrix/client/v3/register
Apr 13 00:25:18 matrix-dendrite dendrite[50475]: time="2024-04-13T00:25:18.193324285Z" level=info msg="Creating new room" req.id=SiLzMVzH2l9W req.method=POST req.path=/_matrix/client/v3/createRoom roomID=<redacted> roomVersion=10 userID=<redacted> user_id=<redacted>
Apr 13 00:28:48 matrix-dendrite dendrite[50475]: time="2024-04-13T00:28:48.534446323Z" level=info msg="Creating new room" req.id=8FY47hJw1rTr req.method=POST req.path=/_matrix/client/v3/createRoom roomID=<redacted> roomVersion=10 userID=<redacted> user_id=<redacted>
Apr 13 00:29:57 matrix-dendrite dendrite[50475]: time="2024-04-13T00:29:57.138855878Z" level=error msg="unable to update event filter with ignored users" device_id=DE-q7arv error="context canceled" limit=50 req.id=eViRvKWwr2Zw req.method=GET req.path=/_matrix/client/r0/sync since=s328_2_67_9_180_329_100_290_0 timeout=30s user_id=<redacted>
Apr 13 00:29:57 matrix-dendrite dendrite[50475]: time="2024-04-13T00:29:57.138958418Z" level=error msg="d.addRoomDeltaToResponse failed" device_id=DE-q7arv error="p.DB.RecentEvents: sql: transaction has already been committed or rolled back" limit=50 req.id=eViRvKWwr2Zw req.method=GET req.path=/_matrix/client/r0/sync since=s328_2_67_9_180_329_100_290_0 timeout=30s user_id=<redacted>
Apr 13 00:29:57 matrix-dendrite dendrite[50475]: time="2024-04-13T00:29:57.139007999Z" level=error msg="Failed to acquire database snapshot for sync request" error="context canceled"
Apr 13 00:29:57 matrix-dendrite dendrite[50475]: time="2024-04-13T00:29:57.139036961Z" level=error msg="Failed to acquire database snapshot for sync request" error="context canceled"
Apr 13 00:29:57 matrix-dendrite dendrite[50475]: time="2024-04-13T00:29:57.139059912Z" level=error msg="Failed to acquire database snapshot for sync request" error="context canceled"

I wonder whether this is related to the continue at the linked line, together with the error that is not handled a few lines above it, where the function returns newPos: https://github.com/matrix-org/dendrite/blob/main/syncapi/streams/stream_pdu.go#L232

Also, I observed several queries that had been running for more than two minutes in the database:

SELECT pid, now() - pg_stat_activity.query_start AS duration, query
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '2 minutes';
  pid  |    duration     |                                                                                     query
-------+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 10357 | 00:02:04.219947 | BEGIN READ WRITE
 11644 | 00:02:04.261905 | SELECT room_id, deleted, (id > $3 AND id <= $4) AS changed FROM syncapi_peeks WHERE user_id = $1 AND device_id = $2 AND ((id <= $3 AND NOT deleted) OR (id > $3 AND id <= $4))
 11410 | 00:02:04.262119 | SELECT room_id, deleted, (id > $3 AND id <= $4) AS changed FROM syncapi_peeks WHERE user_id = $1 AND device_id = $2 AND ((id <= $3 AND NOT deleted) OR (id > $3 AND id <= $4))
 11647 | 00:02:04.213341 | SELECT room_id, deleted, (id > $3 AND id <= $4) AS changed FROM syncapi_peeks WHERE user_id = $1 AND device_id = $2 AND ((id <= $3 AND NOT deleted) OR (id > $3 AND id <= $4))
 11643 | 00:02:04.213535 | SELECT room_id, deleted, (id > $3 AND id <= $4) AS changed FROM syncapi_peeks WHERE user_id = $1 AND device_id = $2 AND ((id <= $3 AND NOT deleted) OR (id > $3 AND id <= $4))
 11649 | 00:02:04.238318 | SELECT room_id, deleted, (id > $3 AND id <= $4) AS changed FROM syncapi_peeks WHERE user_id = $1 AND device_id = $2 AND ((id <= $3 AND NOT deleted) OR (id > $3 AND id <= $4))

Serlodh commented 3 days ago

@hensg did you find a solution to this by any chance?