matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0
11.82k stars 2.13k forks source link

Federation fails when using a DNS server on a link-local IPv6 address #5846

Open jans23 opened 5 years ago

jans23 commented 5 years ago

Our Matrix instance nitro.chat has issues connecting to matrix.org rooms. In the logs we see these messages:

2019-08-12 16:51:28,877 - synapse.crypto.keyring - 652 - INFO - POST-104458- Requesting keys dict_items([('matrix.org', {'ed25519:auto':1565621488877})]) from notary server matrix.org
2019-08-12 16:51:28,879 - synapse.crypto.keyring - 604 - WARNING - POST-104458- Key lookup failed from 'matrix.org': Failed to connect to remote server
2019-08-12 16:51:28,880 - synapse.http.matrixfederationclient - 404 - INFO - POST-104458- {GET-O-3981} [matrix.org] Sending request: GET matrix://matrix.org/_matrix/key/v2/server/ed25519%3Aauto; timeout 10.000000s
2019-08-12 16:51:28,880 - synapse.http.federation.matrix_federation_agent - 335 - INFO - POST-104458- Fetching https://matrix.org/.well-known/matrix/server
2019-08-12 16:51:28,984 - synapse.http.federation.matrix_federation_agent - 351 - INFO - POST-104458- Error fetching https://matrix.org/.well-known/matrix/server: Non-200 response 404
2019-08-12 16:51:38,894 - synapse.http.matrixfederationclient - 485 - WARNING - POST-104458- {GET-O-3981} [matrix.org] Request failed: GET matrix://matrix.org/_matrix/key/v2/server/ed25519%3Aauto: TimeoutError(10.0, 'Deferred')
2019-08-12 16:51:38,894 - synapse.crypto.keyring - 783 - WARNING - POST-104458- Error looking up keys {'ed25519:auto': 1565621488877} from matrix.org: Failed to connect to remote server
2019-08-12 16:51:38,895 - synapse.federation.transport.server - 286 - WARNING - POST-104458- authenticate_request failed: 401: No key for matrix.org with ids in ['ed25519:auto'] (min_validity 1565621488877)
2019-08-12 16:51:38,895 - synapse.http.server - 79 - INFO - POST-104458- <XForwardedForRequest at 0x7f711034b278 method='POST' uri='/_matrix/federation/v1/get_groups_publicised' clientproto='HTTP/1.0' site=8008> SynapseError: 401 - No key for matrix.org with ids in ['ed25519:auto'] (min_validity 1565621488877)
2019-08-12 16:51:38,896 - synapse.access.http.8008 - 302 - INFO - POST-104458- 2a04:3541:1000:500:6866:a4ff:fe4c:58c1 - 8008 - {None} Processed request: 10.018sec/0.000sec (0.013sec, 0.001sec) (0.000sec/0.001sec/1) 118B 401 "POST /_matrix/federation/v1/get_groups_publicised HTTP/1.0" "Synapse/1.2.1 (b=matrix-org-hotfixes,873ff9522)" [0 dbevts] root@matrix-nitrochat:/etc/matrix-synapse# tail /var/log/matrix-synapse/homeserver.log -n 200

curl matrix.org/_matrix/key/v2/server/ed25519%3Aauto says:

{"old_verify_keys":{},"server_name":"matrix.org","signatures":{"matrix.org":{"ed25519:auto":"IV3Ql81UyFkZdPD9JbdxcqMeKoflQCiR1OgUvumu87tNFa9XC/gNdtcBn2xIYut73KZ1IFZZDvGdlwgdTUxvBw"}},"tls_fingerprints":[{"sha256":"Pq3kW4n2StFnOFtA0PTiXIk0HZF6jwLyR8tMlaRko6I"},{"sha256":"mVNAmfjFleUpJseTNjU5EhJAlfl5rjb7JJPG2UeH9eY"}],"valid_until_ts":1565756981005,"verify_keys":{"ed25519:auto":{"key":"Noi6WqcDj0QmPxCNQqgezwTlBKrfqehY1u2FyWP9uYw"}}}

We run the latest Synapse version on Linux already. Азат at #synapse:matrix.org confirmed it's a Synapse issue. Please help.

richvdh commented 5 years ago

please could you fix the log formatting so that the lines aren't broken?

jans23 commented 5 years ago

@richvdh formatting is updated now.

richvdh commented 5 years ago

Firstly, it's not a 404, it's a timeout. It looks like it can't connect to matrix.org. It's possible you just caught it when matrix.org was having a bad time, but more likely there is something wrong with your networking, or possibly your server is very overloaded.

It would probably be worthwhile to turn the log level up to DEBUG so we can see where it's getting stuck.

benksio commented 5 years ago

hi @richvdh, i also deal with the above issue on the very same server. would you tell me which level line in log.yaml needs to be set do DEBUG? thx

richvdh commented 5 years ago

the log configuration file follows the standard format for python log configuration: https://docs.python.org/3/library/logging.config.html#dictionary-schema-details.

TL;DR: change one of the level lines.

benksio commented 5 years ago

@richvdh I get a very verbose log now. As there is sensitive data within it, i'd prefare to send it directly to you. Is that possible?

richvdh commented 5 years ago

you can reach me on matrix at @richvdh:sw1v.org

richvdh commented 5 years ago

Request from the matrix.org end shows:

2019-08-16 05:51:31,870 - synapse.http.matrixfederationclient - 520 - WARNING - federation_transaction_transmission_loop-2320408 - {PUT-O-1823760} [nitro.chat] Request failed: PUT matrix://nitro.chat/_matrix/federation/v1/send/1565867599575/: HttpResponseException("401: b'Unauthorized'")
gfiury commented 5 years ago

We fixed the problem doing the following:

  1. Restarting the server that has Synapse 1.0.0 version For some reason, I would like to confirm this fact with you guys, is that the key changed (and the signature) after restarting the server (not Synapse but the CentOS)
  2. The origin Server (where a user tries to invite another user from the Synapse 1.0.0) has 1.2.1 version. There is a section on the config to change the trusted keys. That sections is commented by default, but the logs showed that first tries the request to a notary server (matrix.org) and then proper server. We uncomment the property trusted_key_servers and set the proper server.
jans23 commented 5 years ago

As a user, the issue still exists and isn't solved for me.

richvdh commented 5 years ago

Again: please could somebody running a server which suffers from this problem share the DEBUG logs for a failing request so that we can investigate.

benksio commented 5 years ago

Hi, this seems the right message to me. If not, please give me a hint, what to grep for.

2019-08-21 12:08:26,074 - synapse.federation.transport.server - 288 - WARNING - PUT-19722- authenticate_request failed: 401: No key for jki.re with ids in ['ed25519:auto'] (min_validity 1566381852384)
2019-08-21 12:08:26,074 - synapse.http.server - 79 - INFO - PUT-19722- <XForwardedForRequest at 0x7fe4201b0f98 method='PUT' uri='/_matrix/federation/v1/send/1565688011035' clientproto='HTTP/1.0' site=8008> SynapseError: 401 - No key for jki.re with ids in ['ed25519:auto'] (min_validity 1566381852384)
2019-08-21 12:08:26,075 - synapse.http.server - 447 - WARNING - PUT-19722- Not sending response to request <XForwardedForRequest at 0x7fe4201b0f98 method='PUT' uri='/_matrix/federation/v1/send/1565688011035' clientproto='HTTP/1.0' site=8008>, already disconnected.
2019-08-21 12:08:26,075 - synapse.access.http.8008 - 302 - INFO - PUT-19722- 2a01:7e00::f03c:91ff:fe6e:411b - 8008 - {None} Processed request: 253.691sec/-206.697sec (0.006sec, 0.006sec) (0.001sec/0.036sec/3) 0B 200! "PUT /_matrix/federation/v1/send/1565688011035 HTTP/1.0" "Synapse/1.2.1 (b=erikj/lower_bound_ttl_well_known,8f33d49ab,dirty)" [0 dbevts]
2019-08-21 12:08:26,075 - synapse.crypto.keyring - 274 - DEBUG - PUT-19722- Releasing key lookup lock on jki.re
2019-08-21 12:08:26,076 - synapse.crypto.keyring - 258 - DEBUG - PUT-19731- Got key lookup lock on jki.re
2019-08-21 12:08:26,076 - synapse.crypto.keyring - 642 - INFO - PUT-19731- Requesting keys dict_items([('jki.re', {'ed25519:auto': 1566381903711})]) from notary server matrix.org
2019-08-21 12:08:26,076 - synapse.crypto.keyring - 313 - INFO - PUT-19744- Waiting for existing lookups for ['jki.re'] to complete [loop 2]
2019-08-21 12:08:26,077 - synapse.storage.txn - 408 - DEBUG - PUT-19731- [TXN START] {get_destination_retry_timings-d599}
2019-08-21 12:08:26,077 - synapse.storage.SQL - 164 - DEBUG - PUT-19731- [SQL] {get_destination_retry_timings-d599} SELECT destination, retry_last_ts, retry_interval FROM destinations WHERE destination = ?
2019-08-21 12:08:26,077 - synapse.storage.SQL - 169 - DEBUG - PUT-19731- [SQL values] {get_destination_retry_timings-d599} ['matrix.org']
2019-08-21 12:08:26,078 - synapse.storage.SQL - 183 - DEBUG - PUT-19731- [SQL time] {get_destination_retry_timings-d599} 0.000420 sec
2019-08-21 12:08:26,078 - synapse.storage.txn - 470 - DEBUG - PUT-19731- [TXN END] {get_destination_retry_timings-d599} 0.001383 sec
2019-08-21 12:08:26,079 - synapse.crypto.keyring - 594 - WARNING - PUT-19731- Key lookup failed from 'matrix.org': Failed to connect to remote server
2019-08-21 12:08:26,079 - synapse.http.matrixfederationclient - 404 - INFO - PUT-19731- {GET-O-1309} [jki.re] Sending request: GET matrix://jki.re/_matrix/key/v2/server/ed25519%3Aauto; timeout 10.000000s

[edited for formatting by @richvdh]

richvdh commented 5 years ago

@benksio there are several requests happening in parallel there. You can identify all the log lines for a given request by grepping for the PUT-nnnnn number, as per https://github.com/matrix-org/synapse/wiki#how-can-i-find-the-lines-corresponding-to-a-given-http-request-in-my-homeserver-log.

I can see a couple of things though:

benksio commented 5 years ago

Here you are:

2019-08-21 12:05:03,711 - synapse.access.http.8008 - 233 - INFO - PUT-19731- 2a01:7e00::f03c:91ff:fe6e:411b - 8008 - Received request: PUT /_matrix/federation/v1/send/1565688011035
2019-08-21 12:05:03,711 - synapse.crypto.keyring - 210 - DEBUG - PUT-19731- Verifying Incoming request for jki.re with key_ids ['ed25519:auto'], min_validity 1566381903711
2019-08-21 12:05:03,712 - synapse.crypto.keyring - 313 - INFO - PUT-19731- Waiting for existing lookups for ['jki.re'] to complete [loop 1]
2019-08-21 12:06:03,643 - synapse.http.site - 203 - WARNING - PUT-19731- Error processing request <XForwardedForRequest at 0x7fe4205d4390 method='PUT' uri='/_matrix/federation/v1/send/1565688011035' clientproto='HTTP/1.0' site=8008>: <class 'twisted.internet.error.ConnectionDone'> Connection was closed cleanly.
2019-08-21 12:08:26,076 - synapse.crypto.keyring - 258 - DEBUG - PUT-19731- Got key lookup lock on jki.re
2019-08-21 12:08:26,076 - synapse.crypto.keyring - 642 - INFO - PUT-19731- Requesting keys dict_items([('jki.re', {'ed25519:auto': 1566381903711})]) from notary server matrix.org
2019-08-21 12:08:26,077 - synapse.storage.txn - 408 - DEBUG - PUT-19731- [TXN START] {get_destination_retry_timings-d599}
2019-08-21 12:08:26,077 - synapse.storage.SQL - 164 - DEBUG - PUT-19731- [SQL] {get_destination_retry_timings-d599} SELECT destination, retry_last_ts, retry_interval FROM destinations WHERE destination = ?
2019-08-21 12:08:26,077 - synapse.storage.SQL - 169 - DEBUG - PUT-19731- [SQL values] {get_destination_retry_timings-d599} ['matrix.org']
2019-08-21 12:08:26,078 - synapse.storage.SQL - 183 - DEBUG - PUT-19731- [SQL time] {get_destination_retry_timings-d599} 0.000420 sec
2019-08-21 12:08:26,078 - synapse.storage.txn - 470 - DEBUG - PUT-19731- [TXN END] {get_destination_retry_timings-d599} 0.001383 sec
2019-08-21 12:08:26,079 - synapse.crypto.keyring - 594 - WARNING - PUT-19731- Key lookup failed from 'matrix.org': Failed to connect to remote server
2019-08-21 12:08:26,079 - synapse.http.matrixfederationclient - 404 - INFO - PUT-19731- {GET-O-1309} [jki.re] Sending request: GET matrix://jki.re/_matrix/key/v2/server/ed25519%3Aauto; timeout 10.000000s
2019-08-21 12:08:36,081 - synapse.http.matrixfederationclient - 428 - INFO - PUT-19731- Failed to send request: (10.0, 'Deferred')
2019-08-21 12:08:36,082 - synapse.http.matrixfederationclient - 485 - WARNING - PUT-19731- {GET-O-1309} [jki.re] Request failed: GET matrix://jki.re/_matrix/key/v2/server/ed25519%3Aauto: TimeoutError(10.0, 'Deferred')
2019-08-21 12:08:36,082 - synapse.crypto.keyring - 773 - WARNING - PUT-19731- Error looking up keys {'ed25519:auto': 1566381903711} from jki.re: Failed to connect to remote server
2019-08-21 12:08:36,082 - synapse.federation.transport.server - 288 - WARNING - PUT-19731- authenticate_request failed: 401: No key for jki.re with ids in ['ed25519:auto'] (min_validity 1566381903711)
2019-08-21 12:08:36,082 - synapse.http.server - 79 - INFO - PUT-19731- <XForwardedForRequest at 0x7fe4205d4390 method='PUT' uri='/_matrix/federation/v1/send/1565688011035' clientproto='HTTP/1.0' site=8008> SynapseError: 401 - No key for jki.re with ids in ['ed25519:auto'] (min_validity 1566381903711)
2019-08-21 12:08:36,083 - synapse.http.server - 447 - WARNING - PUT-19731- Not sending response to request <XForwardedForRequest at 0x7fe4205d4390 method='PUT' uri='/_matrix/federation/v1/send/1565688011035' clientproto='HTTP/1.0' site=8008>, already disconnected.
2019-08-21 12:08:36,083 - synapse.access.http.8008 - 302 - INFO - PUT-19731- 2a01:7e00::f03c:91ff:fe6e:411b - 8008 - {None} Processed request: 212.372sec/-152.440sec (0.000sec, 0.003sec) (0.000sec/0.001sec/1) 0B 200! "PUT /_matrix/federation/v1/send/1565688011035 HTTP/1.0" "Synapse/1.2.1 (b=erikj/lower_bound_ttl_well_known,8f33d49ab,dirty)" [0 dbevts]
2019-08-21 12:08:36,083 - synapse.crypto.keyring - 274 - DEBUG - PUT-19731- Releasing key lookup lock on jki.re
benksio commented 5 years ago

here is the ... | grep "[matrix.org] Request failed"

2019-08-21 11:59:06,450 - synapse.http.matrixfederationclient - 485 - WARNING - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Request failed: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386: TimeoutError(60, 'Deferred')
2019-08-21 12:01:21,431 - synapse.http.matrixfederationclient - 485 - WARNING - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Request failed: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386: TimeoutError(60, 'Deferred')
2019-08-21 12:03:27,044 - synapse.http.matrixfederationclient - 485 - WARNING - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Request failed: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386: TimeoutError(60, 'Deferred')
2019-08-21 12:05:12,390 - synapse.http.matrixfederationclient - 485 - WARNING - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Request failed: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386: CancelledError()
2019-08-21 12:05:12,391 - synapse.http.matrixfederationclient - 485 - WARNING - PUT-19722- {POST-O-1306} [matrix.org] Request failed: POST matrix://matrix.org/_matrix/key/v2/query: TimeoutError(60, 'Deferred')
2019-08-21 12:06:13,031 - synapse.http.matrixfederationclient - 485 - WARNING - PUT-19722- {POST-O-1306} [matrix.org] Request failed: POST matrix://matrix.org/_matrix/key/v2/query: TimeoutError(60, 'Deferred')
2019-08-21 12:07:14,264 - synapse.http.matrixfederationclient - 485 - WARNING - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Request failed: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386: CancelledError()
2019-08-21 12:07:14,265 - synapse.http.matrixfederationclient - 485 - WARNING - PUT-19722- {POST-O-1306} [matrix.org] Request failed: POST matrix://matrix.org/_matrix/key/v2/query: TimeoutError(60, 'Deferred')
2019-08-21 12:08:16,063 - synapse.http.matrixfederationclient - 485 - WARNING - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Request failed: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386: CancelledError()
2019-08-21 12:08:16,065 - synapse.http.matrixfederationclient - 485 - WARNING - PUT-19722- {POST-O-1306} [matrix.org] Request failed: POST matrix://matrix.org/_matrix/key/v2/query: TimeoutError(60, 'Deferred')
zfnmxt commented 5 years ago

I'm also getting a 401 error, although I'm too unfamiliar to tell if it's for key reasons. Didn't want to open another issue unnecessarily, though. Here's my log if you can glimmer anything out of it:

https://gist.githubusercontent.com/zfnmxt/267ad8c18cfaf61f507d2be6754a05af/raw/0350c223bd0e734092d64b9afd9efcc12926fe89/synapse.txt

zfnmxt commented 5 years ago

Seems like I only have issues with matrix.org. I'm able to join rooms on other servers without issue.

jans23 commented 5 years ago

This is the same for me.

richvdh commented 5 years ago

@benksio: so it looks like outbound requests (to both jki.re and matrix.org) are timing out, which is a bit odd. I don't see those requests arriving at the matrix.org end. Can you share the results of grep federation_transaction_transmission_loop-1873 ?

(Also: what timezone are your logs in? Looks like UTC+3 ?)

jans23 commented 5 years ago

Just a random idea: Might this issue be related to IPv6?

richvdh commented 5 years ago

@jans23 plausibly

richvdh commented 5 years ago

@zfnmxt I think your issue is unrelated: the 401 in your logs is coming from matrix.org, though I can't see the failure on the matrix.org side. Please remember to tell us what timezone your logs are in. Also, note that your logs contain some secret access tokens, which you should consider compromised: log out your client to invalidate the access token.

zfnmxt commented 5 years ago

@richvdh I figured re: tokens. Timezone is GMT+2.

benksio commented 5 years ago

the Timezone ist CEST

... | grep "federation_transaction_transmission_loop-1873"

2019-08-21 11:59:06,450 - synapse.http.matrixfederationclient - 428 - INFO - federation_transaction_transmission_loop-1873- Failed to send request: (60, 'Deferred')
2019-08-21 11:59:06,450 - synapse.http.matrixfederationclient - 485 - WARNING - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Request failed: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386: TimeoutError(60, 'Deferred')
2019-08-21 11:59:06,450 - synapse.http.matrixfederationclient - 505 - DEBUG - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Waiting 74.97580974660222s before re-sending...
2019-08-21 12:00:21,428 - synapse.http.matrixfederationclient - 404 - INFO - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Sending request: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386; timeout 60.000000s
2019-08-21 12:01:21,431 - synapse.http.matrixfederationclient - 428 - INFO - federation_transaction_transmission_loop-1873- Failed to send request: (60, 'Deferred')
2019-08-21 12:01:21,431 - synapse.http.matrixfederationclient - 485 - WARNING - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Request failed: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386: TimeoutError(60, 'Deferred')
2019-08-21 12:01:21,432 - synapse.http.matrixfederationclient - 505 - DEBUG - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Waiting 65.60553046220608s before re-sending...
2019-08-21 12:02:27,040 - synapse.http.matrixfederationclient - 404 - INFO - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Sending request: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386; timeout 60.000000s
2019-08-21 12:03:27,044 - synapse.http.matrixfederationclient - 428 - INFO - federation_transaction_transmission_loop-1873- Failed to send request: (60, 'Deferred')
2019-08-21 12:03:27,044 - synapse.http.matrixfederationclient - 485 - WARNING - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Request failed: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386: TimeoutError(60, 'Deferred')
2019-08-21 12:03:27,044 - synapse.http.matrixfederationclient - 505 - DEBUG - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Waiting 64.91518843273639s before re-sending...
2019-08-21 12:04:31,961 - synapse.http.matrixfederationclient - 404 - INFO - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Sending request: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386; timeout 60.000000s
2019-08-21 12:05:12,390 - synapse.http.matrixfederationclient - 428 - INFO - federation_transaction_transmission_loop-1873- Failed to send request: 
2019-08-21 12:05:12,390 - synapse.http.matrixfederationclient - 485 - WARNING - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Request failed: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386: CancelledError()
2019-08-21 12:05:12,390 - synapse.http.matrixfederationclient - 505 - DEBUG - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Waiting 83.23851162831703s before re-sending...
2019-08-21 12:06:35,630 - synapse.http.matrixfederationclient - 404 - INFO - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Sending request: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386; timeout 60.000000s
2019-08-21 12:07:14,264 - synapse.http.matrixfederationclient - 428 - INFO - federation_transaction_transmission_loop-1873- Failed to send request: 
2019-08-21 12:07:14,264 - synapse.http.matrixfederationclient - 485 - WARNING - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Request failed: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386: CancelledError()
2019-08-21 12:07:14,265 - synapse.http.matrixfederationclient - 505 - DEBUG - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Waiting 48.31573290159327s before re-sending...
2019-08-21 12:08:02,583 - synapse.http.matrixfederationclient - 404 - INFO - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Sending request: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386; timeout 60.000000s
2019-08-21 12:08:16,062 - synapse.http.matrixfederationclient - 428 - INFO - federation_transaction_transmission_loop-1873- Failed to send request: 
2019-08-21 12:08:16,063 - synapse.http.matrixfederationclient - 485 - WARNING - federation_transaction_transmission_loop-1873- {PUT-O-1305} [matrix.org] Request failed: PUT matrix://matrix.org/_matrix/federation/v1/send/1566201506386: CancelledError()
2019-08-21 12:08:16,063 - synapse.util.retryutils - 212 - INFO - federation_transaction_transmission_loop-1873- Connection to matrix.org was unsuccessful (<class 'synapse.api.errors.RequestSendFailed'>(Failed to send request: CancelledError: )); backoff now 3000000
2019-08-21 12:08:16,064 - synapse.federation.sender.per_destination_queue - 318 - WARNING - federation_transaction_transmission_loop-1873- TX [matrix.org] Failed to send transaction: Failed to send request: CancelledError: 
2019-08-21 12:08:16,064 - synapse.storage.txn - 408 - DEBUG - federation_transaction_transmission_loop-1873- [TXN START] {set_destination_retry_timings-d58f}
2019-08-21 12:08:16,065 - synapse.storage.SQL - 164 - DEBUG - federation_transaction_transmission_loop-1873- [SQL] {set_destination_retry_timings-d58f} INSERT INTO destinations (destination, retry_last_ts, retry_interval) VALUES (?, ?, ?) ON CONFLICT (destination) DO UPDATE SET retry_last_ts = EXCLUDED.retry_last_ts, retry_interval = EXCLUDED.retry_interval WHERE EXCLUDED.retry_interval = 0 OR destinations.retry_interval < EXCLUDED.retry_interval
2019-08-21 12:08:16,065 - synapse.storage.SQL - 169 - DEBUG - federation_transaction_transmission_loop-1873- [SQL values] {set_destination_retry_timings-d58f} ('matrix.org', 1566382096063, 3000000)
2019-08-21 12:08:16,067 - synapse.storage.SQL - 183 - DEBUG - federation_transaction_transmission_loop-1873- [SQL time] {set_destination_retry_timings-d58f} 0.000747 sec
2019-08-21 12:08:16,081 - synapse.storage.txn - 470 - DEBUG - federation_transaction_transmission_loop-1873- [TXN END] {set_destination_retry_timings-d58f} 0.017097 sec
2019-08-21 12:08:16,082 - synapse.logging.context - 500 - DEBUG - federation_transaction_transmission_loop-1873- Restoring dead context: federation_transaction_transmission_loop-1873
jans23 commented 5 years ago

@richvdh Is there anything we could help you with solving this issue?

richvdh commented 5 years ago

That's odd. I could have sworn I'd replied here.

It looks to me like the DNS lookups might be timing out. Please could you show the results of:

dig -t SRV _matrix._tcp.matrix.org
benksio commented 5 years ago

dig -t SRV _matrix._tcp.matrix.org


; <<>> DiG 9.10.3-P4-Debian <<>> -t SRV _matrix._tcp.matrix.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24833
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;_matrix._tcp.matrix.org.   IN  SRV

;; ANSWER SECTION:
_matrix._tcp.matrix.org. 220    IN  SRV 10 5 8443 matrix.org.cdn.cloudflare.net.

;; Query time: 0 msec
;; SERVER: fe80::3285:a9ff:fea6:1ee7%65#53(fe80::3285:a9ff:fea6:1ee7%65)
;; WHEN: Tue Aug 27 21:41:39 CEST 2019
;; MSG SIZE  rcvd: 101
richvdh commented 5 years ago

That looks fine, though I'm wondering if the link-local IPv6 DNS server could be causing problems. Are you able to test using a DNS server on an IPv4 address or a global IPv6 address?

benksio commented 5 years ago

I configured the server to use an IPv4 DNS Server. Although I think the former one is not guilty.

dig -t SRV _matrix._tcp.matrix.org

; <<>> DiG 9.10.3-P4-Debian <<>> -t SRV _matrix._tcp.matrix.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53627
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;_matrix._tcp.matrix.org.   IN  SRV

;; ANSWER SECTION:
_matrix._tcp.matrix.org. 12 IN  SRV 10 5 8443 matrix.org.cdn.cloudflare.net.

;; Query time: 12 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Fri Sep 06 10:33:38 CEST 2019
;; MSG SIZE  rcvd: 101

Can we provide any other information, to figure out why federation doesn't work?

richvdh commented 5 years ago

I configured the server to use an IPv4 DNS Server

And you're still having the same problem? Do the logs look the same?

jans23 commented 5 years ago

Now for the first time I can send messages between both instances. However at matrix.org I experienced several "server error" in Riot and I had to try resending several times until my message got through to nitro.chat. The other direction (from nitro.chat to matrix.org) didn't had this issue. It feels a bit shaky so I'm not sure this issue is entirely solved or not. @benksio A look in the log might help.

richvdh commented 5 years ago

sounds like a different problem then. I suggest seeking support in #synapse:matrix.org in the first instance, and if you find a bug, raising a new issue with the new symptoms.

richvdh commented 5 years ago

Have raised this as a bug in twisted at https://twistedmatrix.com/trac/ticket/9704.

benksio commented 5 years ago

Thx for your hints. The Issue seems to be solved now by having changed the DNS Server.

jans23 commented 5 years ago

IMHO we just found a workaround and the actual (implementation) issue isn't solved. In this case this issue ticket should remain open, right?

sudwhiwdh commented 4 years ago

@richvdh Hello, what does it take to solve this issue?

pranaysashank commented 4 years ago

I have the same issue and my dns server is on ipv4 not ipv6

richvdh commented 4 years ago

@richvdh Hello, what does it take to solve this issue?

a fix to the upstream issue. Follow the links; maybe try the suggested fix.

I have the same issue and my dns server is on ipv4 not ipv6

if your dns server isn't on a link-local IPv6 address, it is a different issue.