Closed Dominik1999 closed 4 years ago
Have seen it again:
MFEE3 failed last night with Payment couldn't be completed because: there is no route available
@karlb @palango
The error is NoRouteFound('Source not online')"
, so it looks like we're having problems with the presence handling.
The error is
NoRouteFound('Source not online')"
, so it looks like we're having problems with the presence handling.
Damn, and I just removed the presence logging yesterday: https://github.com/raiden-network/raiden-services/pull/679
@karlb can you add to the error message Source not online but payment received - stupid source!
The PFS has not received a ChannelOpened event for the potential initiator, so it does not track its presence and assumes that it's offline when it requests a route.
so that does not sound matrix related, right?
Yes, that's not matrix related. It looks like we're skipping a block whenever a new channel is created. Should be reproducible in a unit test. I've opened https://github.com/raiden-network/raiden/issues/5443 for this problem.
@Dominik1999 to add logs from friday afternoon
Logs from Fri 6th of December 15:00h CET @karlb
in line 201 in
{"task": "<SerialTask: {'name': 'Make 10 payments from 0 to 3 after restart'
we got
{\"errors\": \"Payment couldn't be completed because: there is no route available\"}"}
in the second transfer task from Node 0 to Node 3
{"task": "<TransferTask: {'from': 0, 'to': 3, 'amount': 1000000000000000, 'lock_timeout': 30}>"
On 7th December 2019 The BF4 scenario fails has a No route available
failure
{
"task": "<TransferTask: {'from': 0, 'to': 3, 'amount': 1000000000000000, 'lock_timeout': 30}>",
"event": "Task errored",
"logger": "scenario_player.tasks.base",
"level": "error",
"timestamp": "2019-12-07 02:01:41.228116",
"exception": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.7/site-packages/scenario_player/tasks/base.py\", line 74, in __call__\n return self._run(*args, **kwargs)\n File \"/usr/local/lib/python3.7/site-packages/scenario_player/tasks/api_base.py\", line 59, in _run\n f'HTTP status code \"{resp.status_code}\" while fetching {url}. '\nscenario_player.exceptions.legacy.RESTAPIStatusMismatchError: HTTP status code \"409\" while fetching http://127.0.0.1:55899/api/v1/payments/0x62083c80353Df771426D209eF578619EE68D5C7A/0x0B336fB2F7E68A316dDE5E08a00a7c3Fb4C426b3. Expected 2..: {\"errors\": \"Payment couldn't be completed because: there is no route available\"}"
}
what exception say:
Payment couldn't be completed because: there is no route available
while fetching request
http://127.0.0.1:55899/api/v1/payments/0x62083c80353Df771426D209eF578619EE68D5C7A/0x0B336fB2F7E68A316dDE5E08a00a7c3Fb4C426b3
- transfer: {from: 0, to: 3, amount: 1_000_000_000_000_000, lock_timeout: 30}
https://drive.google.com/drive/folders/13DV21NX3X_vKUOpRIobhQUA0w8sx1vs6?usp=sharing
Both of the two cases directly above are caused by NoRouteFound('Source does not have a channel with sufficient capacity')
. This message is surprising, considering that the previously processed CapacityUpdates look good (output from the PFS log for @agatsoh's BF4 error):
{
"message": {
"signature": "0x022c0503436d6d9a636c1272421bef60ed092abba32699c70cb021b5e1af848f758b1cef7751e2b8be3d18dae14c943da962df27993e837f8161f785155276711b",
"canonical_identifier": {
"chain_identifier": 5,
"token_network_address": "0xDa1fBc048f503635950058953f5c60FC1F564ee6",
"channel_identifier": 5090
},
"updating_participant": "0x512776974e648921Fa2Ccd829E1f29389560fd28",
"other_participant": "0xD2298fd940C67Ae209cA504825f0D5C6e8261ADf",
"updating_nonce": 0,
"other_nonce": 0,
"updating_capacity": 1e+18,
"other_capacity": 1e+18,
"reveal_timeout": 20,
"type_name": "PFSCapacityUpdate"
},
"event": "Received Capacity Update",
"level": "debug",
"logger": "pathfinding_service.service",
"timestamp": "2019-12-07 01:59:57.076776"
}
{
"message": {
"signature": "0x8596a8fa3ee973e298d2e6e9bdeebaa5e179658322bd7a747133363501a41bfa1594f1025a26d6c650e7abe45afcc9e435385554934d54208660c1e9333c60731c",
"canonical_identifier": {
"chain_identifier": 5,
"token_network_address": "0xDa1fBc048f503635950058953f5c60FC1F564ee6",
"channel_identifier": 5090
},
"updating_participant": "0xD2298fd940C67Ae209cA504825f0D5C6e8261ADf",
"other_participant": "0x512776974e648921Fa2Ccd829E1f29389560fd28",
"updating_nonce": 0,
"other_nonce": 0,
"updating_capacity": 1e+18,
"other_capacity": 1e+18,
"reveal_timeout": 20,
"type_name": "PFSCapacityUpdate"
},
"event": "Received Capacity Update",
"level": "debug",
"logger": "pathfinding_service.service",
"timestamp": "2019-12-07 01:59:57.846347"
}
For BF1 I see:
{
"event": "Pathfinding Service can not find a route: Target not online (2201).",
"logger": "raiden.network.pathfinding",
"level": "info",
"timestamp": "2019-12-10 23:39:45.594536"
}
Edit:
The target here is 0xD2dE5af1c5Bc99837A70f09B31bd339de24b739D
. This address goes offline
shortly before the request comes in.
{"current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport02.raiden.network", "node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "user_id": "@0xd2de5af1c5bc99837a70f09b31bd339de24b739d:transport04.raiden.network", "prev_state": "<UserPresence.ONLINE: 'online'>", "state": "<UserPresence.OFFLINE: 'offline'>", "event": "Changing user presence state", "level": "debug", "logger": "raiden.network.transport.matrix.utils", "timestamp": "2019-12-10 23:38:24.759751"}
{"current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport02.raiden.network", "node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "address": "0xD2dE5af1c5Bc99837A70f09B31bd339de24b739D", "prev_state": "<AddressReachability.REACHABLE: 1>", "state": "<AddressReachability.UNREACHABLE: 2>", "event": "Changing address reachability state", "level": "debug", "logger": "raiden.network.transport.matrix.utils", "timestamp": "2019-12-10 23:38:24.760319"}
{"from_block": 1800075, "end_block": 1800075, "event": "Querying new block(s)", "level": "info", "logger": "raiden_libs.blockchain", "timestamp": "2019-12-10 23:38:35.133549"}
{"latest_committed_block": 1800075, "event": "Updating latest_committed_block", "level": "info", "logger": "pathfinding_service.database", "timestamp": "2019-12-10 23:38:35.900344"}
{"from_block": 1800075, "to_block": 1800075, "getting": 0.77, "processing": 0.0, "total_duration": 0.77, "event_counts": {"UpdatedHeadBlockEvent": 1}, "event": "Processed blocks", "level": "info", "logger": "pathfinding_service.service", "timestamp": "2019-12-10 23:38:35.902642"}
{"node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "first_sync": false, "sync_token": "s676746_549755_0_28_447_1_1_40080_1", "current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport02.raiden.network", "event": "Starting handle greenlet", "level": "debug", "logger": "raiden.network.transport.matrix.client", "timestamp": "2019-12-10 23:38:44.957015"}
{"node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport02.raiden.network", "event": "Sync called", "level": "debug", "logger": "raiden.network.transport.matrix.client", "timestamp": "2019-12-10 23:38:44.957663"}
{"from_block": 1800076, "end_block": 1800076, "event": "Querying new block(s)", "level": "info", "logger": "raiden_libs.blockchain", "timestamp": "2019-12-10 23:38:49.966207"}
{"latest_committed_block": 1800076, "event": "Updating latest_committed_block", "level": "info", "logger": "pathfinding_service.database", "timestamp": "2019-12-10 23:38:50.770848"}
{"from_block": 1800076, "to_block": 1800076, "getting": 0.8, "processing": 0.0, "total_duration": 0.81, "event_counts": {"UpdatedHeadBlockEvent": 1}, "event": "Processed blocks", "level": "info", "logger": "pathfinding_service.service", "timestamp": "2019-12-10 23:38:50.773149"}
{"from_block": 1800077, "end_block": 1800077, "event": "Querying new block(s)", "level": "info", "logger": "raiden_libs.blockchain", "timestamp": "2019-12-10 23:39:04.838924"}
{"node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "first_sync": false, "sync_token": "s676746_549755_0_28_447_1_1_40080_1", "current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport02.raiden.network", "event": "Starting handle greenlet", "level": "debug", "logger": "raiden.network.transport.matrix.client", "timestamp": "2019-12-10 23:39:05.153201"}
{"node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport02.raiden.network", "event": "Sync called", "level": "debug", "logger": "raiden.network.transport.matrix.client", "timestamp": "2019-12-10 23:39:05.153854"}
{"latest_committed_block": 1800077, "event": "Updating latest_committed_block", "level": "info", "logger": "pathfinding_service.database", "timestamp": "2019-12-10 23:39:05.703277"}
{"from_block": 1800077, "to_block": 1800077, "getting": 0.86, "processing": 0.0, "total_duration": 0.87, "event_counts": {"UpdatedHeadBlockEvent": 1}, "event": "Processed blocks", "level": "info", "logger": "pathfinding_service.service", "timestamp": "2019-12-10 23:39:05.704868"}
{"from_block": 1800078, "end_block": 1800078, "event": "Querying new block(s)", "level": "info", "logger": "raiden_libs.blockchain", "timestamp": "2019-12-10 23:39:21.778351"}
{"latest_committed_block": 1800078, "event": "Updating latest_committed_block", "level": "info", "logger": "pathfinding_service.database", "timestamp": "2019-12-10 23:39:22.533816"}
{"from_block": 1800078, "to_block": 1800078, "getting": 0.76, "processing": 0.0, "total_duration": 0.76, "event_counts": {"UpdatedHeadBlockEvent": 1}, "event": "Processed blocks", "level": "info", "logger": "pathfinding_service.service", "timestamp": "2019-12-10 23:39:22.535837"}
{"node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "first_sync": false, "sync_token": "s676746_549762_0_28_447_1_1_40080_1", "current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport02.raiden.network", "event": "Starting handle greenlet", "level": "debug", "logger": "raiden.network.transport.matrix.client", "timestamp": "2019-12-10 23:39:25.364018"}
{"node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport02.raiden.network", "event": "Sync called", "level": "debug", "logger": "raiden.network.transport.matrix.client", "timestamp": "2019-12-10 23:39:25.364427"}
{"from_block": 1800079, "end_block": 1800079, "event": "Querying new block(s)", "level": "info", "logger": "raiden_libs.blockchain", "timestamp": "2019-12-10 23:39:34.594634"}
{"latest_committed_block": 1800079, "event": "Updating latest_committed_block", "level": "info", "logger": "pathfinding_service.database", "timestamp": "2019-12-10 23:39:35.376223"}
{"from_block": 1800079, "to_block": 1800079, "getting": 0.78, "processing": 0.0, "total_duration": 0.78, "event_counts": {"UpdatedHeadBlockEvent": 1}, "event": "Processed blocks", "level": "info", "logger": "pathfinding_service.service", "timestamp": "2019-12-10 23:39:35.378328"}
[36mpfs-goerli-with-fee_1 |[0m 172.21.0.17 - - [2019-12-10 23:39:45] "GET /api/v1/0xDa1fBc048f503635950058953f5c60FC1F564ee6/payment/iou?sender=0xf2c690389111a499E811C2d4701a0d1bF3f6324E&receiver=0x062C12c01D0f17fC9eAa33940D994594d91a0182×tamp=2019-12-10T23%3A39%3A45&signature=0xa14fff77e9188adb5642097e51b02432bdc641102aa10042bf436d40e8bb7a432e8899448db8943bca72f959c496922fc018cecc591c4a46e6ab5270773ed6231b HTTP/1.1" 200 654 0.005259
{"node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "first_sync": false, "sync_token": "s676746_549762_0_28_447_1_1_40080_1", "current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport02.raiden.network", "event": "Starting handle greenlet", "level": "debug", "logger": "raiden.network.transport.matrix.client", "timestamp": "2019-12-10 23:39:45.568061"}
{"node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport02.raiden.network", "event": "Sync called", "level": "debug", "logger": "raiden.network.transport.matrix.client", "timestamp": "2019-12-10 23:39:45.568462"}
{"sender": "0xf2c690389111a499E811C2d4701a0d1bF3f6324E", "expected_amount": 10600, "total_amount": 10600, "added_amount": 10500, "event": "Received service fee", "level": "info", "logger": "pathfinding_service.api", "timestamp": "2019-12-10 23:39:45.583502"}
{"error": "NoRouteFound('Target not online')", "details": {"from_": "0xf2c690389111a499E811C2d4701a0d1bF3f6324E", "to": "0xD2dE5af1c5Bc99837A70f09B31bd339de24b739D", "value": 1000000000000000}, "message": "Target not online", "event": "Error while handling request", "level": "warning", "logger": "pathfinding_service.api", "timestamp": "2019-12-10 23:39:45.587017"}
For BF2:
{
"event": "Pathfinding Service can not find a route: No route between nodes found. (2201).",
"logger": "raiden.network.pathfinding",
"level": "info",
"timestamp": "2019-12-11 00:25:23.166211"
}
Edit:
More infos: Theo nodestwo nodes go:
"node": 0, "address": "0x0Ed0a3F4a3B95b3f1118eCca2E856BcD8cb632c8"
"node": 1, "address": "0x262582E09B7Db064623F99F91d22f2fFF31e3EC8"
"node": 2, "address": "0x0D6a6017f2e98fF455a9ff299550FF4aAAeAFdBf"
"node": 3, "address": "0x8F7cbBA93Ee46adee494e2FfbFDe7b731a99fB4f"
"node": 4, "address": "0xc531aF2895498d16627C954240E59896b486048f"
Here again, two nodes change the presence short before the request:
[36mpfs-goerli_1 |[0m {"current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport04.raiden.network", "node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "user_id": "@0x0ed0a3f4a3b95b3f1118ecca2e856bcd8cb632c8:transport04.raiden.network", "prev_state": null, "state": "<UserPresence.ONLINE: 'online'>", "event": "Changing user presence state", "level": "debug", "logger": "raiden.network.transport.matrix.utils", "timestamp": "2019-12-10 23:43:20.923612"}
[36mpfs-goerli_1 |[0m {"current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport04.raiden.network", "node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "user_id": "@0x262582e09b7db064623f99f91d22f2fff31e3ec8:transport04.raiden.network", "prev_state": "<UserPresence.ONLINE: 'online'>", "state": "<UserPresence.OFFLINE: 'offline'>", "event": "Changing user presence state", "level": "debug", "logger": "raiden.network.transport.matrix.utils", "timestamp": "2019-12-11 00:23:39.477808"}
[36mpfs-goerli_1 |[0m {"current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport04.raiden.network", "node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "user_id": "@0x0d6a6017f2e98ff455a9ff299550ff4aaaeafdbf:transport04.raiden.network", "prev_state": null, "state": "<UserPresence.ONLINE: 'online'>", "event": "Changing user presence state", "level": "debug", "logger": "raiden.network.transport.matrix.utils", "timestamp": "2019-12-10 23:43:20.830678"}
[36mpfs-goerli_1 |[0m {"current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport04.raiden.network", "node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "user_id": "@0x8f7cbba93ee46adee494e2ffbfde7b731a99fb4f:transport04.raiden.network", "prev_state": null, "state": "<UserPresence.ONLINE: 'online'>", "event": "Changing user presence state", "level": "debug", "logger": "raiden.network.transport.matrix.utils", "timestamp": "2019-12-10 23:43:20.884757"}
[36mpfs-goerli_1 |[0m {"current_user": "@0x062c12c01d0f17fc9eaa33940d994594d91a0182:transport04.raiden.network", "node": "0x062C12c01D0f17fC9eAa33940D994594d91a0182", "user_id": "@0xc531af2895498d16627c954240e59896b486048f:transport04.raiden.network", "prev_state": null, "state": "<UserPresence.ONLINE: 'online'>", "event": "Changing user presence state", "level": "debug", "logger": "raiden.network.transport.matrix.utils", "timestamp": "2019-12-10 23:53:35.141397"}
For BF6:
{"error": "ServiceRequestIOURejected('The provided payment is lower than service fee. (2104)')", "event": "Pathfinding Service rejected IOU", "logger": "raiden.network.pathfinding", "level": "debug", "timestamp": "2019-12-11 00:34:11.054577"}
{"log_message": "The provided payment is lower than service fee. (2104)", "event": "An error with the path request occured", "logger": "raiden.routing", "level": "warning", "timestamp": "2019-12-11 00:34:11.055220"}
All I find here is:
[36mpfs-goerli-with-fee_1 |[0m 172.21.0.17 - - [2019-12-11 00:34:11] "POST /api/v1/0xDa1fBc048f503635950058953f5c60FC1F564ee6/paths HTTP/1.1" 200 486 0.257421
[36mpfs-goerli-with-fee_1 |[0m {"error": "InsufficientServicePayment(None)", "details": {"expected_amount": 400}, "message": "The provided payment is lower than service fee.", "event": "Error while handling request", "level": "warning", "logger": "pathfinding_service.api", "timestamp": "2019-12-11 00:34:11.049139"}
The server-side presence monitoring shows this:
@0xd2de5af1c5bc99837a70f09b31bd339de24b739d:transport04.raiden.network
transport01
:
2019-12-10 22:03:12.563158
: <UserPresence.OFFLINE: 'offline'>
2019-12-10 22:03:30.606891
: <UserPresence.ONLINE: 'online'>
2019-12-10 23:38:24.389990
: <UserPresence.OFFLINE: 'offline'>
transport02
:
2019-12-10 22:03:12.605506
: <UserPresence.OFFLINE: 'offline'>
2019-12-10 22:03:28.030473
: <UserPresence.ONLINE: 'online'>
2019-12-10 23:38:24.673363
: <UserPresence.OFFLINE: 'offline'>
transport03
:
2019-12-10 22:03:11.255074
: <UserPresence.OFFLINE: 'offline'>
2019-12-10 22:03:28.593282
: <UserPresence.ONLINE: 'online'>
2019-12-10 23:38:24.034341
: <UserPresence.OFFLINE: 'offline'>
transport04
:
2019-12-10 22:03:10.666433
: <UserPresence.OFFLINE: 'offline'>
2019-12-10 22:03:27.887791
: <UserPresence.ONLINE: 'online'>
2019-12-10 23:38:23.910300
: <UserPresence.OFFLINE: 'offline'>
So this does not look like a return of the presence issue since all servers agree about the online state and see the correct state update within ~2s of each other.
(The log files were truncated on transport02 and 04, I assume because of the disk full issue earlier that week, but the docker compose logs still had the relevant information.)
So this does not look like a return of the presence issue since all servers agree about the online state and see the correct state update within ~2s of each other.
With presence issue I meant more that we're seeing online nodes (hopefully) as offline. And nice to know that this is consistent among the servers.
What's interesting about BF2:
The last sync
call, was at 00:25:14, which is 9secs before the PFS request.
{
"node": "0x0Ed0a3F4a3B95b3f1118eCca2E856BcD8cb632c8",
"first_sync": false,
"sync_token": "s1370554_915524_0_35_973_5_1_32899_1",
"current_user": "@0x0ed0a3f4a3b95b3f1118ecca2e856bcd8cb632c8:transport04.raiden.network",
"presence_events_qty": 14,
"to_device_events_qty": 0,
"rooms_invites_qty": 0,
"rooms_leaves_qty": 0,
"rooms_joined_member_count": 0,
"rooms_invited_member_count": 0,
"rooms_join_state_qty": 0,
"rooms_join_state_events_qty": 0,
"rooms_join_ephemeral_events_qty": 0,
"rooms_join_account_data_events_qty": 0,
"event": "Starting _handle_response",
"logger": "raiden.network.transport.matrix.client",
"level": "debug",
"greenlet_name": "GMatrixClient.listen_forever user_id:@0x0ed0a3f4a3b95b3f1118ecca2e856bcd8cb632c8:transport04.raiden.network",
"timestamp": "2019-12-11 00:25:14.524498"
}
{
"node": "0x0Ed0a3F4a3B95b3f1118eCca2E856BcD8cb632c8",
"current_user": "@0x0ed0a3f4a3b95b3f1118ecca2e856bcd8cb632c8:transport04.raiden.network",
"event": "Sync called",
"logger": "raiden.network.transport.matrix.client",
"level": "debug",
"greenlet_name": "GMatrixClient.listen_forever user_id:@0x0ed0a3f4a3b95b3f1118ecca2e856bcd8cb632c8:transport04.raiden.network",
"timestamp": "2019-12-11 00:25:14.525826"
}
This issue cannot be worked on atm. We first fix #5498 and see if it happens again
Meta Issue for the Symptom "No route found"
Recently we have seen some scenarios failing with the message
Payment couldn't be completed because: there is no route available
. This is a Meta Issue to track the Symptom and to be able to close all related issues to that.It does not make sense to keep those issues open, since we try to reproduce the issues to get fresh logs together with the pfs logs. Without the corresponding pfs logs, we cannot analyze further.
Issue checklist
To help us more easily debug your issue please check the boxes for which requirements for safe usage or Raiden were followed: