Closed agatsoh closed 3 years ago
I found this error, I'm just surprised that an expired transfer should crash the node:
127.0.0.1 - - "POST /api/v1/payments/0x59105441977ecD9d805A4f5b060E34676F50F806/0x5409D2B144a450C459A86c6352898Ef759EdE198 HTTP/1.1" 500 742 "-" "python-requests/2.24.0"
InternalServerError: Transfer expired.
at Object.internalErrorHandler (/lc/index.js:222762:31)
at Layer.handle_error (/lc/index.js:47797:5)
at trim_prefix (/lc/index.js:47371:13)
at /lc/index.js:47340:7
at Function.process_params (/lc/index.js:47391:12)
at next (/lc/index.js:47331:10)
at Layer.handle_error (/lc/index.js:47793:12)
at trim_prefix (/lc/index.js:47371:13)
at /lc/index.js:47340:7
at Function.process_params (/lc/index.js:47391:12)
There's also this higher up, which I cannot decypher:
2020-11-04T01:48:21.228Z [error] Error sending message to global room { roomName: 'raiden_goerli_path_finding' } {
type: 'PFSCapacityUpdate',
canonical_identifier: {
chain_identifier: BigNumber { _hex: '0x05' },
token_network_address: '0x3EA2a1fED7FdEf300DA19E97092Ce8FdF8bf66A3',
channel_identifier: BigNumber { _hex: '0x25ce' }
},
updating_participant: '0x421231BADBc8aE3aC3bA340E8d00a5ca129c6257',
other_participant: '0x0b49e040F03a329C8a2e406190a9A9647166c2E0',
updating_nonce: BigNumber { _hex: '0x03' },
other_nonce: BigNumber { _hex: '0x00' },
updating_capacity: BigNumber { _hex: '0x0dd99bb65dd70000' },
other_capacity: BigNumber { _hex: '0x038d7ea4c68000' },
reveal_timeout: BigNumber { _hex: '0x14' },
signature: '0x14994a5654c77acaa899167facfd56b5f25bdd13fc27b887c015ddfd3002b255661b43cf80342406757800b717682d1ff26b8d76c529451072dd9dd0e531b0131c'
} <ref *1> M_UNKNOWN: Internal server error
at parseErrorResponse (/lc/index.js:110946:13)
at /lc/index.js:110901:17
at Request._callback (/lc/index.js:110835:9)
at Request.self.callback (/lc/index.js:166091:22)
at Request.emit (events.js:315:20)
at Request.<anonymous> (/lc/index.js:167060:10)
at Request.emit (events.js:315:20)
at IncomingMessage.<anonymous> (/lc/index.js:166982:12)
at Object.onceWrapper (events.js:421:28)
at IncomingMessage.emit (events.js:327:22) {
errcode: 'M_UNKNOWN',
data: { errcode: 'M_UNKNOWN', error: 'Internal server error' },
httpStatus: 500,
event: MatrixEvent {
event: {
type: 'm.room.message',
content: {
body: '{"type":"PFSCapacityUpdate","canonical_identifier":{"chain_identifier":"5","token_network_address":"0x3EA2a1fED7FdEf300DA19E97092Ce8FdF8bf66A3","channel_identifier":"9678"},"updating_participant":"0x421231BADBc8aE3aC3bA340E8d00a5ca129c6257","other_participant":"0x0b49e040F03a329C8a2e406190a9A9647166c2E0","updating_nonce":"3","other_nonce":"0","updating_capacity":"998000000000000000","other_capacity":"1000000000000000","reveal_timeout":"20","signature":"0x14994a5654c77acaa899167facfd56b5f25bdd13fc27b887c015ddfd3002b255661b43cf80342406757800b717682d1ff26b8d76c529451072dd9dd0e531b0131c"}',
msgtype: 'm.text'
},
event_id: '~!itpaijJrCycYDeCyPj:transport.transport01.raiden.network:m1604454460913.12',
user_id: '@0x421231badbc8ae3ac3ba340e8d00a5ca129c6257:transport.transport01.raiden.network',
sender: '@0x421231badbc8ae3ac3ba340e8d00a5ca129c6257:transport.transport01.raiden.network',
room_id: '!itpaijJrCycYDeCyPj:transport.transport01.raiden.network',
origin_server_ts: 1604454460913
},
sender: RoomMember {
roomId: '!itpaijJrCycYDeCyPj:transport.transport01.raiden.network',
userId: '@0x421231badbc8ae3ac3ba340e8d00a5ca129c6257:transport.transport01.raiden.network',
typing: false,
name: '@0x421231badbc8ae3ac3ba340e8d00a5ca129c6257:transport.transport01.raiden.network',
rawDisplayName: '@0x421231badbc8ae3ac3ba340e8d00a5ca129c6257:transport.transport01.raiden.network',
powerLevel: 0,
powerLevelNorm: 0,
user: null,
membership: null,
events: { member: null },
_isOutOfBand: false,
_modified: 1604454221442
},
target: null,
status: 'not_sent',
error: [Circular *1],
forwardLooking: true,
_pushActions: null,
_replacingEvent: null,
_localRedactionEvent: null,
_isCancelled: false,
_clearEvent: {},
_senderCurve25519Key: null,
_claimedEd25519Key: null,
_forwardingCurve25519KeyChain: [],
_untrusted: null,
_decryptionPromise: null,
_retryDecryption: false,
verificationRequest: null,
_txnId: 'm1604454460913.12'
}
}
Ok.. this is a warning, not a fatal error, it's catched and doesn't shutdown the SDK..
BUT it does indicate that there was an issue on transport server, which didn't get this message through, and this is probably why something else failed (e.g. another transfer didn't go through because the PFS's view of the capacities were outdated).
We need both to check on retrying this request (probably with exponentialBackoff), since we global room message sending isn't expected to fail forever, and also to check what else failed because of it and was not properly handled to end up in a 500
error, instead of a 402 or 409 one.
Reading the source, I can see the 500 was caused by Transfer expiration, as mentioned. Transfer expiration needs to be handled as a failure and raise a proper error code. Besides this, it did fail because on the recipient's side, once again, the matrix server misbehaved when we tried to fetch presence for the initiator, in order to request the secret:
2020-11-04T01:47:49.905Z [info] %c action color: #03A9F4; font-weight: bold {
type: 'matrix/presence/failure',
payload: M_UNKNOWN: Internal server error
at parseErrorResponse (/lc/index.js:110946:13)
at /lc/index.js:110901:17
at Request._callback (/lc/index.js:110835:9)
at Request.self.callback (/lc/index.js:166091:22)
at Request.emit (events.js:315:20)
at Request.<anonymous> (/lc/index.js:167060:10)
at Request.emit (events.js:315:20)
at IncomingMessage.<anonymous> (/lc/index.js:166982:12)
at Object.onceWrapper (events.js:421:28)
at IncomingMessage.emit (events.js:327:22) {
errcode: 'M_UNKNOWN',
data: { errcode: 'M_UNKNOWN', error: 'Internal server error' },
httpStatus: 500
},
meta: { address: '0x421231BADBc8aE3aC3bA340E8d00a5ca129c6257' },
error: true
}
/sync error M_UNKNOWN: Internal server error
at parseErrorResponse (/lc/index.js:110946:13)
at /lc/index.js:110901:17
at Request._callback (/lc/index.js:110835:9)
at Request.self.callback (/lc/index.js:166091:22)
at Request.emit (events.js:315:20)
at Request.<anonymous> (/lc/index.js:167060:10)
at Request.emit (events.js:315:20)
at IncomingMessage.<anonymous> (/lc/index.js:166982:12)
at Object.onceWrapper (events.js:421:28)
at IncomingMessage.emit (events.js:327:22) {
errcode: 'M_UNKNOWN',
data: [Object],
httpStatus: 500
}
M_UNKNOWN: Internal server error
at parseErrorResponse (/lc/index.js:110946:13)
at /lc/index.js:110901:17
at Request._callback (/lc/index.js:110835:9)
at Request.self.callback (/lc/index.js:166091:22)
at Request.emit (events.js:315:20)
at Request.<anonymous> (/lc/index.js:167060:10)
at Request.emit (events.js:315:20)
at IncomingMessage.<anonymous> (/lc/index.js:166982:12)
at Object.onceWrapper (events.js:421:28)
at IncomingMessage.emit (events.js:327:22) {
errcode: 'M_UNKNOWN',
data: { errcode: 'M_UNKNOWN', error: 'Internal server error' },
httpStatus: 500
}
Again, as it should, it was not fatal and just emitted the correct matrix/presence/failure
action.
Without presence, we can't communicate with them, we can't request for the secret, nor continue transfer's protocol, therefore it rightfully expired on initiator after enough time. We may check on retrying 500 matrix errors on presence fetch as well, or maybe on every matrix API entrypoints.
Description
stack trace:
scenario_player.exceptions.RESTAPIStatusMismatchError: HTTP status code \"500\" while fetching
http://127.0.0.1:47881/api/v1/payments/0x59105441977ecD9d805A4f5b060E34676F50F806/0x5409D2B144a450C459A86c6352898Ef759EdE198
from: 0, to: 3, amount: 1_000_000_000_000_000, lock_timeout: 30
scenario-player-run_bf4_multi_payments_same_node_2020-11-04T01:39:53.log
node0 and node3 logs: The node logs are empty and dont really reveal much about the internal server error of the the raiden node. bf4_node0_node3_04thNov2020.tar.gz
Acceptance criteria
Tasks