interledgerjs / ilp-connector

Reference implementation of an Interledger connector.
Other
135 stars 53 forks source link

Potential peering race condition #492

Open uroshercog opened 5 years ago

uroshercog commented 5 years ago

When peering with another connector using a peer relation, there seems to be a race condition, as it occurs at random.

You can reproduce this issue by sending a route control request immediately after sending a packet approving the authentication credentials. The connector then tries to handle the request before it finishes processing the previous packet and adding the connector to the peers list (my assumptions).

Here are the logs

2019-07-22T20:27:19.381Z connector:route-broadcaster:trace debug add peer. accountId=ref3 sendRoutes=true receiveRoutes=true
2019-07-22T20:27:19.381Z connector:route-broadcaster:trace debug reload local and configured routes.
2019-07-22T20:27:19.404Z connector:error-handler-middleware[ref3] debug error in data handler, creating rejection. ilpErrorCode=F00 error=BadRequestError: cannot process route control messages from non-peers.
    at RouteBroadcaster.handleRouteControl (/.../ilp-connector/src/services/route-broadcaster.ts:209:13)
    at CcpController.handleRouteControl (/.../ilp-connector/src/controllers/ccp.ts:40:27)
    at CcpController.handle (/.../ilp-connector/src/controllers/ccp.ts:28:21)
    at PeerProtocolController.handle (/.../ilp-connector/src/controllers/peer-protocol.ts:30:33)
    at IlpPrepareController.sendData (/.../ilp-connector/src/controllers/ilp-prepare.ts:40:42)
    at Core.processData (/.../ilp-connector/src/services/core.ts:41:42)
    at handleData (/.../ilp-connector/src/services/middleware-manager.ts:189:65)
    at dispatch (/.../ilp-connector/src/lib/utils.ts:78:14)
    at next (/.../ilp-connector/src/lib/utils.ts:79:16)
    at method (/.../ilp-connector/src/middlewares/stats.ts:31:32)
matdehaast commented 5 years ago

Hi @uroshercog .

Whilst there could be a race condition here, I don't think this is an issue as the code that calls that will retry sending the Route Control message until it is successful.

uroshercog commented 5 years ago

Hello @matdehaast,

I agree with you that this is a transient issue which gets resolved eventually, I don't think relying on this is the right way to go, because there might be other implementations that act differently. To resolve this issue elegantly without too much code change, the connector should send T00 instead of F00, because the latter should not be retried, according to the spec.

What do you think?

matdehaast commented 5 years ago

Hmmm,

Perhaps, but you can't know ahead of time this is a transient issue as you would not know whether or not the peer will/will not be created at a future date.

The CCP spec hasn't been formally established yet and is on the TODO list. Honestly I would always expect all the implementations to retry sending the route control message as that is the only way to tell your peer to begin sending you route information.

I do understand the issue you are raising though I don't think a T00 is the correct error. @adrianhopebailie @sharafian @emschwartz can maybe also weigh in