Closed LefterisJP closed 5 years ago
Please let me know how I can help here.
Participants: @ulope, @palango, @konradkonrad, @LefterisJP and @rakanalh
The purpose of the meeting was to discuss two topics:
Source routing is an approach where the suggested route is attached to the message. The alternative is to query the PFS at every hop, so that each hop gets a freshly calculated route from the PFS until the target is reached.
Source Routing
Pros:
Cons:
Query PFS
Pros:
Cons:
@konradkonrad shared an opinion about fees:
I assume all fees operate on a minimal cost calculation. Now the problem with a source routing protocol (initiator pays for PFS and passes route along) is: Mediation fees set the incentives for forwarding without querying PFS, they pay for the capital cost of locking tokens/providing capacity and electricity for running the node, but they don’t set incentives for paying PFS as a Mediator. I believe the fees should pay every node for knowing how to best route. Also, there is a second fee incentive+routing issue that I did not bring up in the call: Doing RefundTransfers cannot pay out additional fees (afaict), but they lead to further capital cost/locked tokens. The only reason for doing a RefundTransfer can be to minimize my losses: I already paid capital cost, so refunding lowers the probability of the transfer timing out/expiring entirely. Again, that only works if I expect the network to do optimal routing (source routing isn’t necessarily optimal over the whole path of a transfer). Worst case scenario: Initiators “find out” that the best way for getting a transfer through is “fan out”: do 5 simultaneous transfers and only reveal the secret of the first successful transfer. Therefore I believe we should align the incentives so that mediators’ knowledge about good routes is priced in, i.e. an implementation that does avoid any “Dead end routes” at the cost of more PFS queries…
TL;DR: Participants agree that we should implement the query-at-every-hop approach.
Since implementing this will only introduce additional state changes / events, and provided a proper upgrade mechanism is implemented for this (see #3227 and #3275), the implementation here should be backwards compatible.
The change, as we see it right now, would be to replace the internal Raiden routing module with a set of events / state changes that would eventually provide a list of RouteStates for a given transfer, to be used to forward the transfer.
If the transport message format needs to be changed (unlikely), then the change will become backwards-incompatible unless a "Version handshake" is implemented (unplanned).
@hackaugusto please provide us with your opinion on the notes.
To add to my quote:
Cons:
Calculated route could become unusable during transfer (nodes going offline or out of capacity)
I am afraid that using the source provided route is not in the best interest of all mediators: A) A mediator will need to assess the probability of the provided route to fail/succeed anyway, because this is the probability to gain mediation fees vs having capital locked up. B) wealthy attackers can use source routes to lock/drain certain parts of the network by providing dead-end routes that touch as many routes as possible (mitigation may be possible by enforcing max lengths)
Since implementing this will only introduce additional state changes / events, and provided a proper upgrade mechanism is implemented for this (see #3227 and #3275), the implementation here should be backwards compatible.
Using "query-at-every-hop approach" doesn't need any new state changes or events. The only thing needed is to change the get_best_routes function to query the PFS
To add to what konrad said: I actually would expect mediators to run the bundle, which includes the PFS, and the mediator can simply not charge itself a fee for its own query
Using "query-at-every-hop approach" doesn't need any new state changes or events. The only thing needed is to change the get_best_routes function to query the PFS
This is true, but the request will be a context switch and so has to be taken out of the state machine.
This is true, but the request will be a context switch and so has to be taken out of the state machine.
get_best_routes is outside of the state machine; it's called by the raiden service while the state change is being created.
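A rough sketch of the point made above: the route lookup happens in the service layer, before the state change is created, so the state machine itself never performs the network request. All names here (`RouteRequest`, `routes_for_transfer`, `query_pfs`, `fallback_routes`) are hypothetical. This is not Raiden's actual `get_best_routes` signature, nor the PFS API, and falling back to a local route calculation when the PFS is unreachable is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Address = str
Route = List[Address]


@dataclass(frozen=True)
class RouteRequest:
    """Hypothetical description of what a hop needs a route for."""
    source: Address  # the node asking, i.e. the current hop
    target: Address  # the final target of the transfer
    amount: int      # amount to be transferred


def routes_for_transfer(
    request: RouteRequest,
    query_pfs: Callable[[RouteRequest], Optional[List[Route]]],
    fallback_routes: Callable[[RouteRequest], List[Route]],
) -> List[Route]:
    """Ask the PFS for routes from the current hop to the target.

    Runs outside the state machine; the caller would put the result
    into the state change it is building.
    """
    try:
        routes = query_pfs(request)
    except ConnectionError:
        # Assumption: an unreachable PFS is treated like "no answer".
        routes = None
    if routes:
        return routes
    # Assumption: use a locally calculated route if the PFS gave nothing.
    return fallback_routes(request)
```

With query-at-every-hop, each mediator would call this with itself as `source`, e.g. `routes_for_transfer(RouteRequest("B", "Z", 10), query_pfs, fallback_routes)`, instead of trusting a route attached by the initiator.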
I actually would expect mediators to run the bundle
I guess that depends on the throughput...
@rakanalh @hackaugusto I can prepare a PR for this.
As discussed today @palango, I think it is a good idea for you to start on this, as it will free others for the other problems we are seeing. Will assign you, and if the situation changes we can re-discuss.
B) wealthy attackers can use source routes to lock/drain certain parts of the network by providing dead-end routes that touch as many routes as possible (mitigation may be possible by enforcing max lengths)
Again: source routing allows for amplified lock/drain attacks:
A_provided_route (say B..Y are H hops)
X tokens with F tokens in mediation fee over A_provided_route.
X + F locked tokens, to lock (X + F / 2) * H tokens throughout the network.

There are probably a couple of variations of this attack, some of those could be mitigated by client side offline-checks (i.e. checking for circular paths), but I hope the above illustrates why I am convinced that you cannot trust other nodes' routing information and you will always want to check for yourself for the best route from your mediation hop to the target.
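To make the amplification concrete, here is a small worked calculation of the two quantities named above: the `X + F` tokens the attacker locks, and the `(X + F / 2) * H` tokens locked along the route. The function names are made up for this illustration, and reading `F / 2` as the average fee still attached to the transfer across the `H` hops is an interpretation, not something stated in the thread.

```python
from fractions import Fraction


def attacker_locked(x: int, f: int) -> int:
    """Tokens the attacker has to lock: X + F."""
    return x + f


def network_locked(x: int, f: int, hops: int) -> Fraction:
    """Tokens locked throughout the network: (X + F / 2) * H."""
    return (x + Fraction(f, 2)) * hops


def amplification(x: int, f: int, hops: int) -> Fraction:
    """How many tokens get locked per token the attacker locks."""
    return network_locked(x, f, hops) / attacker_locked(x, f)


# Example: X = 100 tokens, F = 10 tokens in fees, H = 5 hops.
print(attacker_locked(100, 10))       # 110
print(network_locked(100, 10, 5))     # 525
print(float(amplification(100, 10, 5)))
```

For a small fee the factor is roughly `H`, so the attack gets stronger as the provided route gets longer, which is why a maximum route length comes up as a mitigation.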
@heikoheiko We just discussed this during lunch again. While some of the incentivisation issues belong to the discussion of mediation fees there is this one issue with source routing that @konradkonrad laid out above in detail (after mentioning it in https://github.com/raiden-network/raiden/issues/3236#issuecomment-454853988 already).
I guess there is one reasonable mitigation, that would need to be included in the PFS message (edit:) and offchain payment message spec:
edit: this comes with a number of drawbacks in regards to potential partitioning of the network:
you will always want to check for yourself for the best route from your mediation hop to the target
don't think so. this is just one of a bag of possible attacks and malfunctions. mediating nodes will need to do some risk evaluation of transfers in general and then decide on the cost and willingness to mediate a transfer. in above case a quick "is provided path reasonably short" could be one check. note the default strategy for a healthy network is by nodes monitoring their neighbours. e.g. in above case Y would disconnect Z if it is answering pings, but not revealing secrets.
so in brief, yes there are attack vectors and we'll need to deal with them. but imho this doesn't lead to the conclusion, that all mediating nodes always need to check with a PFS if a provided route is the best.
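The offline checks mentioned in this thread ("checking for circular paths", "is provided path reasonably short") could look roughly like the sketch below. The function name, the representation of a route as a list of addresses, and the `max_hops` limit are all assumptions for illustration; none of this is taken from the Raiden codebase.

```python
from typing import Sequence


def is_route_plausible(route: Sequence[str], own_address: str, max_hops: int) -> bool:
    """Cheap local checks a mediator could run on a source-provided route.

    No network access and no PFS query is needed.
    """
    if len(route) < 2:
        return False
    # A node appearing twice means the path is circular.
    if len(set(route)) != len(route):
        return False
    # The mediator itself has to be part of the route.
    if own_address not in route:
        return False
    # Number of hops still to go from this mediator to the target.
    remaining = len(route) - 1 - route.index(own_address)
    return 1 <= remaining <= max_hops


print(is_route_plausible(["A", "B", "C", "Z"], "B", max_hops=5))       # True
print(is_route_plausible(["A", "B", "C", "B", "Z"], "B", max_hops=5))  # False
```

Checks like these only catch the obvious cases. They say nothing about whether the nodes further along the route are online or have capacity, which is the part the PFS query would cover.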
Problem Definition
Related issue in PFS is: https://github.com/raiden-network/raiden-pathfinding-service/issues/85 PFS public interfaces: https://raiden-network-specification.readthedocs.io/en/latest/pathfinding_service.html#public-interfaces
Task
Timeline
As discussed with Rakan, this has high priority for the PFS team, and an implementation by the end of January would be good.