Build out the responder

yahgwai commented 5 years ago

The responder has the job of ensuring that an appointment always makes it to the blockchain.

what is the interface for an IResponder?
what is the first strategy of the responder?
what possible variables are there to the strategy employed?
what it the scope of that strategy, when will it give up?
what further strategies might it employ?
what does it do when it gives up?
how can the responder be tested?
how does the responder behave under edge circumstances?

bigspider commented 5 years ago

Currently, there is one Responder instance for each watcher, which handles all the responses for the same watcher. I think that it is a cleaner design to instantiate a new Responder object every time a response is needed. Moreover, the Responder object should generate events that can be listened externally for logging/accounting purposes or, and for handling of extreme situations (like failure to respond to an event, which should possibly be escalated to the attention of an administrator, but this is business logic that should not be part of the Responder class.

I think this kind of design would also help to support a future system design where the Responder and the Watcher are located on different machines (communicating by message passing or an API rather than function calls).

yahgwai commented 5 years ago

I agree with your 2nd and 3rd points, but I'm not so sure about the 1st. I think this would be a cleaner design if a watcher wanted to instantiate different types of responder for different appointments, but I can't see why this would be the case at the moment, can you see a reason?

One thing you will lose by moving to a 1 event: 1 responder paradigm is an overview of how many responses are currently in flight. This information might be useful as part of the responders strategy. For now though we aren't using this information either. So at the moment I don't care which design we use, but I am curious as to why you think the 1 event : 1 responder paradigm is cleaner.

bigspider commented 5 years ago

I did think of the watcher as an entity that watched the blockchain for multiple types of responders, but indeed, for now the watcher's job is quite lightweight (as Ethereum has events, and even on Bitcoin the processing is quite simple). The reason was mostly trying to map different entities with my brain image of them, and I associated more a Responder to the specific appointment he is working on. But you have a good point that information from the status of some response might me useful for the strategy; I didn't consider that before.

stonecoldpat commented 5 years ago

I assume there would be an Ethereum and Bitcoin responder? And all the responder does is create and sign the desired transaction? ( and enforcing some upper limit on transaction fee / that value is 0?) And the data is just sent to the responder to craft the transaction?

sr-gi commented 5 years ago

@stonecoldpat that's only for Eth, isn't it. In case of Bitcoin the transaction will be already signed so it has only to make sure the transaction eventually ends up in the blockchain. In case we consider the situation with an additional UTXO for the watcher, then fall to CPFP is the deadline is close enough and the output amount can cover the overhead.

bigspider commented 5 years ago

The job of the "responder" is to do everything he can to get the appropriate transaction accepted in the blockchain, handling retries, and so on. There will be different responders for redundancy (in different machines, using different nodes), and so the strategy should be considered appropriately (e.g. to make it unlikely that several responders try to answer at the same time for the same event).

Bumping the gas fee definitely needs to be part of the strategy for ethereum.

bigspider commented 5 years ago

As part of the refactoring, I'm thinking of removing most of the functionality from the IAppointment subclasses, and move it closer to the Responder; essentially, I'd want the appointment classes to be just data classes. In fact, the appointment objects currently provide the getSubmitStateFunction, which is extraneous to the the logic of the appointment itself and that of the watcher, and it is instead closer to the work done by the Responder. Moreover, the responder is the only place that generates a transaction.

There is a fundamental difference in the blockchain access of the watchers and the responders: watchers need read-only access (so no wallet is needed), while responders need to construct and send a transaction. Thus, every responder need to have an active wallet (and watchers and appointment objects should not have access to a wallet); this will be particularly important when Responders are separate services in separate machines.

I still need to think this through a bit, but I plan to try and refactor the code according to this gameplan. Action items that are part of this refactoring (not exhaustive list):

remove getSubmitStateFunction from IAppointment (and figure out a good place to move it);
remove the signer from the Watcher (the provider is enough);
add the wallet to ResponderManager/Responder.

bigspider commented 5 years ago

Here I'm calling "Responder" the object responsible for responding to a single appointment (or potentially a few appointments), as requested by the watcher. Several responders should be associated to the same disputed appointment, with different strategies in order to (1) minimize the dispute costs, but (2) maximize the chance that Pisa addresses the dispute before the deadline (even under attack or other abnormal blockchain conditions).

I'll use this post to keep track of the progress.

what is the interface for an IResponder? I'm using an abstract base class called Responder, that has a single method "startResponse", and two abstract members that any concrete instance needs to implement: submitStateFunction, solely responsible for sending the appropriate transaction to the blockchain, and respond that implements the strategy. Moreover, the Responder generates events to keep track of the state of the response flow. This class should possibly be the base of any Responder (including for Bitcoin).
what is the first strategy of the responder? Periodically trying to sending the transaction at reguar intervals seems a sensible enough strategy; subsequent attempts should try to identify and solve the reason why previous transactions failed (eg.: fee not high enough). First attempts should try to minimize the fees paid, but strategy should be more aggressive if closer to the dispute deadline.
what possible variables are there to the strategy employed? Blockchain fee estimates, availability of the provider (react to non-responding nodes), time since the last block (might imply that node is falling out of sync).
what it the scope of that strategy, when will it give up? The Responder should only give up (raising an alert) if it knows that it will not be able to complete his job. When getting too close to the deadline, it should raise an alert even if still trying.
what further strategies might it employ? As discussed above, it might be sensible to have high throughput Responder implementations that try to use a single wallet to respond to many disputes (potentially concurrently). Responders of this kind have a much higher throughput and thus are likely to be enough in normal circumstances, but they are easier to attack for an adversary that can cause forks in the blockchain, as he could invalidate many transactions at once (by accepting first a later transaction with higher nonce).
what does it do when it gives up? The Responder will generate an appropriate event to be handled by a higher level system (e.g.: hire a responder with a different account / node / etc.)
how can the responder be tested? Unit testing should go a long way to test that a Responder is implementing the strategy correctly. On a higher lever, we'll need integration tests and other kinds of stress tests for the whole ResponseService. On the long run, we should develop/find a suite of tools to mock the blockchain, different nodes, different network conditions, being under attack, etc.
how does the responder behave under edge circumstances? As of now, the edge circumstances that are handled are: (1) node unresponsive => fail; (2) transaction stuck => try again bumping the fee; (3) a re-org kicked out our transaction => try again.

yahgwai commented 5 years ago

Just a comment on the note above. It is an Ethereum consensus rule that transactions be accepted in ascending nonce order, and no nonces can be skipped. Eg if 1 is the current nonce for an account, only transactions of nonce 2 can be accepted, nonce 3 transactions cannot be. This does bring up another problem that if a transaction gets lost it can hold up subsequent transactions.

bigspider commented 5 years ago

Closing this after #114 was merged; I will open separate issues for the work that is still to be done.

PISAresearch / pisa

Build out the responder #93