anoma / green

https://anoma.github.io/anoma/
MIT License

Decoupling of logic and service components of Engine #793

Closed karbyshev closed 2 months ago

karbyshev commented 3 months ago

Observations:

Problem: Keeping messages of all types in the same queue is problematic, since this may result in a large message queue causing dump messages to be delayed, which may lead to an inconsistent global state.

Solution: To address this issue, in what follows we propose an alternative structure for the Engine. More specifically, an engine should be a supervisor with (at least) the following two subcomponents: (1) a logic genserver and (2) a separate genserver (named the service component) which keeps track of the good states of the logic server. One possible trivial implementation of this proposal is to ensure that, once a message is successfully processed, the logic server sends its updated state to the service component.

Importantly, the service server must satisfy an availability property: it should be safe to assume that it cannot fail (except perhaps due to OOM or other hardware-related issues). To ensure this, the service component should be kept simple, supporting as few operations as possible, e.g., (a) updating the current good state of the logic component and (b) returning the saved state upon request (e.g., a :dump request).

With the proposed structure of the engine, the mailbox of the service component should always remain short, and thus desynchronization with other components and inconsistent global states should not occur.
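The split described above can be sketched in Elixir roughly as follows. This is a minimal, hypothetical illustration of the service component only; the module and function names (`Engine.Service`, `update_state/2`, `dump/1`) are placeholders, not the actual Anoma API:

```elixir
defmodule Engine.Service do
  # Minimal sketch of the proposed service component: a GenServer that
  # only (a) records the latest good state pushed by the logic component
  # and (b) returns it on a :dump request. Because it does nothing else,
  # its mailbox stays short and a :dump is answered promptly even while
  # the logic component is busy.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, nil, opts)

  # Called by the logic component after each successfully processed message.
  def update_state(pid, good_state),
    do: GenServer.cast(pid, {:update_state, good_state})

  # Dumping never touches the (possibly busy) logic component.
  def dump(pid), do: GenServer.call(pid, :dump)

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_cast({:update_state, good_state}, _old), do: {:noreply, good_state}

  @impl true
  def handle_call(:dump, _from, state), do: {:reply, state, state}
end
```

Under the proposal, both this server and the logic genserver would sit under one supervisor per engine; the sketch omits the supervisor and the logic side for brevity.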

Apart from state dumping, the service engine could also be applied to solve the stop-the-world problem, which concerns the synchronous suspension of all work on all components in order to reach a consistent state (i.e., a state reachable from an initial state of the node). More specifically, the service component could save not just a single good state but a bounded number of good states indexed by an abstract time stamp (e.g., a chain height or something similar). To obtain a consistent global state, the dumper could then query every component for the state indexed by a given time stamp.
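The timestamp-indexed variant could look like the following sketch (again hypothetical names; the bound `@max_states` and the integer timestamps are assumptions for illustration):

```elixir
defmodule Engine.Service.Indexed do
  # Sketch of a service component that retains a bounded history of good
  # states, keyed by an abstract timestamp (e.g. a chain height), so a
  # dumper can ask every component for the state at the *same* timestamp.
  use GenServer

  @max_states 16  # assumed bound on retained good states

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, %{}, opts)

  def update_state(pid, ts, good_state),
    do: GenServer.cast(pid, {:update_state, ts, good_state})

  # Returns {:ok, state} for a known timestamp, :error otherwise.
  def dump(pid, ts), do: GenServer.call(pid, {:dump, ts})

  @impl true
  def init(states), do: {:ok, states}

  @impl true
  def handle_cast({:update_state, ts, good}, states) do
    states = Map.put(states, ts, good)
    # Evict the oldest entry once the bound is exceeded.
    states =
      if map_size(states) > @max_states do
        {oldest, _} = Enum.min_by(states, fn {k, _} -> k end)
        Map.delete(states, oldest)
      else
        states
      end

    {:noreply, states}
  end

  @impl true
  def handle_call({:dump, ts}, _from, states),
    do: {:reply, Map.fetch(states, ts), states}
end
```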

agureev commented 3 months ago

If I understand everything correctly, this does not fully solve the issue of desynchronization (I outline my question regarding this below), but it is definitely an improvement on the current snapshotting design! I outline some additional proposals below.

Proposals:

  1. State updating for the second agent should be done synchronously via calls, so that its mailbox is always minimal, i.e., there will never be a queue of messages other than [service_message, update_state_call].
  2. The PIDs of these second agents should be kept not only in the engines themselves but also in some agent which does not usually receive messages, so that we can, without going through a queue, send it a service message which then gets delivered to the backup-state agent. My proposal is that the Node should contain all this info.
  3. I do not see much of an advantage in using the backup agents for snapshotting over the current dumping mechanism, which produces files with all the binary info. The reason is that such snapshotting would also need to synchronize the table info from mnesia. This means that to actually use the agents for snapshotting, we would need them to dump tables at the same time and store them in some format in their extra field. This introduces extra logic into the agents, making them less robust. I would say it is beneficial to keep them more robust and leave actual snapshotting to the usual dumping mechanism, since (a) either way we need such a mechanism to store info about a state offline, and (b) this decreases the info stored inside the agents.
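Proposal 1 above (synchronous updates keeping the backup mailbox minimal) can be sketched by making the update a `GenServer.call` rather than a cast, so the logic component blocks until the state is recorded and at most one update is ever in flight. Names are again hypothetical:

```elixir
defmodule Engine.Service.Sync do
  # Variant of the service component where state updates are synchronous:
  # the caller (the logic component) blocks until the new good state is
  # recorded, so the service mailbox can never accumulate a backlog of
  # pending update messages.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, nil, opts)

  # Synchronous update: returns :ok only once the state is stored.
  def update_state(pid, good_state),
    do: GenServer.call(pid, {:update_state, good_state})

  def dump(pid), do: GenServer.call(pid, :dump)

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:update_state, good}, _from, _old), do: {:reply, :ok, good}
  def handle_call(:dump, _from, state), do: {:reply, state, state}
end
```

The trade-off is that the logic component now waits on the backup agent for every processed message, which is the price paid for the bounded mailbox.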

Questions:

Regarding desync. Now we cannot send messages to different actors at the same exact time. Messages will probably be sent using some Enum.map on the list of backup agents. So suppose we have a list of such agents with a sublist [mempool_backup, storage_backup]

Suppose we send the mempool_backup a :dump message through e.g. the Node. Then at the same time somebody sends a message to the Mempool (since we decouple service and protocol messages this should be possible) to start a new transaction worker. Right afterwards, a message to storage_backup is sent. Now the real question is, could something like this happen:

The :dump message gets received by the mempool backup agent before the state change caused by calling the transaction. Then the transaction tx gets written with its order by the storage during this process (I may be wrong on this account, as the new scry is not yet finished, but this is similar to what we have right now), so the appropriate info gets written to the appropriate mnesia table. The mnesia table gets updated "before" the :dump message gets received by the storage backup agent, so the table that it dumps will be an updated one.

If all of the above is possible, then we will have a mempool state where a transaction tx is not present in its field of transactions that have not yet been committed to a block, while the table used for the snapshot will already have that transaction written in it. Desync.

Possibly the scry will have mechanisms to avoid this, but I assume that if the :dump message can theoretically be received as slowly as described above, then any call or cast between Engines that happens in the middle of someone else's handle_call or handle_cast can cause such desync.

If such slow processing of service messages is impossible, then that's perfect, but I do not know enough about the possible hardware limitations here to know that for sure. @juped @mariari I would be grateful if you by chance know whether the situation described above is feasible.

karbyshev commented 3 months ago

Now we cannot send messages to different actors at the same exact time.

Correct, but this is not a problem.

Suppose we send the mempool_backup a :dump message through e.g. the Node. Then at the same time somebody sends a message to the Mempool...

Not a problem, since the :dump message will contain a timestamp of interest.

agureev commented 3 months ago

Just to make sure: @karbyshev here refers to the model in which there is not a centralized message sent from one engine to all other engines using something like Enum.map(engines(), fn x -> send(x, :dump) end), but instead the timestamp is sent transitively in each new message some engine sends (based on our discussion; correct me if I am wrong here, @karbyshev)

this relies on some engine being the starting point of all state-changing interactions which causally affect the entire engine network

so the engine format will change in this case from Router.cast/call(addr, msg) to Router.cast/call(addr, msg, timestamp) or something similar. However, in this scenario I am then unsure why we need backup agents.

karbyshev commented 3 months ago

However in this scenario I am then unsure why we need backup agents.

You can view them as a distributed logger.

agureev commented 3 months ago

And what would be the advantages of a distributed logger over a non-distributed one?

karbyshev commented 3 months ago

Distributedness, and the assignment of every event (state change) to a meaningful time stamp.

agureev commented 3 months ago

But if every state-changing event originates from a single source, per the hypothesis, then all state-changing messages could just as well be sent to one single logger with appropriate timestamps, is that not the case?

mariari commented 3 months ago

Let's discuss this on Tuesday if possible

mariari commented 2 months ago

I felt like we had a discussion about this particular issue in an architecture meeting.

I believe that with how the system currently runs after the new refactoring, this is less of an issue.

However there are valid concerns here.

@karbyshev do you mind making this a forum topic on the research forums?

I will close this issue as it's not actionable, but we should have further discussions on the forums about this topic.