Closed i-norden closed 3 years ago
Thanks for writing this up @i-norden.
I'm a bit hesitant at the moment to introduce these mechanisms directly into the SDK. Rather, I would prefer to see an interface that potentially each streaming/writing mechanism can implement. The implementations of this interface (e.g. RabbitMQ, ZMQ, Redis, file, etc...) can exist outside of the SDK and we can provide out-of-the-box implementations for users. In addition, the SDK can include a common and simple "base" implementation (e.g. file).
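To make the suggestion above concrete, here is a minimal sketch of what such an interface and a simple file-backed "base" implementation might look like. The names `StateSink`, `fileSink`, `newFileSink`, `roundTrip`, and the file name `state_changes.log` are hypothetical and chosen only for illustration; none of them are part of the SDK or of #7888.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// StateSink is a hypothetical interface (not an SDK type) that each
// streaming/writing mechanism (file, Redis, RabbitMQ, ...) could implement.
type StateSink interface {
	io.Writer
	io.Closer
}

// fileSink is a hypothetical "base" implementation that appends to a file.
type fileSink struct {
	f *os.File
}

// newFileSink opens (or creates) the file at path in append mode.
func newFileSink(path string) (*fileSink, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	return &fileSink{f: f}, nil
}

func (s *fileSink) Write(p []byte) (int, error) { return s.f.Write(p) }
func (s *fileSink) Close() error                { return s.f.Close() }

// roundTrip writes payload through a file-backed StateSink in a temporary
// directory and returns what ended up on disk.
func roundTrip(payload []byte) (string, error) {
	dir, err := os.MkdirTemp("", "statesink")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "state_changes.log")
	fs, err := newFileSink(path)
	if err != nil {
		return "", err
	}
	var sink StateSink = fs // callers depend only on the interface
	if _, err := sink.Write(payload); err != nil {
		sink.Close()
		return "", err
	}
	if err := sink.Close(); err != nil {
		return "", err
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	got, err := roundTrip([]byte("store=bank key=k1 value=v1\n"))
	if err != nil {
		panic(err)
	}
	fmt.Print(got)
}
```

Under this sketch, an out-of-tree implementation (e.g. for a message queue) would only need to satisfy the same two methods.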
That makes sense! I'm going to open a new issue with that interface, or should I include that as part of the ADR from #7888?
I would write this all up in a single ADR. I don't think any of these changes are contentious 👍
Since the SDK chose to use gRPC as the de facto RPC layer, an alternative to what @alexanderbez proposes is to hardcode the io.Writer from #7888 as a gRPC server stream. If users want to use Redis/file/RabbitMQ, then they listen on the gRPC service that will be baked into the SDK.
Pros:
Cons:
I like the idea! We'd need to be careful and examine any and all performance implications of having gRPC sit as a proxy. @i-norden do you think the consumer (e.g. RabbitMQ, ZMQ, file, etc...) cares about throughput or performance in general?
Coming from my experience with eth, where it is a struggle to keep pace with the head of the chain while listening to (and indexing) state changes, there is some knee-jerk concern about performance. But I don't think that is relevant here, in large part because features such as these will make listening to state changes a breeze in comparison :)
My gut reaction is that those pros outweigh the overhead of the gRPC middleman, but I admittedly don't have a great feel yet for the performance demands and constraints of cosmos applications, e.g. how many state updates tend to occur per block.
If we remain with the io.Writer interface and provide a concrete gRPC server stream implementation as the standard for backing it, people could still implement a more performant/direct io.Writer if need be.
I think gRPC streaming is useful for general purpose streaming. I do not think it is suitable for caching to a database because it is not fault tolerant. I would prefer to support both paths.
@aaronc apologies, I keep overlooking the persistence needs! I will outline both (file and gRPC) approaches in the proposal. We can also include a teeing io.Writer type that any number of destination io.Writers can be loaded into, allowing us to simultaneously write out to both file and gRPC stream.
Sorry don't need to implement anything for that, will just use io.MultiWriter()
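For reference, a minimal sketch of the io.MultiWriter() approach mentioned above. The two in-memory buffers are stand-ins (assumptions for illustration) for a file writer and a gRPC-stream-backed writer, and `teeWrite` is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// teeWrite writes payload once through io.MultiWriter and returns what each
// of the two destinations received. The buffers are placeholders for a file
// and a gRPC-stream-backed io.Writer.
func teeWrite(payload []byte) (string, string, error) {
	var fileDst, streamDst bytes.Buffer
	w := io.MultiWriter(&fileDst, &streamDst)
	if _, err := w.Write(payload); err != nil {
		return "", "", err
	}
	return fileDst.String(), streamDst.String(), nil
}

func main() {
	a, b, err := teeWrite([]byte("store=bank key=k1 value=v1\n"))
	if err != nil {
		panic(err)
	}
	fmt.Print(a, b)
}
```

Note that io.MultiWriter writes to its destinations sequentially and stops at the first error, so a slow or failing destination affects the ones listed after it.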
I think this issue/proposal can be closed as it has been formalized as part of the ADR-038 specification: https://github.com/cosmos/cosmos-sdk/pull/8012
Summary
State listening external streaming/pub-sub service
Problem Definition
Currently, KVStore data can be remotely accessed through Queries which proceed through Tendermint and the ABCI. In addition to these request/response queries, it would be beneficial to have a means of listening to state changes as they occur by some pub-sub or streaming mechanism.
The changes proposed here are the 2nd step towards achieving that, by exposing the state listeners introduced in #7888 to external consumers.
What problems may be addressed by introducing this feature?
Realtime data availability
What benefits does the SDK stand to gain by including this feature?
Tools for state listening should be beneficial for many cosmos applications, as a fundamental means of improving data availability/accessibility
Are there any disadvantages of including this feature?
If an application developer opts to use these features to expose data, they need to be aware of the ramifications/risks of that data exposure as it pertains to the specifics of their application
Proposal
Question: How should we externally expose KVStore state changes?
In #7888 the idea is that the BaseApp can use io.Writers to listen to specific KVStores. The BaseApp can then pass the io handlers into a streaming/server interface that can route and read out the state changes.
It may be better to use a more concrete type than io.Writer in #7888 since the type will need to support streaming out the data e.g.
I've started to outline some of the potential mechanisms for streaming data out below, hoping to get some feedback on what the first approach should be.
Write to file
Pros:
Cons:
Simple Pub-Sub (gRPC Streaming, WebSockets)
Pros:
Cons:
Pub-Sub using embedded persistent queue
e.g. https://github.com/joncrlsn/dque
Pros:
Cons:
Pub-Sub using persistent Redis queue
e.g. https://github.com/adjust/redismq
Pros:
Cons:
GraphQL on top of Postgres queue (postgraphile) using NOTIFY triggers
Pros:
Cons:
External message service
e.g. Apache Kafka, RabbitMQ, KubeMQ
Pros:
Cons: