spacemeshos / SMIPS

Spacemesh Improvement Proposals
https://spacemesh.io
Creative Commons Zero v1.0 Universal

SMIP: go-spacemesh API implementation #21

Open lrettig opened 4 years ago

lrettig commented 4 years ago

go-spacemesh API implementation

Overview

We have a robust data model design (#13), and an API design based on it (https://github.com/spacemeshos/api/). We need go-spacemesh to expose the data in the API to several classes of clients, including:

Goals and Motivation

Design

Benefits of gRPC

gRPC allows us to achieve all of these goals, in addition to having the following other niceties:

Downsides to/limitations of gRPC

By default gRPC has a maximum message size limit of 4 MB, but this can be increased pretty easily. (We already ran into this once.) I don't foresee any major design or implementation challenges as a result.
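
As a minimal sketch, this is how the limit can be raised in grpc-go when constructing the server (the 10 MB figure below is purely illustrative, not a recommendation):

```go
package api

import "google.golang.org/grpc"

// newServer builds a gRPC server with the default 4 MB receive limit raised
// in both directions. The 10 MB value is illustrative only.
func newServer() *grpc.Server {
	const maxMsgSize = 10 * 1024 * 1024 // 10 MB
	return grpc.NewServer(
		grpc.MaxRecvMsgSize(maxMsgSize), // inbound messages
		grpc.MaxSendMsgSize(maxMsgSize), // outbound messages
	)
}
```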

In theory, the RPC design pattern requires tighter coupling between the client and the server than pubsub (which is very loosely coupled: the publisher doesn't even need to know of the subscriber's existence). In practice I don't think this will be an issue for us, since we can use the existing pubsub-based events framework for all events, delivering them to pubsub subscribers and/or to the API streams transparently.

Proposed Implementation

gRPC vs. existing pubsub framework

pubsub is a low-level message-passing protocol that allows a set of events, such as "new block", "block valid", "new ATX", "reward received", "created block", etc., to be broadcast to any number of subscribers. It's currently being used in a multi-node test that allows many node instances to share data very rapidly in order to simulate a network in fast-forward. It could hypothetically be used to pass these same events to downstream clients, e.g., for analytics or for a block explorer.
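
To make the comparison concrete, here is a minimal in-process pubsub bus of the kind described above. This is an illustration of the pattern only, not the actual go-spacemesh events package:

```go
package pubsub

import "sync"

// Topic identifies a class of events, e.g. "new block" or "new ATX".
type Topic string

// Bus is a minimal in-process pubsub bus.
type Bus struct {
	mu   sync.RWMutex
	subs map[Topic][]chan []byte
}

// NewBus creates an empty bus.
func NewBus() *Bus { return &Bus{subs: make(map[Topic][]chan []byte)} }

// Subscribe registers interest in a topic and returns a delivery channel.
func (b *Bus) Subscribe(t Topic) <-chan []byte {
	ch := make(chan []byte, 16)
	b.mu.Lock()
	b.subs[t] = append(b.subs[t], ch)
	b.mu.Unlock()
	return ch
}

// Publish broadcasts msg to all current subscribers of t; the publisher
// doesn't need to know whether anyone is listening.
func (b *Bus) Publish(t Topic, msg []byte) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	for _, ch := range b.subs[t] {
		select {
		case ch <- msg: // delivered
		default: // drop rather than block on a slow subscriber
		}
	}
}
```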

However, being a low-level protocol, pubsub is missing a lot of the features that we get for free with gRPC, so this sort of use case would require considerable additional effort: developing SDKs/connectors for the clients, handling type conversions, clearly defining the protocol, load balancing, and encryption. Also, pubsub would not support certain required use cases well, such as web/mobile clients.

Finally, many of our API endpoints are in fact remote procedure calls: they take arguments, cause the backend to perform some action, and return some value. This use case is not natively supported in pubsub.
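
For example, a balance query is a natural unary RPC: argument in, backend work, value out. The sketch below assumes a generated pb package and illustrative type names; these are stand-ins, not the actual spacemeshos/api definitions:

```go
import (
	"context"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// GetBalance takes an argument, performs a backend lookup, and returns a
// value: the classic RPC shape that pubsub doesn't natively support.
// pb.AccountRequest/pb.AccountResponse are assumed generated types.
func (s *globalStateService) GetBalance(ctx context.Context, req *pb.AccountRequest) (*pb.AccountResponse, error) {
	balance, err := s.state.GetBalance(req.AccountId)
	if err != nil {
		return nil, status.Errorf(codes.NotFound, "account: %v", err)
	}
	return &pb.AccountResponse{Balance: balance}, nil
}
```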

We can pretty easily implement and emulate all of the features of pubsub using gRPC streams—in fact, this design work is already done.
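
As a sketch of that emulation, a server-streaming handler can simply forward an internal subscription to the client until it disconnects. All names here are illustrative assumptions, not the finished design:

```go
// EventStream bridges an internal subscription onto a gRPC server stream.
// pb.Events_EventStreamServer is an assumed generated stream interface, and
// s.subscribe is an assumed hook into the internal event source.
func (s *service) EventStream(req *pb.EventStreamRequest, stream pb.Events_EventStreamServer) error {
	events, unsubscribe := s.subscribe(req.Topics)
	defer unsubscribe()
	for {
		select {
		case ev := <-events:
			if err := stream.Send(ev); err != nil {
				return err // send failed, client likely gone
			}
		case <-stream.Context().Done():
			return nil // client canceled or disconnected
		}
	}
}
```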

Implementation plan

See https://github.com/spacemeshos/go-spacemesh/issues/1764

Dependencies and Interactions

Dependencies:

Interactions:

Stakeholders and Reviewers

Testing and Performance

Testing: Existing API tests will be rewritten and expanded to work with the new API code. New tests will be written for any new functionality added, e.g., gRPC streams.

Performance: We may want to do some profiling/performance/stress tests to make sure that the new API code, especially events/streams, does not negatively impact go-spacemesh performance. Per @antonlerner, we should also test how many simultaneous connections gRPC supports.
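
One way to approach the connection-count question is a simple Go test that dials the node n times concurrently. A rough sketch (the address, n, and the stream under test are placeholders):

```go
package api_test

import (
	"context"
	"sync"
	"testing"
	"time"

	"google.golang.org/grpc"
)

// Opens n simultaneous connections against a running node and fails if any
// cannot be established. A fuller test would also open the stream under
// test on each connection and hold it open while measuring node load.
func TestConcurrentConnections(t *testing.T) {
	const n = 1000 // tune against the target we want to support
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
			defer cancel()
			conn, err := grpc.DialContext(ctx, "localhost:9092", grpc.WithInsecure(), grpc.WithBlock())
			if err != nil {
				t.Error(err) // refused or timed out under load
				return
			}
			defer conn.Close()
		}()
	}
	wg.Wait()
}
```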

antonlerner commented 4 years ago

I think we should separate the two purposes of the API. The first is to provide an API to node functions such as GetBalance, ChangeRewardAddress, etc. I agree this could be changed to use gRPC.

The other functionality discussed here is the "events" functionality. This will probably not be used by end users who want to connect to the node; it will serve apps such as a block explorer and dashboard. IMO this is a separate requirement and should be designed a bit differently. I think it is best to implement the events framework using pubsub, because it is more robust and allows more flexibility than gRPC streams in terms of subscribing and unsubscribing to different topics on the fly. Pubsub can be used internally in the same process as well as externally from other processes, and @ilans has mentioned he wants to incorporate such a framework into the node anyway. Also, as I understand it, there will be a stream in the API that serializes either all or several events into one stream; IMO this will require more development effort, in the sense that all these events must now be serialized from different parts of the code again. Last, I think it's important to see how many simultaneous connections the gRPC stream can support and what the software bottlenecks of this solution are, and to compare them to the current implementation.

Having said that, I'd be happy to discuss and see if we can get the same level of robustness and flexibility using gRPC streams, if you indeed think it's better to implement events that way @lrettig @avive

antonlerner commented 4 years ago

Also note that as part of this change, I think we should also address the local testnet events issue. Currently, the local testnet relies on logs to monitor the network and print network status; this can and should be changed to get the data using the correct data endpoint.

lrettig commented 4 years ago

@antonlerner thanks for taking a look and for the thorough reply! Your timing is great :) I'm working on the non-stream API endpoints for now, and haven't begun implementation of the streams yet.

To respond to a few of your points:

The other functionality discussed here is the "events" functionality. This will probably not be used by end users who want to connect to the node; it will serve apps such as a block explorer and dashboard. IMO this is a separate requirement and should be designed a bit differently.

While I agree there's an important distinction between "one-off" endpoints and the streams, I'm not entirely sure the streams won't be used by end users. E.g., I'm pretty sure that @avive and @IlyaVi plan to subscribe to events in the wallet and use this to display account-related events to the user, e.g., incoming transactions and rewards. This may be cleaner and easier than polling the node, from a design perspective. I think @avive has stronger thoughts on this.

I think it is best to implement the events framework using pubsub, because it is more robust and allows more flexibility than gRPC streams in terms of subscribing and unsubscribing to different topics on the fly.

Curious to hear more about why you feel that pubsub is more robust and makes it easier to subscribe and unsubscribe from different topics.

Also, AFAICT the two are not necessarily mutually exclusive - I think we could have the same set of events exposed using the existing pubsub framework, or gRPC streams, or both (modulo questions about serialization and multiplexing, as you point out). I haven't gotten deep enough into the implementation yet to know with confidence.

it's important to see how many simultaneous connections the gRPC stream can support and what the software bottlenecks of this solution are, and to compare them to the current implementation

Totally agree. Would appreciate your advice on how to test these!

Currently, the local testnet relies on logs to monitor the network and print network status; this can and should be changed to get the data using the correct data endpoint

Another good point - would love to hear thoughts from @ilans on this. Does the API design as it stands contain the correct endpoints? And is there a preferred protocol for consuming these data?

antonlerner commented 4 years ago

While I agree there's an important distinction between "one-off" endpoints and the streams, I'm not entirely sure the streams won't be used by end users. E.g., I'm pretty sure that @avive and @IlyaVi plan to subscribe to events in the wallet and use this to display account-related events to the user, e.g., incoming transactions and rewards. This may be cleaner and easier than polling the node, from a design perspective. I think @avive has stronger thoughts on this.

It all depends on how it's implemented: if events are raised only while the node is running, then in order to get all the data you'd need to restart the node and sync from genesis.

Curious to hear more about why you feel that pubsub is more robust and makes it easier to subscribe and unsubscribe from different topics.

As I understand it, the gRPC stream will give you all the data in the mesh in a single stream, so you can't get only part of the data without deserializing it first. The way around this is to create another endpoint for each data type and/or identity type (i.e., account, node, etc.). Pubsub is more robust in the sense that each of the identities and data types can be made a topic, allowing much more flexible querying and filtering.
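
To make the two options concrete, a per-request filter can play the same role as a pubsub topic: the backend only sends matching events, so clients never have to deserialize a firehose of everything in the mesh. A hypothetical sketch (none of these types are the actual spacemeshos/api definitions):

```go
// Event is a stand-in for a single mesh event on the wire.
type Event struct {
	Topic string // e.g. "tx", "reward", "atx"
	Data  []byte
}

// EventFilter narrows a stream the way a pubsub topic subscription would.
type EventFilter struct {
	Topics []string // empty means all topics
}

// matches reports whether ev should be sent to a client using filter f.
func (f EventFilter) matches(ev Event) bool {
	if len(f.Topics) == 0 {
		return true
	}
	for _, t := range f.Topics {
		if ev.Topic == t {
			return true
		}
	}
	return false
}
```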

Also, AFAICT the two are not necessarily mutually exclusive - I think we could have the same set of events exposed using the existing pubsub framework, or gRPC streams, or both (modulo questions about serialization and multiplexing, as you point out). I haven't gotten deep enough into the implementation yet to know with confidence.

Can we map out all the uses for the streams we know of? This will help us understand how many endpoints we will need to support and what the equivalent effort (topic selection) would be on our pubsub. I think this will also help us choose between the two.

Totally agree. Would appreciate your advice on how to test these!

We can read the gRPC stream code... Also, mapping the required endpoints has another advantage: it will tell us how many streams will be simultaneously open and active when querying the node.

avive commented 4 years ago

We designed the API around services and three clients - e.g., the node, mesh, global-state, and transactions services - we did the big code review of the API around those services, and we've implemented all review suggestions. I feel that we have a good design and I see little reason to separate the facets differently. Everything is mapped out in the current gRPC service definitions of these services, so I don't understand the ask for mapping things out. All clients use different kinds of methods to get what they want: current data, streams for future data, and queries for historical data. The wallets definitely need streams so they can stop polling the node in a loop, as smapp does today, which is bad and very wasteful. Also, streams do not give you all the data in the mesh in a single stream - what they return depends on what they were defined to return, based on the user's input filters.

antonlerner commented 4 years ago

How will one subscribe to new data from the stream? Also, was this review done with @ilans? He also wants to have certain probes inside nodes and to receive some data/events from them.

lrettig commented 4 years ago

Quick update here: I've begun implementing streams (https://github.com/spacemeshos/go-spacemesh/pull/2061). I created a new singleton struct that basically just stores a list of channels, one per data type that we care about. In the places where events.Publish() is currently called, I'm adding a second call to publish the data element onto the appropriate channel. The gRPC endpoint backends listen to the channels they care about (which can be specified by a Filter that's passed in).
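
Roughly, the shape is as follows; all names here are illustrative, not the actual code in the PR:

```go
package events

// Placeholder types standing in for the real go-spacemesh data types.
type (
	Transaction struct{ ID string }
	ATX         struct{ ID string }
	Reward      struct{ Amount uint64 }
)

// Streamer is the singleton described above: one channel per data type
// that the gRPC stream backends care about.
type Streamer struct {
	NewTx     chan Transaction
	NewATX    chan ATX
	NewReward chan Reward
}

var streamer *Streamer

// InitStreamer sets up the channels; they're buffered so that publishers
// (the existing events.Publish() call sites) don't usually block.
func InitStreamer() {
	streamer = &Streamer{
		NewTx:     make(chan Transaction, 100),
		NewATX:    make(chan ATX, 100),
		NewReward: make(chan Reward, 100),
	}
}

// ReportNewTx is the second call added alongside events.Publish();
// note it blocks if the buffer fills, which a real implementation
// might handle by dropping instead.
func ReportNewTx(tx Transaction) {
	if streamer != nil {
		streamer.NewTx <- tx
	}
}
```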

I considered integrating into the existing events/pubsub framework but did not for several reasons:

To be clear, I'm talking specifically about how the API backend is implemented internally, not about how data is collected/published externally. I haven't touched the existing pubsub code, and it's likely this API code will be totally orthogonal to it.