bitshares / bitshares-core

BitShares Blockchain node and command-line wallet
https://bitshares.github.io/

REST API discussion #870

Open abitmore opened 6 years ago

abitmore commented 6 years ago

This ticket is for discussing adding REST API support.

How EOS is doing its REST API is described by @Zapata in https://github.com/bitshares/bitshares-core/issues/792#issuecomment-383532647 and https://github.com/bitshares/bitshares-core/issues/792#issuecomment-383419223:

Note that EOS has also moved from JSON-RPC to a REST-based API (see doc).

Exposing BitShares in a REST API style might be nice in the long term, but I don't think it's worth the effort for the "API Documentation" subject. It would require exposing both styles for a smooth migration, as all the clients need to be updated. For reference, EOS has an http_plugin, which is used in this way.

Here is another REST API project: https://github.com/bitshares/Bitshares-HUG-REST-API

clockworkgr commented 6 years ago

Also note my response here: https://github.com/bitshares/bitshares-core/issues/792#issuecomment-383548129

I am also currently trying to get in touch with the loopback team (https://github.com/strongloop/loopback) to pick their brains about how well suited loopback would be to the scenario I describe in the comment above.

Zapata commented 6 years ago

@clockworkgr I agree with your comment. Stellar has Core (consensus with in-memory data) + DB (PostgreSQL with history data) + Horizon (HTTP API), see the doc. Steem is doing the same with Hivemind (currently for read-only APIs only).

For read-only access, an alternative approach is GraphQL, which lets the client shape the API it requires (C++ lib). But IMO this kind of thing should live in a separate process from the core.
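As an illustration of "the client shapes the API": a GraphQL query lists exactly the fields the client wants. The endpoint URL and the schema (account, balances) in this sketch are hypothetical, since no such service exists in core:

// Hypothetical GraphQL endpoint and schema, for illustration only.
// Uses the global fetch available in Node 18+.
async function fetchAccount(name) {
  const query = `{
    account(name: "${name}") {
      id
      name
      balances { asset amount }
    }
  }`;
  const res = await fetch('http://localhost:4000/graphql', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query }),
  });
  return (await res.json()).data.account; // only the requested fields come back
}

fetchAccount('clockwork').then(console.log);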

grctest commented 6 years ago

@abitmore Great to see that my HUG REST API for BTS was forked into the Bitshares org. However, in its current state it solely provides read-only, block-explorer-style functionality (as that's all I needed for the Beyond Bitshares Google bot). It's entirely possible to expose the non-read-only python-bitshares functionality through the HUG REST API as optional components, but additional security considerations would be necessary. There are also some lines (regarding API keys) which are unnecessary for public nodes and could be removed.

I was thinking recently that running a private Node.js cloud function would be better than external queries to HUG (given the extra Firebase charges for external queries), though it's a lot of work and cloud functions on Firebase run an old Node.js version. If loopback isn't suitable, there are several other frameworks available.

In terms of Python REST API framework performance, HUG is slightly slower than Falcon (HUG is built on Falcon) and Pycnic, though the difference is pretty much negligible:

(benchmark image comparing Python REST framework performance)

There are more web frameworks with serious potential that are currently not yet production ready, such as Sanic (sanic-restplus, restic & sanic_crud) and Japronto, which could be worth running a REST API on top of.


If we're talking about migrating from RPC to REST, does anyone have any experience with gRPC, Cap'n Proto or Apache Thrift? Perhaps they could be beneficial for improving BitShares' existing RPC functionality/performance?

clockworkgr commented 6 years ago

Loopback was just a thought because I've worked with it in the past and like it as a project. Ideally I want to discuss the feasibility of using loopback for that purpose with their team, as I'm not an lb guru :)

abitmore commented 6 years ago

@grctest thanks for the info. IMHO we're not going to migrate to REST right away, but perhaps we'll add REST support as a first step.

BTW I meant to link your repository directly, but accidentally linked to the fork. Is that OK with you?

grctest commented 6 years ago

@abitmore Sorry for the late reply, the fork link is fine.

I don't think HUG could be integrated directly into bitshares-core; it's more suitable as an external solution. It does require Nginx and Gunicorn to run.

clockworkgr commented 6 years ago

I've been thinking about how best to tackle this, and although designing API architecture is not my strongest point, I have a few thoughts I wanted to share here.

Consider an Express (or similar) based app to implement the REST endpoints. All requests go through a caching middleware (utilising Redis, for example): if a cache entry is found, it is returned immediately; if not, the app executes the relevant JSON-RPC call to the node (or several calls, if combining data from multiple calls to provide more complete endpoints) and caches the result.

For example, calling /account/clockwork will first check whether there is a cache entry for it (and return it if so), or else make the get_account_by_name call to the node, cache the result and then return it.
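A minimal sketch of that idea, assuming the node accepts JSON-RPC over plain HTTP on port 8090; the endpoint URL and the rpcCall helper are assumptions for illustration, and a real implementation would more likely hold a persistent websocket connection to witness_node:

const express = require('express');
const { createClient } = require('redis');

const NODE_RPC_URL = 'http://localhost:8090/rpc'; // assumed HTTP JSON-RPC endpoint
const app = express();
const cache = createClient();
cache.connect(); // connect in the background; fine for a sketch

// Hypothetical helper: forward a single JSON-RPC call to the node.
// Uses the global fetch available in Node 18+.
async function rpcCall(method, params) {
  const res = await fetch(NODE_RPC_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ jsonrpc: '2.0', id: 1, method, params }),
  });
  return (await res.json()).result;
}

// Caching middleware: return the Redis entry if present, otherwise let
// the route handler fetch from the node and store the result.
async function cached(req, res, next) {
  const hit = await cache.get(req.originalUrl);
  if (hit) return res.json(JSON.parse(hit));
  res.locals.cacheKey = req.originalUrl;
  next();
}

app.get('/account/:name', cached, async (req, res) => {
  const account = await rpcCall('get_account_by_name', [req.params.name]);
  await cache.set(res.locals.cacheKey, JSON.stringify(account));
  res.json(account);
});

app.listen(3000);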

At the same time, this Express app has also subscribed to new blocks in the background. Whenever a block comes in, we go through the ops in it and, based on those operations and a set of rules we will have to define, decide which cache entries need to be deleted/invalidated. This ensures that the REST API always returns correct data and caches whatever can be cached.
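A rough sketch of what such a rule table could look like; the operation names and per-operation cache keys are made up for illustration (on chain, operations are identified by numeric ids and carry account ids rather than names, and a complete rule set would be far larger):

const { createClient } = require('redis');
const cache = createClient();
cache.connect();

// Hypothetical rules: map an operation to the cache keys it invalidates.
const invalidationRules = {
  transfer: (op) => [`/account/${op.from}`, `/account/${op.to}`],
  account_update: (op) => [`/account/${op.account}`],
};

// Called for every new block received from the background subscription.
async function onNewBlock(block) {
  for (const tx of block.transactions) {
    for (const [opType, op] of tx.operations) {
      const rule = invalidationRules[opType];
      if (rule) await cache.del(rule(op));
    }
  }
}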

As far as subscriptions are concerned, I was thinking that we could make use of Server-Sent Events and the EventSource API. That way, for example in a web browser, we could do something like:

// Subscribe to server-sent updates for a single account.
var subscriptionURL = '/account/clockwork/subscribe';
var eventsrc = new EventSource(subscriptionURL);
// Listen for named 'data' events; the server must emit "event: data" frames.
eventsrc.addEventListener('data', function(msg) {
  var update = JSON.parse(msg.data);
  console.log(update);
});

where that endpoint simply returns a unique URL that the client declares as an EventSource and receives updates for that object.

Of course, we can easily have many clients subscribing to the same objects using only a single subscription between the Express app and witness_node.
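On the server side, the fan-out might look something like the sketch below: one shared subscriber set per object, with the node-side subscription wiring omitted. The "event: data" frame matches the addEventListener('data', ...) call in the browser snippet above.

const express = require('express');
const app = express();

const subscribers = new Map(); // account name -> Set of open SSE responses

app.get('/account/:name/subscribe', (req, res) => {
  res.set({
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  });
  res.flushHeaders();
  const key = req.params.name;
  if (!subscribers.has(key)) subscribers.set(key, new Set());
  subscribers.get(key).add(res); // first client for a key would also trigger the node subscription
  req.on('close', () => subscribers.get(key).delete(res));
});

// Called whenever the node pushes an update for a subscribed object.
function broadcast(key, update) {
  for (const res of subscribers.get(key) ?? []) {
    res.write(`event: data\ndata: ${JSON.stringify(update)}\n\n`);
  }
}

app.listen(3000);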

It goes without saying that a setup like the above would allow for easier addition and implementation of new API calls, cool features like rate limiting, access control etc., and of course much larger scaling potential.

Thoughts?

abitmore commented 6 years ago

@clockworkgr caching can be a PITA.

Whenever a block comes in, we go through the ops in it and, based on those operations and a set of rules we will have to define, decide which cache entries need to be deleted/invalidated.

This is hard, unless the node pushes all virtual operations (e.g. order filling / feed expiration) and not only the blocks. We would also need virtual operations for everything that causes a change in data; e.g. the vesting object updates in every block due to witness pay. Basically, the "set of rules" is the whole chain logic, or in other words, the consensus.

BTW Steem has done quite some progress with this approach.

Alternatively, the middleware can subscribe to changes on objects (rather than operations) to decide which parts of the cached data need to be refreshed. This means re-implementing the object database in the middleware, or re-implementing all API logic there.
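A rough sketch of that object-change subscription, assuming a direct websocket connection to an open node where API 0 is the database API; the message shapes follow the BitShares websocket RPC convention, but the exact ids and nesting should be treated as illustrative:

const WebSocket = require('ws');
const ws = new WebSocket('ws://localhost:8090/ws');

ws.on('open', () => {
  // Ask the database API to push object changes to callback id 1.
  ws.send(JSON.stringify({
    id: 1,
    method: 'call',
    params: [0, 'set_subscribe_callback', [1, false]],
  }));
});

ws.on('message', (raw) => {
  const msg = JSON.parse(raw);
  if (msg.method === 'notice') {
    // Notices carry changed objects (or bare ids for removals); map each
    // one to the cached data that should be refreshed.
    for (const item of msg.params[1].flat()) {
      const id = typeof item === 'string' ? item : item && item.id;
      if (id) console.log('refresh cache entries derived from object', id);
    }
  }
});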

A simpler approach is to invalidate the cache every 3 seconds. This guarantees few caching issues, but with a smaller performance gain.
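With Redis, the time-based variant is nearly free: store each entry with a 3-second TTL (one block interval) so it expires on its own, for example as a drop-in for the route handler in the earlier sketch:

// node-redis v4: the EX option sets the expiry in seconds.
async function cacheWithTtl(cache, key, value) {
  await cache.set(key, JSON.stringify(value), { EX: 3 });
}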

clockworkgr commented 6 years ago

Maybe a combination of both?

So keep the entries whose invalidation rules are simpler and invalidate those based on block content, and invalidate the more complex ones every 3 secs.

However, even with the 3-second cache and no other optimisation, if we're talking big scaling it will make a hell of a lot of difference.

abitmore commented 6 years ago

I agree that a 3-second cache would improve performance a lot for big scaling, but it wouldn't be significant for small scaling. Knuth's "premature optimization" rule may apply here. Perhaps take a look at something like Apache's mod_cache or Nginx's proxy caching? I don't know whether it would work for websockets.

Combination would be fine.

By the way, there is an API caching mechanism in the node, currently disabled due to bugs. It was designed to reduce database queries.

clockworkgr commented 6 years ago

Knuth's rule would apply if it were detrimental to the work done in core. Seeing as this would be a separate effort (ideally by separate people), I don't think it really applies :)

Also, I think (judging from our telegram chat) that we agree it will be needed at some point. So might as well get a head start.

Nginx and equivalents simply tunnel websocket requests, so mod_cache and the like wouldn't work; they have no visibility into the content of the ws connection.

clockworkgr commented 6 years ago

Interestingly, while researching the SSE suggestion for update subscriptions I mentioned, combined with the suggested use of Redis as a cache, I came across this:

https://github.com/toverux/expresse#ssehub-middleware

Utilising: https://redis.io/topics/pubsub

Between these, there seem to be enough bits and pieces for a PoC.
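For instance, the glue between the two could be a pattern subscription: updates published on per-account Redis channels get fanned out to SSE clients via the broadcast() sketch earlier. The channel naming here is hypothetical:

const { createClient } = require('redis');

async function listen(broadcast) {
  const sub = createClient();
  await sub.connect();
  // One pattern subscription covers every per-account channel.
  await sub.pSubscribe('account.*', (message, channel) => {
    const name = channel.split('.')[1];
    broadcast(name, JSON.parse(message));
  });
}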

clockworkgr commented 6 years ago

@ryanRfox Just making sure you've followed this discussion

ryanRfox commented 6 years ago

Assigning to @Zapata to work in conjunction with #792