Open ramfox opened 3 years ago
Current consensus is to punt & only allow POST requests w/ application json to use json.Unmarshall
Adding a little detail here, @ramfox & I had a discussion about the issues of using our REST API for lib-level RPC calls. Our central conclusion:
**1. The lib package must be able to serve up a RPC-style HTTP API that only accepts POST requests to fixed endpoints with JSON body payloads of input param structs
There are two different personas our API needs to serve:
Qri developers are examples of the first use case. While working on the react frontend, frontend devs have defined a number of data structures in javascript, including things like request params and whatnot. What they really want is just a wire protocol for invoking lib methods that adds as little cognitive overhead as possible, where "cognitive overhead" comes from "translating" javascript -> HTTP API -> lib, and the assumption is the user is familiar with qri's go lib methods.
A great example of the second use case is loading qri data into kepler.gl. kepler doesn't know anything about qri, and instead needs a url that ends in .csv. We need an endpoint like curl https://api.qri.cloud/ds/get/b5/world_bank_pop/body.csv
for this reason.
The RPC HTTP API is really the thing we need ASAP. If we can use all lib methods over HTTP, the rest is ergonomics. We should also be trying to remove complexity form the RPC API as much as possible. For example we should abandon any effort to have lib.HTTPClient
do anything other than JSON POST requests. It's adding unnecessary complexity that doesn't serve either audience.
I've taken to thinking about the RESTful API as sugar. It shouldn't add anything you can't theoretically do with the RPC api, but will do it in fewer steps, with less required knowledge of how qri works internally. To that end, @ramfox & I have come to believe we should dramatically cut down on the amount of RESTful endpoints we ship with v0.10.0. Every "sugar" endpoint gives users another way to do something that they could just do with the RPC-style API, and we should wait for them to ask for it, adding routes slowly to avoid bloating the API. Here's what I'd propose as the initial set of "sugar" routes to add atop the RPC API:
endpoint | HTTP methods | Lib Method Name |
---|---|---|
"/" | GET | api.HealthCheckHandler |
"/health" | GET | api.HealthCheckHandler |
"/qfs/ipfs/{path:.*}" | GET | qfs.Get |
"/qfs/s3/{path:.*}" | GET | qfs.Get |
"/webui" | GET | api.WebuiHandler |
/ds/get/{username}/{name} | GET | api.GetHandler |
/ds/get/{username}/{name}/at/{path} | GET | api.GetHandler |
/ds/get/{username}/{name}/at/{path}/{component} | GET | api.GetHandler |
/ds/get/{username}/{name}/at/{path}/{component}.{encoding} | GET | api.GetHandler |
basically it's just some must-have endpoints for monitoring and a clean set of methods for getting data. else can be done in an RPC convention. We'd keep all the routes from our Proposed API changes, but drop the /{path.*}
parts in favour of routes without suffixes. eg: /ds/apply/{path.*}
becomes: /ds/apply
.
There's a handful of routes that have multiple GET/POST/PUT/DELETE flows for the same endpoint, which for RPC compliance means we would need to split up, which is a bit of an anti-pattern. Our RPC logic works great for 1 trick pony endpoints where we either already support POST or simply tack on. And for those I'm unsure how to handle them properly.
Sample: ConnectToPeerHandler
, PushHandler
For the rest I'm more than happy with the above.
Totally agreed that we have some endpoints that don't map 1-to-1 from endpoint to lib method, and if we go the RPC route we'd need to force them to map 1-to-1. Also agreed it's a RESTful anti-pattern. That's why I'd describe this API as a RPC api with a few RESTful bits tacked on, and not the other way around.
@ramfox & I worked through the list, and the number of endpoints that truly overlap is small. The worst offenders are /dsync
and /logsync
, which are entire sync protocols that work behind a single endpoint & multiplex on HTTP verbs to keep their API footprint as small as possible. I'm choosing to ignore those for the sake of this discussion because, well, they're plumbing.
For endpoints that don't map 1-to-1, I think we need to add disambiguating endpoints. For ones where POST does funny things, or doesn't map perfectly onto the RPC style, I think the RPC style should win, and we should move any necessary "funny logic" to a RESTful route written & maintained in the API package.
Just to add some context, we've gone back and forth on the question of "HTTP as RPC layer" vs "HTTP as RESTful API" and which should take priority, as there are some needs motivated by each that have some into conflict. For example, (1) wants every method to be a POST, with simple urls, 1-1 mapping between lib and api, and all parameters encoded as POST bodies that are serialized from parameter structs. However (2) wants multiple verbs per endpoint (GET vs POST vs DELETE), and dsrefs in urls (such as localhost:2503/validate/dustmop/my_ds
).
As a result, an earlier draft of the our redesigned api spec combined some endpoints (such as fsi.init gaining the functionality of fsi.CanInitWorkingDir by using GET, essentially a read-only operation to ask if a directory was valid for initialization), and moved a number of developer focused functionality to a /plumbing/
prefix. All these plumbing
commands were those that should only be needed while debugging, and were usually only used by one-off cmd
commands.
After more discussion, we've decided to reverse this decision, and move much more to "HTTP as RPC layer", with feature sugar added only where necessary to support RESTful endpoints that external apps will actually use. The major endpoint being /get
, as noted above. What this means is that /plumbing
is now going away, and most endpoints will be losing their ability to operate off of a dsref in the url.
While we would like to support more RESTful functionality, it's probably not worth doing so now, as the higher priority is getting RPC improved, and there's a lack of evidence that external apps actually need nice urls for what they're using qri for currently.
Finally, to double down on the /sync
endpoints (dsync and logsync), I don't even think these should be considered part of our "api" nor part of "plumbing". They're more like a protocol is that is only used between nodes for data transfer. It's totally fine if they don't observe constraints set in place for everything else.
We want the http client to add a ref to particular urls, given the method, setting the stage for use to use GET requests over http rpc.
Okay, so we have two options here. Since we already handle the POST situation, where the request is sent to the endpoint via json api, and we have all the information needed in the POST body, we can ignore this right now and everything will still work.
If we want to "fix" this (aka have ref info in the url), here is my proposal.
We want to have the http client send ref information in the url. However, how is the http client supposed to know that this particular call should require adding a ref to the url? We can't rely sole on the contents of the params or the name of the method, some params use
Refstr
as their field name, others useRef
, and we don't have a mapping of method names to functionality rn. My proposal is similar to how we use theRequestUnmarshaller
interface to determine if we should be composing the params from the form field.We should create a
ToDSRefer
(orToRefer
? w/e name tbd) interface, that requires aToDSRef
method to fulfill. ThisToDSRef
method would return a dsref. In the http client, if a params struct is aToDSRefer
, then we pull out the dsref and useRefToHttpPath
to add the ref to the http path before we send off the request.A
ToDSRefer
, should only be used for params or structures that have a reasonable expectation about having a single ref, ie not used in something like diff or changes that have multiple possible refs. For any endpoints that we want to have a ref in the url, we add aToDSRef
method to the params.This does not solve our problem of whether or not we should be sending more information via POST, or are okay to send as a GET request, because all the information is in our url.
Second option. We pull all the important information needed to use to call a function into our
callable
struct. We record the important bits in callable (the http methods, the endpoint, how to pull out ref from itsInType
). Then dispatch can use that information to set up the call correctly. In the future, we can even create an instance funcAddRoutesToMux
that iterates over eachcallable
and sets up a handler for them correctly on the passed in mux.Then there would be a true 1-to-1 of lib functions to routes & adding a route is as easy as adding a lib function & making sure its recorded properly in the
Attributes
of its method set.