canonical / pebble

Take control of your internal daemons!
https://canonical-pebble.readthedocs-hosted.com/
GNU General Public License v3.0
143 stars 54 forks source link

feat(client): expose a default Requester interface #310

Closed flotter closed 10 months ago

flotter commented 11 months ago

The patch adds a public Requester interface which allow derived projects to use the Pebble instantiated client and extend the available commands. The extended client commands will use client.Requester().Do for HTTP requests.

Derived project (imports Pebble):

// exClient.go:
type ExClient struct {
    *client.Client
}

func (ex *ExClient) Reboot(opts *RebootOptions) error {
    payload := &devicePayload{
        Action: "reboot",
    }
    :
    if _, err := ex.Requester().Do(...); err != nil {
        return err
    }
    :
}

// cmd_reboot.go
type cmdReboot struct {
    client *exClient.ExClient
}

func init() {
    cli.AddCommand(&cli.CmdInfo{
        Name:        "reboot",
        Summary:     cmdRebootSummary,
        Description: cmdRebootDescription,
        New: func(opts *cli.CmdOptions) flags.Commander {
            return &cmdReboot{client: &exClient.ExClient{opts.Client}}
        },
    })
}

func (cmd *cmdReboot) Execute(args []string) error {
    if len(args) > 0 {
        return cli.ErrExtraArgs
    }

    return cmd.client.Reboot(&exClient.RebootOptions{})
}

The changes made in the patch has been done in a way to produce as small as possible diffs (we keep doSync/doAsyc wrappers). The default interface has been implemented in the client.go file to allow reviewers to easily identify which code was added, and which is unchanged.

The following changes are made:

  1. The ResultInfo type previously returned by doSync and doAsync private function are removed. Although this is a publicly exposed type, the return value as of today has always been discarded, and the struct is currently empty.

  2. ResultInfo has been replaced by RequestResponse.

  3. The client Hijack mechanism which allowed the doer to be replaced has been removed. This was originally added in snapd for an arguably slightly obscure use case. We now have a new Requester interface, so any future need for this will instead be built in top of the Requester interface.

  4. The logs client request now uses the same retry logic on GET failure, as any other GET request. This is previously not possible because the retry logic and response unmarshall code was bundled, not allow raw access to the HTTP body.

benhoyt commented 11 months ago

I've had a look over this now and I small play with it in a test CLI that implements a custom client. Maybe it's just me, but I find the approach, and some of the naming, hard to understand.

The approach:

I find the new requester and decoder concepts (and interfaces) indirect and a bit confusing. I know we initially discussed just exporting a Do or Request method on the Client type. It seems to me that would be so much simpler, and custom clients could just instantiate a client and call Request right away. Why was that approach rejected again? I understand we don't just want to export the current DoSync and DoAsync methods -- we want to carefully consider the API, but still, that basic approach seems much more direct.

It would be helpful to add doc comments on the Requester interface and its methods describing exactly what the methods should do. "Do performs a single client request, including decoding to the result parameter, blah blah blah. This is what result means, and how it differs from the returned RequestResponse, etc." Similar for SetDecoder.

Why do we need SetDecoder or the decoder to be customisable? If a custom client wants to customise decode, can't they just do whatever they want in their Do implementation? This is another concept and indirection-via-function that the user needs to get familiar with.

type DecoderFunc func(ctx context.Context, res *http.Response, opts *RequestOptions,
                      result interface{}) (*RequestResponse, error)

Why does a decoder take a (cancellation) context? Could it take just the body instead of the entire http.Response? Why does "decoder" handle websocket-y stuff? On that note, I wonder if we can simplify by leaving websockets out of this API, or at least not to to squeeze it into the same method. Websockets are only used by exec and seem like a pretty niche case.

I don't really understand what the intended use for BodyReader is. On that note, I think it might really help to update the PR description with usage example of how this stuff will be used in another project / custom client.

Naming and other minor points:

benhoyt commented 11 months ago

I had a voice discussion with @flotter and wondered why we can't use a much simpler version that passes in an http.Client (or http.Transport) when creating a new Client, and then exports Do directly as a method of Client. Like so:

// client.go
type Client { ... }

type Config {
    SocketPath string
    ...
    HTTPClient *http.Client  // new field (or perhaps http.RoundTripper)
}

func New(config *Config) { }  // existing function

// new method on Client type
func (c *Client) Do(ctx context.Context, request *Request, result interface{}) (*Response, error) { }

This saves the complexity of the new interface, the awkward SetDecoder (with an implementation's Do calling the decoder that was set), etc.

benhoyt commented 11 months ago

tl;dr: I'm not happy with this as is. It's unnecessarily complex, and the new interfaces are hard to implement -- I've implemented a custom requester and decoder to try it. It's not an API I'd be proud of, and I think we can solve the problems at hand in a simpler way.

Existing approach, with custom Requester and DecoderFunc

I find the decoder interface particularly confusing: you create a client with a Requester, which has a SetDecoder method, and then there's DecoderFunc, which the requester should call, but the decoder has to decode the entirety of the response (a complex thing to define and describe for an interface!), set values in the client, and so on. It's a real tangle.

Draft PR showing my custom requester and decoder and a test program to drive them.

When you implement a DecoderFunc -- which you have to do if you're implementing a Requester, because we're not exporting Client.Decoder -- you have to reimplement a lot of the tricky stuff in the client anyway: Pebble server response types and decoding them, checking error/sync/async types, decoding the change ID, and so on. These are fiddly details that would take as long to document what the DecoderFunc needs to do as actually do it. A custom decoder is over 100 lines of fiddly code copied from client.go.

If we exported Client.decoder you could potentially implement your custom decoder by wrapping Client.Decoder (though you'd have to read the body and then provide it at Client.Decoder as a bytes.Reader, because you can't read the http.Response.Body twice).

When you implement a Requester, the Do method has to perform HTTP request, but also handle things like retries and the other stuff the client does now -- together that's about 50 lines of non-trivial code copied from client.go.

In addition, because of how the Requester has to support websockets and has a Transport() which returns an *http.Transport, it's not transport-agnostic anyway. So is it even realistic to create custom requesters in the first place?

I understand the desire to conceptually separate the Requester, Client, and decoder. However, it creates a lot of complexity, as I've described above, and as can be seen below with the amount of new API surface we're proposing in this PR:

Click to expand new API surface ```go type DecoderFunc func(ctx context.Context, res *http.Response, opts *RequestOptions, result interface{}) (*RequestResponse, error) type Requester interface { Do(ctx context.Context, opts *RequestOptions, result interface{}) (*RequestResponse, error) SetDecoder(decoder DecoderFunc) Transport() *http.Transport } type RequestOptions struct { Method string Path string Query url.Values Headers map[string]string Body io.Reader Async bool ReturnBody bool } type RequestResponse struct { StatusCode int ChangeID string Body io.ReadCloser } func (client *Client) Requester() Requester { ... } type DefaultRequesterConfig struct { BaseURL string Socket string DisableKeepAlive bool UserAgent string } type DefaultRequester struct { baseURL url.URL doer doer userAgent string transport *http.Transport decoder DecoderFunc } func NewDefaultRequester(opts *DefaultRequesterConfig) (*DefaultRequester, error) { ... } ```

Still, my draft PR shows my custom requester and custom decoder used to implement MyClient, a new client (like the kind that might be used in Termus) with a new method UpperServices (which really just calls /v1/services, but pretend it's a new Termus-specific server method).

I use the custom decoder to decode a pretend new Termus-specific ServerVersion field that's included in all sync responses (just as an example). This is like the existing warnings and maintenance fields.

Another oddity: the custom decoder decodes warnings/maintenance fields into fields on MyService, but then the Pebble Client.WarningsSummary() and Client.Maintenance() are no longer correct. We don't have access to the Pebble Client fields to set those.

It all works, though I did have to add a client.Config.Requester field, as currently in the main PR there's no way to actually use a custom Requester.

Overall, I think it's painful to use, with a lot of boilerplate copied from the current implementation in client.go.

Let's see if we can simplify.

Simpler option 1

As mentioned above, I don't think it's useful (or even possible with websockets / Transport()) to separate the Requester from HTTP, so let's just try Client methods instead.

The custom decoder above has to decode the entire response, so let's instead provide a "decode hook" which allows setting a custom function to just read or unmarshal the parts it wants (like ServerVersion), but have the main client still do all the messy sync/async/error checking and unmarshalling. Providing a decode hook adds to that.

In option 1 (draft PR), I've taken the existing Do method signature from the proposed Requester interface and made it a method on client. I've kept RequestOptions and RequestResponse as is:

func (client *Client) Do(ctx context.Context, opts *RequestOptions, result interface{})
    (*RequestResponse, error) { .. }

For the decode hook, there's this new API:

func (client *Client) SetDecodeHook(hook DecodeHookFunc) {
    client.decodeHook = hook
}

type DecodeHookFunc func(data []byte, opts *RequestOptions) error

You can see how simple a sample implementation is here -- it just decodes the ServerVersion field and is done.

Click to expand sample decode hook ```go func (c *MyClient) decodeHook(data []byte, opts *client.RequestOptions) error { // Demonstrate use of opts: only do custom decode on v1 requests. if !strings.HasPrefix(opts.Path, "/v1/") { return nil } var frame struct { ServerVersion string `json:"server-version"` } err := json.Unmarshal(data, &frame) if err != nil { return err } if frame.ServerVersion != "" { c.ServerVersion = frame.ServerVersion } return nil } ```

I realize it's slightly less efficient, as it will have to unmarshal a second time (the client unmarshals as well). But 1) it's a small price to pay for how much simpler this approach is, and 2) I think it's likely decode hooks won't be used often.

I've also solved the websocket issue by adding a Client.Transport() *http.Transport method. See the MyClient.GetWebsocket implementation that uses it.

Simpler option 2

Option 2 (draft PR) is similar to the above, but I've broken Do into what are really three functions: DoSync, DoAsync, and DoRaw (we could bikeshed on that name, maybe DoReturnBody). I know something similar was proposed in the past and rejected, but I think I've cleaned up the API significantly here.

I don't like how RequestOptions has bool Async and ReturnBody input parameters, which is really a three-way switch (Async is ignored if ReturnBody is true).

In addition, they should really take and return different parameters in each mode:

Separating them out is clearer, and means the parameters and return values are specific to the operation (making invalid combinations impossible at compile time).

One other tweak here: because method, path string are required, I've moved them to parameters, and RequestOptions becomes only the optional parameters. This method, path parameter style is also consistent with the stdlib's http.NewRequest* functions.

Here's the option 2 methods:

func (client *Client) DoSync(ctx context.Context, method, path string, opts *RequestOptions, result interface{})
    error { ... }

func (client *Client) DoAsync(ctx context.Context, method, path string, opts *RequestOptions)
    (changeID string, err error) { ... }

func (client *Client) DoRaw(ctx context.Context, method, path string, opts *RequestOptions)
    (io.ReadCloser, error) { ... }

Compare a simple do call with option 1 -- not bad, but not great:

opts := &client.RequestOptions{ // Async: false is implicit
    Method: "GET",
    Path:   "/v1/services",
}
_, err := c.pebble.Do(ctx, opts, &result)

With option 2 -- simpler and more type safe:

err := c.pebble.DoSync(ctx, "GET", "/v1/services", nil, &result)

Again, option 2 adds Client.Transport() to solve the websocket issue.

Summary

I believe the suggested options solve the problem of adding Termus-specific functionality and custom decoding. They do so with much less API surface, less code in the client.go implementation, and they're significantly simpler to use when you're importing Pebble or writing a custom decoder.

Of the two option, I prefer option 2 for the reasons stated there. It does add 3 new exported API methods instead of 1, but they're all clear and the parameters and result types are consistent with the operation.

Later, if we really do need to replace the entire request handling with something non-HTTP, we can add a way to inject a net.Dialer or an http.RoundTripper. (But either way we'd have to solve the Transport() *http.Transport issue for websockets -- the original proposal doesn't help with that.)

Perhaps we can have a discussion about this to avoid going round in further circles.

niemeyer commented 11 months ago

tl;dr: I'm not happy with this as is. It's unnecessarily complex, and the new interfaces are hard to implement -- I've implemented a custom requester and decoder to try it. It's not an API I'd be proud of, and I think we can solve the problems at hand in a simpler way.

Thanks for taking the time to describe these alternatives in detail, Ben. Unfortunately, they seem to ignore the conversation we had with the team on Tuesday where I detailed why we're trying to avoid having methods on the Client itself: these are low-level HTTP APIs, and there are cases such as the use of web sockets and logs that are both unhandled. If we go down the suggested route, then every low-level transport concern becomes a "just put another method on the client" kind of conversation, which transforms the pebble.Client from being a nice represnetation of high-level operations that you can perform in Pebble with a mixture of low-level concerns including HTTP, transport, websockets, etc. As you put it, that wouldn't make me very proud either.

With that said, I've worked with Fred today to simplify the current proposal further, and drop from it concerns that we do not need to solve today, such as how to implement an external requester. That means the Requester interface is exposed, but can change because none of our own APIs publish a way for third-party requesters to be used. With this change, we take away the needs of having an exposed decoder, which I will document below for posterity regardless.

As explained on Tuesday, the previously proposed decoder interface allows third-party code that leverages Pebble's foundation to use the Requester interface, while still allowing the Client itself to capture auxiliar metadata that is transferred to the client as part of request response. Warnings are a good example: any sync or async request may return warnings which are unrelated to the request at hand. As the third-party logic performs a Do request, we don't want it to be the responsible for decoding, or even knowing about, all such side metadatac carried by the request.

That's why the interface is less trivial than it could be otherwise. Either way, we'll simplify it further and keep some of these concerns private while we can do that.

benhoyt commented 11 months ago

Thanks, @niemeyer and @flotter. I'm much happier starting simple and expanding later as we have concrete use cases for a decoder.

Unfortunately, they seem to ignore the conversation we had with the team on Tuesday where I detailed why we're trying to avoid having methods on the Client itself

Sorry about that. I tried to address that with: "I understand the desire to conceptually separate the Requester, Client, and decoder. However, it creates a lot of complexity..." But I got a bit carried away with my alternative proposals and sort of forgot where we ended up with our discussion in the process.

I'm pretty happy with your simplified Requester proposal. I've had a play with it here. However, if there's going to be no way to implement an external Requester, that's all a bit moot. And we can probably get rid of NewDefaultRequester and not export DefaultRequester or DefaultRequesterConfig at all -- at least until we support creating external requesters.

Separating out DecodeResult to a method on RequestResponse is an interesting idea -- I think I like it: despite meaning two calls where one would have done before, it's a clean separation.

niemeyer commented 11 months ago

@benhoyt

I'm pretty happy with your simplified Requester proposal. I've had a play with it here. However, if there's going to be no way to implement an external Requester, that's all a bit moot. And we can probably get rid of NewDefaultRequester and not export DefaultRequester or DefaultRequesterConfig at all -- at least until we support creating external requesters.

Just to be clear, this was indeed the actual agreement from last week. The interface itself is not moot because it reserves the ability to have different requesters in the future, but we do intend to make the concrete implementation private for now.