mafredri / cdp

Package cdp provides type-safe bindings for the Chrome DevTools Protocol (CDP), written in the Go programming language.
MIT License
725 stars 45 forks source link

Write without waiting for the result? #18

Open matthewmueller opened 7 years ago

matthewmueller commented 7 years ago

Hey there,

I'd like to be able to write in order synchronously without waiting for the replies.

The use case being able to speed up typing in keyboard events, where you need them to run in order (keyDown, char, keyUp), so you want the writes to go through in order, but you don't care about waiting for the result ids to come back.

It would probably just be a simple addition to the API, rpcc.InvokeAsync maybe. Would this be something you'd like to see in here?

matthewmueller commented 7 years ago

I should note that I haven't benchmarked this yet, but I'd imagine it should speed things up a bit. I can provide benchmarks if it'd help with the decision-making process :-)

mafredri commented 7 years ago

Right now it would be possible to execute each command in a goroutine, but order isn't fully guaranteed, and this looks like a use-case where that guarantee is needed.

This kind of extension is something I've been keeping in mind, and we should be able to support it by extending e.g. Invoke with options:

func Invoke(ctx context.Context, method string, args, reply interface{}, conn *Conn, opts ...InvokeOpts) error { ... }

And to support invoking any command asynchronously, we'd change the interface slightly:

type Input interface {
    // ...
    DispatchKeyEvent(context.Context, *input.DispatchKeyEventArgs, opts ...rpcc.InvokeOpts) error

The reason I didn't introduce this yet is that I didn't have any options to put there :-P, but this will be fully backwards-compatible.

Usage (maybe?):

errc := make(chan error, 1)
_ = c.Input.DispatchKeyEvent(ctx, args, rpcc.InvokeAsync(errc))
// ...
err = <-errc

I'm undecided if the options should go in cdp or used directly from rpcc, perhaps cdp is better so the user has less reason to deal with rpcc.

PS. A benchmark would be much appreciated! :-)

mafredri commented 7 years ago

Hey @matthewmueller, I've been giving this a lot of thought but I'm having a hard time finding a really good fit for this API change.

Do you have any ideas for how you would like to use it in practice?

I have some ideas that more or less could work, but I also want the behavior to be explicit and clear when reading the code.

// Option + Wait function.
nw := cdp.NoWait()
c.Page.Enable(ctx, nw)
c.Network.Enable(ctx, nil, nw)
c.DOM.Enable(ctx, nw)
shot, _ := c.Page.CaptureScreenshot(ctx, nil, nw) // Reply cannot be used yet.
err := cdp.Wait(ctx, nw)
// Safe to use `shot`.

// Async as a waitable + option.
async := cdp.Async()
c.Page.Enable(ctx, async.Option())
c.Network.Enable(ctx, nil, async.Option())
c.DOM.Enable(ctx, async.Option())
err := async.Wait(ctx) // or <-async.Done()

Or, running the async commands in their own scope, so to speak:

// NoWait provides non-blocking option, but blocks
// until all commands are completed.
var shot *page.CaptureScreenshotReply
err := cdp.NoWait(func(nw cdp.CommandOption) {
    c.Page.Enable(ctx, nw)
    c.Network.Enable(ctx, nil, nw)
    c.DOM.Enable(ctx, nw)
    shot, _ = c.Page.CaptureScreenshot(ctx, nil, nw)
})

// Separate client that does not wait for results.
err := cdp.NoWait(func(c *cdp.Client) {
    c.Page.Enable(ctx)
    c.Network.Enable(ctx, nil)
    c.DOM.Enable(ctx)
})

// Async as a part of context.
err := cdp.NoWait(ctx, func(ctx context.Context) {
    c.Page.Enable(ctx)
    c.Network.Enable(ctx, nil)
    c.DOM.Enable(ctx)
})

And finally, jokingly (or am I 😜😝):

// Full retard *JavaScript*.
async := cdp.AsyncContext(ctx)
p1 := c.Page.Enable(async) // Error, but p1.(*cdp.Promise) == OK ;P
p2 := c.Network.Enable(async, nil)
p3 := c.DOM.Enable(async)
err := cdp.Await(p1, p2, p3)
matthewmueller commented 7 years ago

@mafredri Really appreciate the thought. I think you may be right that it may not warrant a large API change. The only use case I can really think of where this matters is keypresses, screencasting and scrolls and even then, only in certain cases (where you want to keep your event loop as speedy as possible).

It does make a difference though, I'm planning on doing something more official, but I was seeing around a 70ms improvement by just sending the keyboard event (to a local chrome process) and not waiting for the (empty) response.

I think one way forward is to just expose the raw RPCC API. The piece of the application I need this is in is quite performance sensitive, so I ended up writing a custom websocket implementation for that piece. I would have liked to just use the RPCC for this, but I didn't think it quite fit in the way I needed it. For everything else that I do, I won't need it to be performant like this and I'll definitely prefer just using the regular CDP API with the niceties of c.Input.DispatchKeyboardEvent() 😀

The interface that I ended up going with looks like this:

// Websocket interface
type Websocket interface {
        // Async (wait for write acknowledgement then return)
    Write(ctx context.Context, method string, params interface{}) error
        // RPC style (wait for ID to come back)
    Send(ctx context.Context, method string, params interface{}, reply interface{}) error
        // Subscriber has the same API as CDP's events
    Subscribe(ctx context.Context, method string) (*Subscriber, error)
        // Wait returns when the websocket dies (after a cleanup is tried)
    Wait() error
        // Try to close gracefully
    Close(error) error
}

I'm happy to share more of the implementation if you're interested. Like I mentioned, it's similar to CDP's RPCC, but a little too different for a PR haha.

mafredri commented 7 years ago

The only use case I can really think of where this matters is keypresses, screencasting and scrolls and even then, only in certain cases (where you want to keep your event loop as speedy as possible).

I agree, I too see going async as a special use-case, but definitely one that should exist. In that regard, I think it's an OK first step to only support this via direct interaction with rpcc, at least in the beginning.

Currently the domain commands (strings) are not exported (e.g. "Network.enable") but I'll add those to make it slightly nicer to work with rpcc.

(Sidenote: I've also thought of turning these into types, e.g. network.Enable.Do(ctx, args). This would allow us to e.g. add a second method DoAsync, but this introduces two ways, client & command, to do the same thing which I'm not very excited about.)

It does make a difference though, I'm planning on doing something more official, but I was seeing around a 70ms improvement by just sending the keyboard event (to a local chrome process) and not waiting for the (empty) response.

Wow, that's not insignificant. This piqued my interest, I might have to write some benchmarks for rpcc myself 😆.

I would have liked to just use the RPCC for this, but I didn't think it quite fit in the way I needed it.

Could you expand on this a bit? I would love to see if we can match the offerings of rpcc to your requirements 😁. Is it more than the requirement to execute commands without waiting for the result?

I'm happy to share more of the implementation if you're interested. Like I mentioned, it's similar to CDP's RPCC, but a little too different for a PR haha.

Please do, I'm very curious if this can be aligned with rpcc.

PS. Speaking of performance considerations, I'm thinking of evaluating gobwas/ws as a replacement for gorilla/websocket.

mafredri commented 7 years ago

There is indeed a huge difference, I wrote some benchmarks and ran some tests. Both the sync and async version are executing three Input.dispatchKeyEvent commands over rpcc (rawKeyDown, char and keyUp) to produce a newline.

Page: about:blank, no content:

$ go test -bench=Speed -run=Bench -benchtime 2s -v
BenchmarkRpccSpeed/Sync-4               2000       1309191 ns/op        1129 B/op         33 allocs/op
BenchmarkRpccSpeed/Async-4             10000        329931 ns/op        1101 B/op         27 allocs/op
PASS
ok      github.com/mafredri/cdp 6.132s

Page about:blank, focused <textarea>:

The difference here is 102 (sync) vs 1102 (async) newlines inside the <textarea>.

$ go test -bench=Speed/Sync -run=Bench -benchtime 0.5s -v
BenchmarkRpccSpeed/Sync-4                100       6668981 ns/op        1152 B/op         33 allocs/op
PASS
ok      github.com/mafredri/cdp 0.699s

$ go test -bench=Speed/Async -run=Bench -benchtime 0.5s -v
BenchmarkRpccSpeed/Async-4              1000       1787161 ns/op        1001 B/op         27 allocs/op
PASS
ok      github.com/mafredri/cdp 1.900s
mafredri commented 7 years ago

This is how I'm currently using this in the benchmarks:

errc := make(chan error, b.N*3)
nw := rpcc.NoWait(errc)

for _, err := range []error{
    rpcc.Invoke(ctx, "Input.dispatchKeyEvent", down, nil, conn, nw),
    rpcc.Invoke(ctx, "Input.dispatchKeyEvent", char, nil, conn, nw),
    rpcc.Invoke(ctx, "Input.dispatchKeyEvent", up, nil, conn, nw),
} {
    if err != nil {
        b.Error(err)
    }
}

for i := 0; i < b.N*3; i++ {
    err := <-errc
    if err != nil {
        b.Error(err)
    }
}

Not sure how friendly the above is though, easy to forget to allocate enough buffer space in the channel. Also, you lose reference to which command failed. A nicer API (in that regard) might look something like:

calls := []*rpcc.Call{
    rpcc.NewCall("Input.dispatchKeyEvent", down, nil),
    rpcc.NewCall("Input.dispatchKeyEvent", char, nil),
    rpcc.NewCall("Input.dispatchKeyEvent", up, nil),
}
err := rpcc.Send(ctx, conn, calls...)
if err != nil {
    return err
}
for _, call := range calls {
    err := call.Wait()
    if err != nil {
        return err
    }
}

The downside here is increasing the API surface with more types (Call, NewCall and the function Send). Currently Call is unexported (rpcCall) which I find beneficial for internal usage. My thinking here is that having access to the call makes debugging / logging easier, e.g. finding out which call caused the error, etc. On the other hand, re-using a call is not possible, which might result in confusion and errors.

I'll keep thinking about this some more.