mafredri / cdp

Package cdp provides type-safe bindings for the Chrome DevTools Protocol (CDP), written in the Go programming language.
MIT License
724 stars 45 forks source link

Printing PDF: Chrome memory usage + heavy load & context timeout #101

Closed gulien closed 4 years ago

gulien commented 5 years ago

Hello again @mafredri 👋

I've done some load testing on my API for generating PDFs and I've encountered two issues.

1. Chrome and memory usage

Google Chrome memory usage keeps inscreasing overtime.

I've looked up at puppeteer issues and found those relevant informations:

  1. Only one browser instance (OK in my case as Google Chrome is launched only once)
  2. Re-create browser context each time
  3. Close the page after rendering

Also:

The downside is that Chromium is a complicated application and might accumulate error over time; as a result, it might crash or misbehave, so a proper implementation would restart browser every once in a while or have some other error-handling.

Total amount of simultaneous browser contexts should be constrained with some reasonable number

My implementation using cdp: https://github.com/thecodingmachine/gotenberg/blob/6.0.0/internal/pkg/printer/chrome.go#L64

Related puppeteer issues:

2. Heavy load: context timeout

Under heavy load, the context timeout while waiting for some events. For instance:

No problem with this: heavy load = slow processing = timeout. BUT: after the heavy load, even sending ONE request will timeout.

In my opinion, Chrome misbehaves because it cannot recover. It feels like it's related with my first issue, but I'm not sure how.

I expected Chrome to go "back" to its normal state as all browser context, targets and so on should be closed.

--

Hope you can provide me with some insights as you have a far better understanding of the Chrome API 😄

mafredri commented 5 years ago

Hi @gulien,

thanks for a well written issue, I've only looked at this quickly but I did identify two potential causes.

First, a browser context is created but never destroyed, here: thecodingmachine/gotenberg/internal/pkg/printer/chrome.go#L82

I believe we need to add:

defer devtClient.Target.DisposeBrowserContext(ctx, target.NewDisposeBrowserContextArgs(newContextTarget.BrowserContextID))

The other potential issue is the context used when doing a defer close. For example here: thecodingmachine/gotenberg/internal/pkg/printer/chrome.go#L105

If the context times out before the defer is run, it will never send the request. The same applies to the DisposeBrowserContext above. In these cases I like to create a separate context for use in the close requests that has a longer timeout (or none at all).

gulien commented 5 years ago

Thank you @mafredri!

I've added/updated the following as suggested:

disposeBrowserContextArgs := target.NewDisposeBrowserContextArgs(newContextTarget.BrowserContextID)
defer devtClient.Target.DisposeBrowserContext(context.Background(), disposeBrowserContextArgs)
closeTargetArgs := target.NewCloseTargetArgs(newTarget.TargetID)
defer targetClient.Target.CloseTarget(context.Background(), closeTargetArgs)

Load testing results:

The memory issue is fixed! ✅

--

Unfortunately, the "heavy load" issue is still occuring.

I'm not sure it is related to your library, but maybe I've missed something. Any idea? 😄

mafredri commented 5 years ago

@gulien awesome, glad it worked out!

I'm using context.Background() currently. Do you think it is better to still have a timeout?

You can use it just fine but I'd keep an eye on it, if it leaves goroutines hanging due to no response from Chrome, it could become a problem. It's more a safety net than a must-have.

Unfortunately, the "heavy load" issue is still occuring.

Dang. Considering how cdp is used I have a hard time imagining what could go wrong inside the package. Although I've been thinking about this, and how it affects Chrome: thecodingmachine/gotenberg/internal/pkg/printer/chrome.go#L70-L78

While Chrome now-a-days technically supports multiple clients connected to the same endpoint, it might not be well tested under load. Not to mention that the browser endpoint is a bit special.

I would try to refactor that code so that one goroutine manages context/targets, and other goroutines request targets from it.

I intended for this to be just pseudo code but I copied the code over from gotenberg and restructured it a bit and it looks like something that could work 😅... not tested, but something along these lines:

func creator(requests chan chan response) error {
    devt, err := devtool.New("http://localhost:9222").Version(ctx)
    if err != nil {
        return err
    }
    // connect to WebSocket URL (page) that speaks the Chrome DevTools Protocol.
    devtConn, err := rpcc.DialContext(ctx, devt.WebSocketDebuggerURL)
    if err != nil {
        return err
    }
    defer devtConn.Close() // nolint: errcheck
    // create a new CDP Client that uses conn.
    devtClient := cdp.NewClient(devtConn)

    for ch := <-requests {
        newContextTarget, err := devtClient.Target.CreateBrowserContext(ctx)
        if err != nil {
            ch <- response{err: err}
        }
        dispose := func() error {
            return devtClient.Target.DisposeBrowserContext(context.Background(), disposeBrowserContextArgs)
        }
        // create a new blank target with the new browser context.
        createTargetArgs := target.
            NewCreateTargetArgs("about:blank").
            SetBrowserContextID(newContextTarget.BrowserContextID)
        newTarget, err := devtClient.Target.CreateTarget(ctx, createTargetArgs)
        if err != nil {
            // cleanup on error
            dispose()
            ch <- response{err: err}
        }

        ch <- response{
            wsURL: fmt.Sprintf("ws://127.0.0.1:9222/devtools/page/%s", newTarget.TargetID),
            close: func() (err error) {
                closeTargetArgs := target.NewCloseTargetArgs(newTarget.TargetID)
                err = targetClient.Target.CloseTarget(context.Background(), closeTargetArgs)
                if err != nil {
                    // cleanup on error
                    dispose()
                    return err
                }
                return dispose()
            },
        }
    }
}

func main() {
    // init the worker...
    requests := make(chan chan response, 1)
    err := creator(requests)
    if err != nil {
        panic(err)
    }

    // request a new target
    ch := make(chan response, 1)
    requests <- ch
    resp := <-ch
    if resp.err != nil {
        panic(resp.err)
    }
    defer resp.close()
    newContextConn, err := rpcc.DialContext(ctx, resp.wsURL)
    // ...
}
gulien commented 5 years ago

Thank you very much @mafredri

As far as I understand, there should be only one client connected to the devtools endpoint. Then I should request from this "singleton" new targets.

Here what I did:

type targetResponse struct {
    wsURL string
    err   error
    close func(targetClient *cdp.Client) error
}

// the singleton
var devtClient *cdp.Client

// initializes the singleton. it's called in main.go.
func MustActivateChromeDevToolsConnection() {
       // context without timeout as the singleton should live "forever". 
    ctx := context.Background()
    devt, err := devtool.New("http://localhost:9222").Version(ctx)
    if err != nil {
        panic(err)
    }
    // connect to WebSocket URL (page) that speaks the Chrome DevTools Protocol.
    devtConn, err := rpcc.DialContext(ctx, devt.WebSocketDebuggerURL)
    if err != nil {
        panic(err)
    }
       // don't close the connection.
    //defer devtConn.Close()
    // create a new CDP Client that uses conn.
    devtClient = cdp.NewClient(devtConn)
}

// our function for "requesting" target
func requestTarget(ctx context.Context, requests chan chan targetResponse) {
    for ch := range requests {
        newContextTarget, err := devtClient.Target.CreateBrowserContext(ctx)
        if err != nil {
            ch <- targetResponse{err: err}
        }
        dispose := func() error {
            disposeBrowserContextArgs := target.NewDisposeBrowserContextArgs(newContextTarget.BrowserContextID)
            return devtClient.Target.DisposeBrowserContext(context.Background(), disposeBrowserContextArgs)
        }
        // create a new blank target with the new browser context.
        createTargetArgs := target.
            NewCreateTargetArgs("about:blank").
            SetBrowserContextID(newContextTarget.BrowserContextID)
        newTarget, err := devtClient.Target.CreateTarget(ctx, createTargetArgs)
        if err != nil {
            // cleanup on error
            dispose()
            ch <- targetResponse{err: err}
        }
        ch <- targetResponse{
            wsURL: fmt.Sprintf("ws://127.0.0.1:9222/devtools/page/%s", newTarget.TargetID),
            close: func(targetClient *cdp.Client) error {
                               // target client will be created later...
                if targetClient == nil {
                    return dispose()
                }
                closeTargetArgs := target.NewCloseTargetArgs(newTarget.TargetID)
                _, err = targetClient.Target.CloseTarget(context.Background(), closeTargetArgs)
                if err != nil {
                    // cleanup on error
                    dispose()
                    return err
                }
                return dispose()
            },
        }
    }
}

func (p chromePrinter) Print(destination string) error {
    const op string = "printer.chromePrinter.Print"
    logOptions(p.logger, p.opts)
    ctx, cancel := xcontext.WithTimeout(p.logger, p.opts.WaitTimeout+p.opts.WaitDelay)
    defer cancel()
    resolver := func() error {
        targetRequests := make(chan chan targetResponse, 1)
        // we request target in a goroutine.
        go requestTarget(ctx, targetRequests)
        ch := make(chan targetResponse, 1)
        targetRequests <- ch
        resp := <-ch
        // connect the client to the new target.
        newContextConn, err := rpcc.DialContext(ctx, resp.wsURL)
        if err != nil {
            resp.close(nil)
            return err
        }
               defer newContextConn.Close()
        // create a new CDP Client that uses newContextConn.
        targetClient := cdp.NewClient(newContextConn)
        defer resp.close(targetClient)
                // ...

My implementation is surely broken, but no success so far with the "heavy load" issue.

gulien commented 5 years ago

Another solution would be to limit the number of simulatenous connections. Something like:

const MAX_CONNECTIONS = 5

var currentConnections = 0

var lock = make(chan struct{}, 1)

func (p chromePrinter) Print(destination string) error {
    const op string = "printer.chromePrinter.Print"
    logOptions(p.logger, p.opts)
    ctx, cancel := xcontext.WithTimeout(p.logger, p.opts.WaitTimeout+p.opts.WaitDelay)
    defer cancel()
    resolver := func() error {
            // print...
        }
        if currentConnections < MAX_CONNECTIONS {
            currentConnections++
            if err := resolver(); err != nil {
            currentConnections--
            return xerror.New(op, err)
         }
             currentConnections--
             return nil
        } 
        select {
    case lock <- struct{}{}:
        // lock acquired.
                currentConnections++
        if err := resolver(); err != nil {
            <-lock // we release the lock.
                        currentConnections--
            return xerror.New(op, err)
        }
        <-lock // we release the lock.
                currentConnections--
        return nil
    case <-ctx.Done():
        // failed to acquire lock before
        // deadline.
        logger.DebugOp(op, "failed to acquire lock before context.Context deadline")
        return xerror.New(op, ctx.Err())
    }

Of course I would prefer to avoid that, but if there is not another solution... 😄

--

EDIT

I've just tested it and it works fine actually. By default I limit the maximum devtools connections to 5. I think its a sane value, but I guess it might depends on the CPU and memory on the machine 🤔 any thoughts on this? 😄