pb33f / libopenapi

libopenapi is a fully featured, high performance OpenAPI 3.1, 3.0 and Swagger parser, library, validator and toolkit for golang applications.
https://pb33f.io/libopenapi/

Memory leak in libopenapi #46

Closed: TristanSpeakEasy closed this issue 1 year ago

TristanSpeakEasy commented 1 year ago

I am actively investigating this on my end, trying to see if I am misusing the library somewhere, but after switching from another library to libopenapi we have noticed quite a large memory leak.

We use the lib both on a server and in a long-running CLI command that iterates over thousands of OpenAPI documents, and we have experienced OOM SIGKILLs.

Here is a top10 from pprof of our CLI towards the end of its run:

3394.52MB 47.98% 47.98%  3395.02MB 47.99%  gopkg.in/yaml%2ev3.(*parser).node
 1185.63MB 16.76% 64.74%  1185.63MB 16.76%  reflect.mapassign_faststr
  670.35MB  9.47% 74.21%  3334.26MB 47.13%  gopkg.in/yaml%2ev3.(*parser).scalar
  421.93MB  5.96% 80.18%   421.93MB  5.96%  os.ReadFile
     359MB  5.07% 85.25%      359MB  5.07%  encoding/json.Marshal
  331.97MB  4.69% 89.94%   368.50MB  5.21%  github.com/pb33f/libopenapi/datamodel/low/base.(*SchemaProxy).Schema
  186.04MB  2.63% 92.57%  4326.40MB 61.15%  gopkg.in/yaml%2ev3.(*parser).parseChild
  163.51MB  2.31% 94.88%   163.51MB  2.31%  reflect.makemap
      88MB  1.24% 96.13%       89MB  1.26%  gopkg.in/yaml%2ev3.(*decoder).scalar
      74MB  1.05% 97.17%       74MB  1.05%  gopkg.in/yaml%2ev3.read_line

Let me know if there is anything I can provide to help track down the issue.

daveshanley commented 1 year ago

Interesting, do you have a repo I could test against? How many runs before you see the issue?

TristanSpeakEasy commented 1 year ago

We have this repo, which contains all the documents we are looping through: https://github.com/speakeasy-api/openapi-directory/tree/main/APIs.

I am just seeing if I can put together a minimal reproduction of the issue now.

It kicks in almost immediately when we start iterating through the docs; we can see the memory usage growing quite steadily.

TristanSpeakEasy commented 1 year ago

Here is a reproducible example:

package main

import (
    "io"
    "log"
    "net/http"

    _ "net/http/pprof"

    "github.com/pb33f/libopenapi"
)

func read() []byte {
    res, err := http.Get("https://docs.speakeasyapi.dev/openapi.yaml")
    if err != nil {
        panic(err)
    }
    defer res.Body.Close()

    data, err := io.ReadAll(res.Body)
    if err != nil {
        panic(err)
    }

    return data
}

func main() {
    go func() {
        log.Println(http.ListenAndServe("localhost:8080", nil))
    }()

    data := read()

    for i := 0; i < 100000; i++ {
        doc, err := libopenapi.NewDocument(data)
        if err != nil {
            panic(err)
        }

        model, errs := doc.BuildV3Model()
        if len(errs) > 0 {
            panic(errs)
        }

        for _, item := range model.Model.Paths.PathItems {
            for _, op := range item.GetOperations() {
                for _, param := range op.Parameters {
                    if param.Schema != nil {
                        _, err := param.Schema.BuildSchema()
                        if err != nil {
                            panic(err)
                        }
                    }
                }
            }
        }
    }
}

After running this for just 10 seconds or so, I get this pprof dump:

go tool pprof heap.out 
File: scratch
Type: inuse_space
Time: Dec 13, 2022 at 4:15pm (GMT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top10
Showing nodes accounting for 2.56GB, 98.09% of 2.61GB total
Dropped 123 nodes (cum <= 0.01GB)
Showing top 10 nodes out of 46
      flat  flat%   sum%        cum   cum%
    1.46GB 55.90% 55.90%     1.46GB 55.92%  gopkg.in/yaml%2ev3.(*parser).node
    0.50GB 19.21% 75.11%     0.50GB 19.21%  reflect.mapassign_faststr
    0.21GB  8.12% 83.23%     1.37GB 52.44%  gopkg.in/yaml%2ev3.(*parser).scalar
    0.11GB  4.06% 87.29%     0.11GB  4.18%  encoding/json.Marshal
    0.09GB  3.57% 90.86%     1.80GB 69.07%  gopkg.in/yaml%2ev3.(*parser).parseChild
    0.08GB  3.07% 93.93%     0.08GB  3.07%  reflect.makemap
    0.04GB  1.42% 95.35%     0.04GB  1.42%  gopkg.in/yaml%2ev3.(*decoder).scalar
    0.03GB  1.33% 96.68%     0.03GB  1.33%  gopkg.in/yaml%2ev3.read_line
    0.03GB  0.99% 97.67%     0.03GB  1.05%  gopkg.in/yaml%2ev3.yaml_emitter_emit
    0.01GB  0.41% 98.09%     0.21GB  7.87%  gopkg.in/yaml%2ev3.(*decoder).sequence
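One way to confirm growth like this without a full profile is to force a GC between iterations and watch runtime.MemStats: a leak shows up as HeapAlloc climbing even after collection. A minimal sketch of that technique, where the loop body is just a stand-in for the parse loop above:

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// liveHeap forces a collection and returns the post-GC heap size, so
// growth between calls reflects retained memory, not garbage.
func liveHeap() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

func main() {
	before := liveHeap()

	// Stand-in for the parse loop: deliberately retain ~1 MiB so the
	// growth is visible after GC.
	var retained []string
	for i := 0; i < 1000; i++ {
		retained = append(retained, strings.Repeat("x", 1024))
	}

	after := liveHeap()
	growth := int64(after) - int64(before)
	fmt.Printf("retained ~%d KiB across the loop\n", growth/1024)
	runtime.KeepAlive(retained)
}
```

In the real loop, HeapAlloc staying flat between iterations would suggest the library is not retaining anything; steady post-GC growth points at retained references, matching the pprof numbers above.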

TristanSpeakEasy commented 1 year ago

Just as an interesting comparison: doing a similar test with kin-openapi shows very little memory usage.

package main

import (
    "io"
    "log"
    "net/http"

    _ "net/http/pprof"

    "github.com/getkin/kin-openapi/openapi3"
)

func read() []byte {
    res, err := http.Get("https://docs.speakeasyapi.dev/openapi.yaml")
    if err != nil {
        panic(err)
    }
    defer res.Body.Close()

    data, err := io.ReadAll(res.Body)
    if err != nil {
        panic(err)
    }

    return data
}

func main() {
    go func() {
        log.Println(http.ListenAndServe("localhost:8080", nil))
    }()

    data := read()

    for i := 0; i < 100000; i++ {
        doc, err := openapi3.NewLoader().LoadFromData(data)
        if err != nil {
            panic(err)
        }

        for _, path := range doc.Paths {
            for _, op := range path.Operations() {
                for _, param := range op.Parameters {
                    if param.Value.Schema != nil {
                        _, err := param.Value.Schema.MarshalJSON()
                        if err != nil {
                            panic(err)
                        }
                    }
                }
            }
        }
    }
}

go tool pprof heap1.out
File: scratch
Type: inuse_space
Time: Dec 13, 2022 at 4:27pm (GMT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top10
Showing nodes accounting for 2561.56kB, 100% of 2561.56kB total
Showing top 10 nodes out of 35
      flat  flat%   sum%        cum   cum%
    1025kB 40.01% 40.01%     1025kB 40.01%  runtime.allocm
  512.31kB 20.00% 60.01%   512.31kB 20.00%  encoding/pem.Decode
  512.20kB 20.00% 80.01%   512.20kB 20.00%  runtime.malg
  512.05kB 19.99%   100%   512.05kB 19.99%  runtime.acquireSudog
         0     0%   100%   512.31kB 20.00%  crypto/tls.(*Conn).HandshakeContext
         0     0%   100%   512.31kB 20.00%  crypto/tls.(*Conn).clientHandshake
         0     0%   100%   512.31kB 20.00%  crypto/tls.(*Conn).handshakeContext
         0     0%   100%   512.31kB 20.00%  crypto/tls.(*Conn).verifyServerCertificate
         0     0%   100%   512.31kB 20.00%  crypto/tls.(*clientHandshakeStateTLS13).handshake
         0     0%   100%   512.31kB 20.00%  crypto/tls.(*clientHandshakeStateTLS13).readServerCertificate

daveshanley commented 1 year ago

This is very helpful, thank you. I will keep you posted as I investigate.

daveshanley commented 1 year ago

This should be fixed in v0.4.3; let me know how you get on.

TristanSpeakEasy commented 1 year ago

Can confirm that this looks to be fixed. Thanks Dave!