RedisGraph / redisgraph-go

A Golang client for redisgraph
https://redisgraph.io
BSD 3-Clause "New" or "Revised" License
132 stars 38 forks source link

Leak if not reading the output of `Query` function #35

Open sankarp-kavach opened 3 years ago

sankarp-kavach commented 3 years ago

A sample program:

package main

import (
    "fmt"
    "net/http"
    _ "net/http/pprof"
    "time"

    "github.com/gomodule/redigo/redis"
    rg "github.com/redislabs/redisgraph-go"
    "github.com/sirupsen/logrus"
)

func process() {
    for {
        queryStr := fmt.Sprintf("MATCH (X) RETURN X")

        _ = query("mesh7Graph", queryStr)
        logrus.Println("blah")
    }
}

func query(graphName, query string) error {
    conn := pool.Get()
    defer conn.Close()

    g := rg.GraphNew(graphName, conn)
    if false {
        _, err := g.Query(query)
        if err != nil {
            logrus.Errorf("GraphDB query error: %v", err)
            return err
        }
    } else {
        res, err := g.Query(query)
        if err != nil {
            logrus.Errorf("GraphDB query error: %v", err)
            return err
        }

        for res.Next() {
            _ = res.Record()
        }
    }

    return nil
}

var pool *redis.Pool

const (
    address  = "<REDIS_SERVER_URL>:6379"
    password = "<REDIS_PASSWORD>"
)

func main() {
    pool = &redis.Pool{
        MaxIdle:     100,
        MaxActive:   100,
        IdleTimeout: 240 * time.Second,
        // Dial or DialContext must be set.
        // When both are set, DialContext takes precedence over Dial.
        Dial: func() (redis.Conn, error) {
            c, err := redis.Dial("tcp", address)
            logrus.Tracef("Dialing redis @ %v", address)
            if err != nil {
                logrus.Errorf("Dial failed. Error: %v", err)
                return nil, err
            }
            if _, err := c.Do("AUTH", password); err != nil {
                logrus.Errorf("Authorization failed. Error: %v", err)
                c.Close()
                return nil, err
            }
            return c, nil
        },
        TestOnBorrow: func(c redis.Conn, t time.Time) error {
            if time.Since(t) < time.Minute {
                return nil
            }
            _, err := c.Do("PING")
            return err
        },
    }

    go process()

    logrus.Fatal(http.ListenAndServe("localhost:8080", nil))
}

Now run the above program toggling the if block with false and true. The program will just continuously do some redisgraph calls and not parse the response in one case, will parse the response in another case.

When the above program is running, do:

curl -sK -v http://localhost:8080/debug/pprof/heap > heap.out
go tool pprof heap.out
(pprof) lines
(pprof) web

In the case when we ignore the output of Query function, we get a memory leak in the pprof output. When the response is parsed and the record are read, there is no leak reported in the pprof output.

The docs do not mention anything about reading the Query function output. There is no close function on the QueryResult struct either, if I have to call it via defer. There is no other function except Query (like an Exec in the SQL) which I can do to not read results.

I will be happy to provide any other debugging information if you want.

sankarp-kavach commented 3 years ago

To minimize the effects of this leak, I tried to create a single instance of the Graph object, via the following code. But it consistently crashes, proving that Graph object as returned by rg.GraphNew is not thread-safe. This is an important information that must be documented. But the docs do not mention about this.

type t struct {
    graph rg.Graph
}

var p *t

main() {
    conn := pool.Get()
    p = &t{
        graph: rg.GraphNew("test-graph", conn),
    }
    go f(1)
    go f(2)
        <-block_forever
}

func f(n int) {
    for {
        _, err := p.graph.Query("MATCH (X) RETURN X")
        if err != nil {
            logrus.Fatal(err)
            return
        }
        logrus.Println("Thread: ", n)
    }
}
swilly22 commented 3 years ago

Hi @sankarp-kavach, Thank you for reporting this, I'll update here on my progress

sankarp-kavach commented 3 years ago

I did some more analysis and found that we embed a connection object inside the graph response struct which is possibly what is causing the leak. We seem to have needed the conn object to parse the response body. But once the parsing is done, after that we do not require the conn object. But we did not had a way to destroy/finalise that connection.

Also, I could be wrong as my debugging was not very thorough.