mattn / go-adodb

Microsoft ActiveX Object DataBase driver for go that using exp/sql
http://mattn.kaoriya.net/
MIT License
141 stars 36 forks source link

memory leaking in Query method #28

Open ws6 opened 7 years ago

ws6 commented 7 years ago

Hi mattn, it seems the Query function has memory leaking issue. https://github.com/mattn/go-adodb/blob/master/adodb.go#L61 even I did a defer rows.Close()

The leaking correlated to number of queries.

here is how I used it. Appreciated your help.

 rows, err := db.Query(query)
    if err != nil {
        return nil, err
    }
    defer rows.Close() 
    for rows.Next() {
        if err := rows.Scan(
mattn commented 7 years ago

What does your compairing? Before Query and After Query? Or many calls of Query? Or many Rows?

ws6 commented 7 years ago

many Query called, each Query with full Scan. Can it be this? func (s *AdodbStmt) Close() error { s.s.Release() return nil }

nevcook commented 7 years ago

Hi mattn, I'm seeing the same problem. I'll try to give you as much detail as I can think of...

Each query I open increases my Windows Working set memory and (other than irregular small reductions) it just keeps going up and up.

I've been able to confirm that this memory is not part of the Go garbage collected memory. Go still thinks it's only using a few MB when my process Working set is heading over 500MB (and higher, if I let it run long enough).

I've tried everything I can think of, including not only closing the rows, but also closing the database connection between each query, and even calling CoUninitialize between each query. Nothing I do seems to make any difference to the memory usage. It just keeps climbing until the process is terminated.

The structure of my code is basically the same as the snippet ws6 described above. I can provide a detailed example if you want one.

I've eliminated everything else in the loop from being the cause by removal and deduction: The bottom line is that when I comment out the call to Query()l, and the loop through the rows, the memory leak goes away. If I comment out just the loop through the rows, but still call Query() then it still leaks.

I'll try to answer the questions you asked ws6 above. I'm not sure what the first two of your questions above are asking, but by the time I've run about 3000 queries the working set has reached about 420MB. It's the same query each time, and in these tests, it's been the same query results with 2346 rows returns for each query.

Please help. This process is supposed to be long running (i.e. continuous) so a memory leak like this is devastating. Unless we can solve the problem, it's a complete road block for the use of Go in this project, and I really would like to stick with Go if I can.

If there's any other information I can give you to help diagnose the problem, please ask and I'll be happy to provide it.

nevcook commented 7 years ago

I thought it best to provide some sample code to allow you to duplicate this easily. It's a modification of your mdb.go example. For me the process Working set memory gets to about 75MB by about 6000 iterations of the query loop. Sorry if I've made any mistakes. I'm relatively new to Go. (By the way, I'm using 32 bit Go, as I can't make this work at all as 64 bit)

package main

import (
    "database/sql"
    "fmt"
    "log"
    "os"
    "time"

    // This is only here to make go give the process an expanded stack, which is needed for ADO to work correctly.
    // Without this the query loops below crash consistently after about 21 or 42 iterations (but not anything in between).
    _ "runtime/cgo"

    "github.com/go-ole/go-ole"
    "github.com/go-ole/go-ole/oleutil"
    _ "github.com/mattn/go-adodb"
)

func createMdb(f string) error {
    unk, err := oleutil.CreateObject("ADOX.Catalog")
    if err != nil {
        return err
    }
    cat, err := unk.QueryInterface(ole.IID_IDispatch)
    if err != nil {
        return err
    }
    _, err = oleutil.CallMethod(cat, "Create", "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="+f+";")
    if err != nil {
        return err
    }
    return nil
}

func main() {
    ole.CoInitialize(0)

    f := "./example.mdb"

    os.Remove(f)

    err := createMdb(f)
    if err != nil {
        fmt.Println("create mdb", err)
        return
    }

    db, err := sql.Open("adodb", "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="+f+";")
    if err != nil {
        fmt.Println("open", err)
        return
    }

    _, err = db.Exec("create table foo (id int not null primary key, name text not null, created datetime not null)")
    if err != nil {
        fmt.Println("create table", err)
        return
    }

    tx, err := db.Begin()
    if err != nil {
        fmt.Println(err)
        return
    }
    stmt, err := tx.Prepare("insert into foo(id, name, created) values(?, ?, ?)")
    if err != nil {
        fmt.Println("insert", err)
        return
    }
    defer stmt.Close()

    for i := 0; i < 100; i++ {
        _, err = stmt.Exec(i, fmt.Sprintf("こんにちわ世界%03d", i), time.Now())
        if err != nil {
            fmt.Println("exec", err)
            return
        }
    }
    tx.Commit()

    // First tried the query in a loop.
    // NOTE: For me this loop crashes after about 580 iterations with a "Cannot open any more tables" error.
    //   That's a separate problem I'm going to be following up if I can get the memory leak fixed.
    // for i := 1; ; i++ {
    //  fmt.Printf("%d.", i)
    //  err := queryRows(db)
    //  if err != nil {
    //      log.Print(err)
    //      break
    //  }
    //  time.Sleep(10 * time.Millisecond)
    // }

    // Next tried re-opening the DB for each query.
    // This version has the same memory leak, but doesn't crash (at least not in the first 6000+ iterations)
    db.Close()
    for i := 1; ; i++ {
        fmt.Printf("%d.", i)
        err = queryRowsClosingDB(f)
        if err != nil {
            log.Print(err)
            break
        }
        time.Sleep(10 * time.Millisecond)
    }
}

func queryRows(db *sql.DB) error {
    rows, err := db.Query("select id, name, created from foo")
    if err != nil {
        return fmt.Errorf("select error: %s", err)
    }
    defer rows.Close()

    for rows.Next() {
        var id int
        var name string
        var created time.Time
        err = rows.Scan(&id, &name, &created)
        if err != nil {
            return fmt.Errorf("scan error: %s", err)
        }
        //fmt.Println(id, name, created)
    }

    return nil
}

func queryRowsClosingDB(f string) error {
    db, err := sql.Open("adodb", "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="+f+";")
    if err != nil {
        return fmt.Errorf("open error: %s", err)
    }
    defer db.Close()

    return queryRows(db)
}
mattn commented 7 years ago

Anyone, could you please try some tests with removing this part?

https://github.com/mattn/go-adodb/blob/master/adodb.go#L61-L70

nevcook commented 7 years ago

I'm really glad you are still looking at this.

I wasn't sure exactly what you wanted to test, but I've taken a crack at it.

I first made another temporary change to that function, just to confirm that my modifications were being compiled and used. Then I tried commenting out the specified lines (i.e. the whole function) and re-ran the test routine I posted above for 6000 cycles.

Unfortunately it made no significant difference to the memory usage with or without that function commented.

mattn commented 7 years ago

Hmm, How about this?

diff --git a/adodb.go b/adodb.go
index a739d0a..0fd1ac6 100644
--- a/adodb.go
+++ b/adodb.go
@@ -116,6 +116,7 @@ func (c *AdodbConn) Close() error {
    if err != nil {
        return err
    }
+   c.db.Release()
    c.db = nil
    ole.CoUninitialize()
    return nil
@@ -297,6 +298,7 @@ func (rc *AdodbRows) Close() error {
    if err != nil {
        return err
    }
+   rc.rc.Release()
    return nil
 }
nevcook commented 7 years ago

That change made a difference, but was not the whole problem. Over a few iterations the average memory usage fell from around 75MB to around 67MB. (That was with the previous modification still in place.)

It's 6:30pm here and I'll be going out for the evening shortly, but I should still have time to try another round when I get back if you have any other thoughts.

I wish I understood more about how this package works so I could help more directly, but I'm new to Go, and the insides of how the sql database processing works are a little beyond my understanding at the moment.

mattn commented 7 years ago

Thank you. I fixed some. https://github.com/mattn/go-adodb/commit/09a39b6019f3c7e438a8a6f93e9ea17af186f598 Could you please try this again?

nevcook commented 7 years ago

Those changes have helped quite a lot.

Averaging over a few iterations, the memory usage for 6000 cycles of the query loop fell from about 75MB originally to around 45MB with the latest version. This is definitely on the right track.

But it's still steadily increasing the longer the loop runs. Before the loop starts (after creating the database) it's at 17MB. After 6000 cycles it's up to 45MB, and after 12000 cycles it's up to 64MB.

Extra information: In my test code (see above) I've been using the version of the query loop that opens and closes the database for every cycle, because (as noted in the comments), not closing the database after every loop caused the loop to fail after about 580 cycles with a "Cannot open any more tables" error.

Just as a test, I tried that version of the query loop again with the latest modifications. Unfortunately it still fails after 580 cycles with the same error, so there's obviously something important that's not being disposed of correctly when you dont explicitly close the database. Hopefully this might give you some idea about what might be going on.

chuyangyang commented 6 years ago

Has the problem been solved? I met the same problem and expected the problem to be solved.

ws6 commented 6 years ago

just use this driver - github.com/denisenkom/go-mssqldb if your mssql is not 2008R1. it is well tested. And also I have my querybuilder library can work with mssql, postgres and mysql with the same interface. it is light-weighted and simple to use. (sorry no docs, but you can easily extend it yourself) msi

ws6 commented 6 years ago

package mssql

import ( "testing"

"fmt"

_ "github.com/denisenkom/go-mssqldb"
"github.com/ws6/msi"

)

func getDb() (*msi.Msi, error) { connString := fmt.Sprintf("server=%s;database=%s;user id=%s;password=%s", "host", "dbname", "user", "password", ) return msi.NewDb("mssql", connString, "dbname", "") }

func TestLoad(t *testing.T) {

schema, err := getDb()

if err != nil {
    t.Fatal(err.Error())
}
defer schema.Close()

} func TestPlainQuery(t *testing.T) { schema, err := getDb() defer schema.Close() if err != nil { t.Fatal(err.Error()) } defer schema.Close()

projectTable := schema.GetTable("project")
if projectTable == nil {
    t.Fatal("no project table")
}
q, err := projectTable.FindQuery(map[string]interface{}{`id`: 123})
if err != nil {
    t.Fatal(err.Error())
}
t.Log(q)

}

func TestJoinQuery(t *testing.T) { schema, err := getDb() defer schema.Close() if err != nil { t.Fatal(err.Error()) } defer schema.Close()

sampleTable := schema.GetTable("sample')
if sampleTable == nil {
    t.Fatal("no sample table")
}
q, err := sampleTable.FindQuery(
    map[string]interface{}{"id": 123},
    map[string]interface{}{msi.POPULATES: []string{"project_id"}},
)
if err != nil {
    t.Fatal(err.Error())
}
t.Log(q)

countQuery, err := sampleTable.CountQuery(
    map[string]interface{}{"id": 123},
    map[string]interface{}{msi.POPULATES: []string{"project_id"}},
)
if err != nil {
    t.Fatal(err.Error())
}
t.Log(countQuery)
//Test CountQuery

page, err := sampleTable.GetPage(
    msi.M{},
    map[string]interface{}{msi.POPULATES: []string{`project_id`}},
) 

if err != nil {
    t.Fatal(err.Error())
}
t.Logf("%+v", page)
inserts := msi.M{"project_id": 1, "sample_name": "LP_1abacsd"}
if err := sampleTable.Insert(inserts); err != nil {
    t.Log(err.Error())
} else {
    t.Logf("inserted")
}

found, err := sampleTable.FindOne(inserts)
if err != nil {
    t.Fatal(err.Error())
}

t.Logf("%+v", found)

updates := msi.M{`project_id`: 2}
if err := sampleTable.Update(msi.M{"id": found["id"]}, updates); err != nil {
    t.Fatal(err.Error())
} else {
    t.Log("updated")
}
if err := sampleTable.Remove(msi.M{"id": found["id"]}); err != nil {
    t.Fatal(err.Error())
}

t.Log("removed")

}

nevcook commented 6 years ago

@ws6 : Thanks for trying, but your suggestion misses the point. Although the original question doesn't specifically say so (I didn't write that), we are only using ADO here because we are needing to access the data in msaccess .mdb files, not in a proper database like mssql. (Please see the example code above.) If we were using mssql there would be other options that would bypass the need for ADO, and if we had the option of moving this data into a proper database, we would have already done that long ago.

ws6 commented 6 years ago

sorry I missed your point. I suggest to either export msaccess to other datastorage or using some specific drivers.

https://github.com/bennof/accessDBwE ps: I just googled, not tested.

nevcook commented 6 years ago

@ws6: I wish it was that easy. bennof/accessDBwE is just a small library that uses mattn/go-adodb as it's driver. I've tried all other drivers that I could find (as of 6 months ago). As I said above, if we had the option of moving the data, we would have done that long ago. The data is an integral part of another application that we have no control over.

ws6 commented 6 years ago

If slowness is not a big concern, and you still want to use this lib, my last suggestion is : Build a binary using this lib. use another thread to call the binary query 6k times and kill it. pass your query from either command line args or stdin pipe? this will definitely IO and CPU/Mem insufficient way.

mattn commented 6 years ago

I updated adodb.go. Anyone, Could you please try again?

jessehl commented 3 years ago

I'm facing issues with leaking memory, too. I'm not 100% sure if it relates to this thread, but I'm posting it here anyway in case it is of any help.

In my case, I'm using this package to query data from a DB2 database (on an AS400 / iSeries). I'm using Microsoft's OLE DB Provider, and I'm running this package on a Windows machine. My connection string (parts of it) looks like this:

connectionString = "Provider=DB2OLEDB;Network Transport Library=TCPIP;" 
db, err = sql.Open("adodb", connectionString)
...

In my case, memory leaks occur only for BLOB-columns (although I haven't verified VARBINARY, I can assure that no leaks occur for strings, ints, floats, times, dates). The issue is over here:

case 205: // ADLONGVARBINARY
    sa := (*ole.SafeArray)(unsafe.Pointer(uintptr(val.Val)))
    conv := &ole.SafeArrayConversion{sa}
    elems, err := conv.TotalElements(0)

The memory leak disappears when I finish this case statement with:

conv.Release()

Without this call, the SafeArray is never destroyed - i.e. this line in go-ole is never called.

I hope this is of any help!

Now, some side-info (just for completeness' sake). The whole story is that the OLE-DB connector I'm using is not working as I expect it to for BLOBs in the first place. elems, err := conv.TotalElements(0) always returns approx. 32k, regardless of the actual size of the field. This naturally leads to errors when processing BLOBs that are smaller than 32k, as the connector assumes that it can fetch 32k bytes from memory that are not actually there (nil pointer dereference). This may also be related to this issue, although I guess in my case it's definitley an issue in Microsoft's connector, not in this package. I've managed to work around these issues (don't ask 😭). In addition, I've had to call oleutil.CallMethod(field, "GetChunk", <size>) to ensure that all bytes of the BLOB were received on the client, when more than 32k bytes are in the field (see: here).

Thanks for open-sourcing this!