sbinet / go-python

naive go bindings to the CPython2 C-API

Linux RSS infinitely increasing though go heap is empty #41

Closed mpoornima closed 8 years ago

mpoornima commented 8 years ago

Hi,

I'm not sure exactly what is happening here, but the resident memory (RSS) of the Go process never seems to be freed even though there is nothing in the Go heap. Here is some sample code:

package main

import python "github.com/sbinet/go-python"

func main() {
    initPython()

    for {
        //Infinite loop, make some data and create python dict
        message := createMessage()
        messageDict := getPyDict(message)
        messageDict.Clear()
    }
}

func createMessage() map[string]interface{} {
    payload := make(map[string]interface{})
    payload["key_1"] = "value_1"
    payload["key_2"] = "value_2"
    payload["key_3"] = "value_3"
    return payload
}

func getPyDict(message map[string]interface{}) *python.PyObject {
    messageDict := python.PyDict_New()
    for key, value := range message {
        pyKey := python.PyString_FromString(key)
        pyValue := python.PyString_FromString(value.(string))
        python.PyDict_SetItem(messageDict, pyKey, pyValue)
    }

    return messageDict
}

func initPython() {
    //Init Python
    if err := python.Initialize(); err != nil {
        panic(err)
    }
}

The above program reaches 1 GB of RSS within a minute, and memory usage keeps increasing without ever going down (the Go heap is 0 MB). Here are my local box details:

uname -a

Linux AE-LP-059 4.2.0-25-generic #30-Ubuntu SMP Mon Jan 18 12:31:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

go version

go version go1.6 linux/amd64

go env

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/mpoornima/Work/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GO15VENDOREXPERIMENT="1"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"
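
A note on the measurement itself: memory that CPython allocates through cgo is not managed by the Go runtime, so it does not appear in the Go heap statistics even though it counts toward the process RSS. The sketch below is one way to print both numbers side by side. It assumes Linux (it reads /proc/self/status) and a Go toolchain recent enough to provide os.ReadFile, which is newer than the go1.6 shown above. parseVmRSS is a hypothetical helper written for this illustration, not part of go-python.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// parseVmRSS extracts the VmRSS value (in kB) from the text of
// /proc/<pid>/status. It returns false if the field is missing or malformed.
func parseVmRSS(status string) (uint64, bool) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Expected shape of the line: "VmRSS:   12345 kB"
		if len(fields) >= 2 && fields[0] == "VmRSS:" {
			kb, err := strconv.ParseUint(fields[1], 10, 64)
			return kb, err == nil
		}
	}
	return 0, false
}

func main() {
	// Heap bytes currently allocated by the Go runtime only.
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	fmt.Printf("go heap in use: %d kB\n", ms.HeapAlloc/1024)

	// Resident set size of the whole process, including C allocations.
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("RSS unavailable:", err)
		return
	}
	if kb, ok := parseVmRSS(string(data)); ok {
		fmt.Printf("process RSS:    %d kB\n", kb)
	}
}
```

Calling the body of main periodically from inside the loop would show whether RSS grows while the Go heap stays flat, which points at memory held on the C side.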

sbinet commented 8 years ago

messageDict.Clear() will clear the content of the dict (i.e. eventually remove its entries), but memory will still be retained by the backing structure of the dict. you need to call PyObject.DecRef() on it:

func main() {
    initPython()

    for {
        //Infinite loop, make some data and create python dict
        message := createMessage()
        messageDict := getPyDict(message)
        messageDict.Clear()
        messageDict.DecRef()
    }
}

that's because PyDict_New() returns a new reference that is owned by the calling code (and the calling code should take care of disposing of it after use).

see: https://docs.python.org/2/c-api/intro.html#objects-types-and-reference-counts

hth, -s

(feel free to reopen if something's astray)

mpoornima commented 8 years ago

Nope, that made no difference. Still the same, even after enabling the Python GC manually via the gc module (https://docs.python.org/2/library/gc.html) by calling gc.enable().

sbinet commented 8 years ago

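you're right. there are 2 other errors, leading to memory leaks. the program below fixes them: pyKey and pyValue now get a DecRef() after PyDict_SetItem() (which does not steal references, so the calling code still owns both), and the dict is emptied with python.PyDict_Clear(messageDict) instead of messageDict.Clear().

with these modifications, the following program: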

package main

import python "github.com/sbinet/go-python"

func main() {
    initPython()

    for {
        //Infinite loop, make some data and create python dict
        message := createMessage()
        messageDict := getPyDict(message)
        python.PyDict_Clear(messageDict)
        messageDict.DecRef()
    }
}

func createMessage() map[string]interface{} {
    payload := make(map[string]interface{})
    payload["key_1"] = "value_1"
    payload["key_2"] = "value_2"
    payload["key_3"] = "value_3"
    return payload
}

func getPyDict(message map[string]interface{}) *python.PyObject {
    messageDict := python.PyDict_New()
    for key, value := range message {
        pyKey := python.PyString_FromString(key)
        pyValue := python.PyString_FromString(value.(string))
        python.PyDict_SetItem(messageDict, pyKey, pyValue)
        pyKey.DecRef()
        pyValue.DecRef()
    }

    return messageDict
}

func initPython() {
    //Init Python
    if err := python.Initialize(); err != nil {
        panic(err)
    }
}

stabilizes at 12.5m RSS on my 64-bit linux machine.
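
To illustrate why the DecRef() calls on pyKey and pyValue matter, here is a toy reference-counting model in plain Go. It does not use go-python or CPython; the object type, setItem and leftoverRefs are hypothetical names made up for this sketch. The only behaviour it borrows is that an insert such as PyDict_SetItem takes its own references rather than stealing the caller's.

```go
package main

import "fmt"

// object is a toy reference-counted value (not a real PyObject).
type object struct {
	refs     int
	children []*object // references this object holds as a container
}

// newObject returns an object whose single reference is owned by the caller.
func newObject() *object { return &object{refs: 1} }

func (o *object) incRef() { o.refs++ }

// decRef drops one reference. At zero the object counts as freed and
// releases the references it holds on its children.
func (o *object) decRef() {
	o.refs--
	if o.refs == 0 {
		for _, c := range o.children {
			c.decRef()
		}
		o.children = nil
	}
}

// setItem mimics a non-stealing insert: the container takes its own
// references, so the caller's references to key and value stay valid.
func (o *object) setItem(key, value *object) {
	key.incRef()
	value.incRef()
	o.children = append(o.children, key, value)
}

// leftoverRefs builds a one-entry dict, releases the dict, and reports the
// reference counts left on the key and value. Anything above zero is a leak.
func leftoverRefs(releaseItems bool) (keyRefs, valueRefs int) {
	d := newObject()
	k, v := newObject(), newObject()
	d.setItem(k, v)
	if releaseItems {
		k.decRef() // give up the caller's reference to the key
		v.decRef() // give up the caller's reference to the value
	}
	d.decRef() // give up the caller's reference to the dict
	return k.refs, v.refs
}

func main() {
	k, v := leftoverRefs(false)
	fmt.Println("without DecRef on key/value:", k, v) // 1 1
	k, v = leftoverRefs(true)
	fmt.Println("with DecRef on key/value:", k, v) // 0 0
}
```

In the first case the dict itself is released, but each key and value keeps one reference that nothing will ever drop, which matches the steady growth seen in the original program.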

mpoornima commented 8 years ago

Cool, it works! Thanks a lot, I have been struggling with this for a while. I see that there is no function corresponding to PyDict_Clear() for Lists or Tuples. Does that mean that Clear() and a subsequent DecRef() on the objects would be sufficient?

sbinet commented 8 years ago

yes.

mpoornima commented 8 years ago

I noticed that Clear() in fact calls DecRef() on the object after setting the pointer to null. So calling Clear() was enough for all other objects, except for the dict, where I had to make an extra call to PyDict_Clear().
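
The behaviour described in this last comment can be sketched with a toy model in plain Go. The handle and object types are hypothetical and only imitate the order of operations (forget the pointer, then drop the reference); they are not go-python's types. The model also assumes that a DecRef on an already-cleared handle is a harmless no-op.

```go
package main

import "fmt"

// object is a toy reference-counted value (not a real PyObject).
type object struct{ refs int }

// handle is a toy wrapper around a pointer to an object.
type handle struct{ ptr *object }

// decRef drops one reference; a nil pointer is ignored (an assumption
// of this sketch).
func (h *handle) decRef() {
	if h.ptr != nil {
		h.ptr.refs--
	}
}

// clear follows the order described above: the handle forgets the
// pointer first, then the reference is dropped through a temporary.
func (h *handle) clear() {
	tmp := handle{ptr: h.ptr}
	h.ptr = nil
	tmp.decRef()
}

func main() {
	obj := &object{refs: 1}
	h := &handle{ptr: obj}

	h.clear()
	fmt.Println(obj.refs, h.ptr == nil) // 0 true

	h.decRef()            // no-op: the handle no longer points at anything
	fmt.Println(obj.refs) // 0
}
```

In this model a clear() already releases the caller's reference, so a later decRef() on the same handle changes nothing.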