Minimal example. RAM usage grows by about 100 KB/s on my machine:

import gdxpds
import pandas as pd

SIZE = 10

def make_big():
    y = [("a", i) for i in range(SIZE)]
    x = pd.DataFrame(y, columns=["i", "value"])
    gdxpds.to_gdx({"x": x}, "big.gdx")

def repeated_read():
    while True:
        x = gdxpds.to_dataframe("big.gdx", "x")

make_big()
repeated_read()

I looked into this using objgraph and found that atexit.register was holding on to the cleanup callback, which in turn keeps the GDX object and its associated symbols alive:

import gdxpds
import pandas as pd
import objgraph

SIZE = 10

def make_big():
    y = [("a", i) for i in range(SIZE)]
    x = pd.DataFrame(y, columns=["i", "value"])
    gdxpds.to_gdx({"x": x}, "big.gdx")

def repeated_read():
    x = gdxpds.to_dataframe("big.gdx", "x")
    objgraph.show_growth(limit=20)
    for _ in range(10):
        x = gdxpds.to_dataframe("big.gdx", "x")
    objgraph.show_growth(limit=20)
    obj = objgraph.by_type("GdxFile")[10]
    objgraph.show_backrefs(obj, max_depth=10)

make_big()
repeated_read()

This produces a graph showing why a particular Python object (here a GdxFile) is still being kept alive.
Commenting out the atexit.register call in Gdx prevented the memory leak in the first example.
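
The retention mechanism can be reproduced without gdxpds. The sketch below is not gdxpds code: `Resource` and `is_alive_after_del` are hypothetical names, and it assumes the cleanup is registered as a bound method, so that atexit's callback list holds a strong reference to the instance. It also assumes CPython, where an object is freed as soon as its last reference goes away.

```python
import atexit
import gc
import weakref


class Resource:
    """Hypothetical stand-in for an object that registers its own cleanup."""

    def __init__(self):
        # Registering a bound method stores a strong reference to `self`
        # in atexit's callback list until the interpreter exits.
        atexit.register(self.cleanup)

    def cleanup(self):
        pass


def is_alive_after_del(unregister_first):
    """Return True if the object survives `del` and a garbage collection."""
    obj = Resource()
    ref = weakref.ref(obj)
    if unregister_first:
        # Dropping the callback removes atexit's reference to the object.
        atexit.unregister(obj.cleanup)
    del obj
    gc.collect()
    return ref() is not None


leaked = is_alive_after_del(unregister_first=False)
freed = not is_alive_after_del(unregister_first=True)
print(leaked, freed)
```

If the assumption holds, the first call reports the object as still alive and the second reports it as freed, which would match the observation that removing the atexit.register call stops the growth.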