Closed Bryant-Yang closed 10 months ago
Interesting, the hooks should be generated only once in any case. Would be curious if you can get a small repro?
I tried getting a repro:
from attrs import define
from cattrs import GenConverter
c = GenConverter()
@define
class Test:
a: int
b: int
c: int
while True:
c.structure({"a": 1, "b": 2, "c": 3}, Test)
but it doesn't leak.
This issue is found in one of my test server which has little memory, in a scheduled task loop which cause memory usage too high, had to reboot after running 2 days.
I have made my converter usage as singleton. It's just not a clear constraint that this converter object could not be gc collected in some situations.
I can give a simple test as below:
import gc
from datetime import datetime
from typing import List
import cattr
import time
from attr import attrs
def create_converter(detailed_validation):
converter = cattr.GenConverter(detailed_validation=detailed_validation)
converter.register_structure_hook(datetime,
lambda val, _: datetime.strptime(val, '%Y-%m-%d %H:%M:%S')
if val is not None else None)
return converter
@attrs(auto_attribs=True)
class TestData:
@attrs(auto_attribs=True)
class InfoList:
name: str
name: str
dt: datetime
info_list: List[InfoList]
def test(detailed_validation):
converter = create_converter(detailed_validation)
data = {'name': "testAAA", "dt": datetime.now().strftime('%Y-%m-%d %H:%M:%S'), "info_list": [{'name': "testAAA"}] * 100}
for _ in range(1000):
converter.structure(data, TestData)
if __name__ == '__main__':
import os
import psutil
process = psutil.Process(os.getpid())
while True:
test(detailed_validation=True) # memory usage increasing
# test(detailed_validation=False) # memory usage not increasing
gc.collect()
time.sleep(.1)
print(f'mem info: {process.memory_info().rss}')
Interesting, the hooks should be generated only once in any case. Would be curious if you can get a small repro?
Ah, you seem to be creating a converter in a loop. I strongly advise not doing that, not only with cattrs, but also with libraries like attrs and dataclasses. All of these libraries have heavy initialization steps (including code generation) to make them faster afterwards; if you're constantly creating new converters you're paying the price and getting no benefit.
This is probably a leak somewhere in linecache, yeah. The hooks are generated once per converter. I can take a look when I have spare cycles, but due to what I said in the first paragraph I think this is lower prio.
yes,don't create instance repeatedly if not necessary,agree that.
This should actually already be fixed by #464
while True:
creating GenConverter instance with all default params.
the memory usage of process keep increasing.
If set detailed_validation = False, there is no memory leak. I'm not sure if it cost by line cache.