HIT-SCIR / ltp

Language Technology Platform
http://ltp.ai
4.98k stars 1.04k forks

Memory leak in LTP.pipeline #623

Closed sxwxs closed 1 year ago

sxwxs commented 1 year ago

This code causes memory usage to grow continuously.

from ltp import LTP
ltp = LTP("LTP/legacy")

for _ in range(10000):
    ltp.pipeline(['他叫汤姆去拿外衣。', '台湾是中国领土不可分割的一部分。'])

However, the following code, which passes one sentence per call instead of a list, works fine.

for _ in range(10000):
    ltp.pipeline('他叫汤姆去拿外衣。')
    ltp.pipeline('台湾是中国领土不可分割的一部分。')

OS: Windows 10
Python build: Python 3.9.4 (tags/v3.9.4:1f2e308, Apr 4 2021, 13:27:16) [MSC v.1928 64 bit (AMD64)] on win32
ltp.__version__: '4.2.11.post2'
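To put numbers on the growth described above, a small helper along these lines may be useful. It is only a sketch: `measure_growth` is a hypothetical name, not part of LTP, and it relies on the standard-library `tracemalloc` and `threading` modules. Both see only Python-level allocations and Python-created threads, so memory or threads allocated inside a native extension would need an OS-level tool such as Task Manager to observe.

```python
import threading
import tracemalloc


def measure_growth(fn, iterations=100):
    """Call fn() repeatedly and report how much state accumulates.

    Returns (heap_growth_bytes, thread_growth), where heap growth covers
    Python-level allocations only and thread growth covers only threads
    started through the threading module.
    """
    tracemalloc.start()
    mem_before, _peak = tracemalloc.get_traced_memory()
    threads_before = threading.active_count()

    for _ in range(iterations):
        fn()

    mem_after, _peak = tracemalloc.get_traced_memory()
    threads_after = threading.active_count()
    tracemalloc.stop()

    return mem_after - mem_before, threads_after - threads_before


# Assumed usage with the snippets from this report (requires ltp installed):
#   from ltp import LTP
#   ltp = LTP("LTP/legacy")
#   print(measure_growth(lambda: ltp.pipeline(['他叫汤姆去拿外衣。'])))
```

Running it once with the list input and once with the single-string input would let the two cases be compared directly, with the caveat about native allocations noted above.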

AlongWY commented 1 year ago

After testing, I found that repeatedly calling ltp.pipeline(['他叫汤姆去拿外衣。', "台湾是中国领土不可分割的一部分。"]) causes threads to be created over and over, which consumes resources.
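As a general illustration of the pattern described in this comment, and not LTP's actual implementation, the sketch below contrasts creating a worker pool on every call with reusing one shared pool. The function names and the `len` workload are hypothetical stand-ins. The per-call version here cleans up after itself, so it shows only the repeated thread-creation cost, not the leak itself.

```python
from concurrent.futures import ThreadPoolExecutor


def process_with_new_pool(batch):
    """Spin up fresh worker threads for every batch, then tear them down."""
    with ThreadPoolExecutor(max_workers=len(batch)) as pool:
        return list(pool.map(len, batch))


# One pool created at import time and reused by every call.
_SHARED_POOL = ThreadPoolExecutor(max_workers=2, thread_name_prefix="shared")


def process_with_shared_pool(batch):
    """Reuse the same worker threads no matter how many batches arrive."""
    return list(_SHARED_POOL.map(len, batch))
```

With the shared pool, calling the function 10000 times never starts more than two worker threads, whereas the per-call version starts new threads on each of the 10000 calls.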