WojciechMula / pyahocorasick

Python module (C extension and plain python) implementing Aho-Corasick algorithm
BSD 3-Clause "New" or "Revised" License

memory leak #73

Closed hxsnow10 closed 6 years ago

hxsnow10 commented 6 years ago

My Python version is 2.7.5, and ahocorasick.so was copied from git.

#encoding=utf-8
import ahocorasick

class AC(object):

    def __init__(self, keys):
        self.keys=keys
        A = ahocorasick.Automaton()
        for key in keys:
            A.add_word(key, key)
        A.make_automaton()
        self.A=A

    def get_locss(self, text):
        if not self.keys: return []
        locss=[[key,end_idx-len(key)+1,end_idx+1] for end_idx, key in self.A.iter(text)]
        return locss

if __name__=="__main__":
    a=AC(["你","wo "])
    s="你妹啊。。。。"
    for i in range(100000000000):
        a.get_locss(s)

When I run this script, memory usage keeps growing:

[root@bgsbtsp0050-hw thrift]# ps aux|grep AC
root     135953  116  9.7 781375308 12795560 pts/2 R+ 14:56   0:03 python AC.py
root     135955  0.0  0.0 110212   876 pts/3    S+   14:56   0:00 grep --color=auto AC
[root@bgsbtsp0050-hw thrift]# ps aux|grep AC
root     135953  117 12.5 781375308 16506536 pts/2 R+ 14:56   0:04 python AC.py
root     135957  0.0  0.0 110212   880 pts/3    S+   14:56   0:00 grep --color=auto AC
[root@bgsbtsp0050-hw thrift]# ps aux|grep AC
root     135953  114 14.2 781375308 18790056 pts/2 R+ 14:56   0:05 python AC.py
root     135959  0.0  0.0 110212   880 pts/3    S+   14:56   0:00 grep --color=auto AC
[root@bgsbtsp0050-hw thrift]# ps aux|grep AC
root     135953  109 15.6 781375308 20646556 pts/2 R+ 14:56   0:06 python AC.py
root     135961  0.0  0.0 110212   880 pts/3    S+   14:56   0:00 grep --color=auto AC
[root@bgsbtsp0050-hw thrift]# ps aux|grep AC
root     135953  105 16.1 781375308 21227176 pts/2 R+ 14:56   0:07 python AC.py
root     135963  0.0  0.0 110212   880 pts/3    S+   14:56   0:00 grep --color=auto AC
[root@bgsbtsp0050-hw thrift]# ps aux|grep AC
root     135953  107 16.9 781375308 22333936 pts/2 R+ 14:56   0:08 python AC.py
root     135965  0.0  0.0 110212   880 pts/3    S+   14:56   0:00 grep --color=auto AC
WojciechMula commented 6 years ago

@hxsnow10 To check whether there is actually a memory leak, you should run Python's garbage collector.
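
A minimal sketch of such a check, using only the standard library. The helper names `peak_rss` and `rss_growth` are hypothetical and not part of pyahocorasick; the idea is to force a collection with `gc.collect()` before and after a bounded number of calls and compare the process's peak resident memory. One thing worth ruling out first, as an assumption that the thread does not confirm: on Python 2.7, `range(n)` builds a complete list of `n` integers, so `range(100000000000)` may by itself account for the growth shown by `ps`; `xrange` (or Python 3's `range`) iterates lazily.

```python
import gc
import resource


def peak_rss():
    """Peak resident set size of this process.

    Units are platform dependent: kilobytes on Linux, bytes on macOS.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def rss_growth(func, iterations=100000):
    """Call func() `iterations` times and return the increase in peak RSS.

    The garbage collector is run before and after the loop so that
    collectable cycles are not mistaken for a leak.
    """
    gc.collect()
    before = peak_rss()
    # On Python 3, range() is lazy; on Python 2, use xrange() here instead.
    for _ in range(iterations):
        func()
    gc.collect()
    return peak_rss() - before
```

With the script from the report, one could call `rss_growth(lambda: a.get_locss(s))` for a few different iteration counts. Growth that stays roughly flat as the count increases suggests there is no leak in the call itself; growth that scales with the count points to a real leak. Because the value is a peak, it never decreases, so it is only useful for spotting growth.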