yahoojapan / NGT

Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data
Apache License 2.0

file descriptor leak on `index.build_index` #140

Closed drbh closed 1 year ago

drbh commented 1 year ago

Calling index.build_index repeatedly seems to cause a file descriptor leak.

When insert and build_index are called many times (~120), some file is never closed, and this eventually causes ngt to crash. Below is the output of a small test that inserts a vector and rebuilds the index on every iteration.

...

insert:     113
num_files_open: 292

insert:     114
num_files_open: 294

insert:     115
num_files_open: 296
Traceback (most recent call last):
  File "/Users/drbh/Projects/ngt-rs/pytmp/main.py", line 31, in <module>
    num_files_open = get_number_of_open_files_by_pid(pid)
  File "/Users/drbh/Projects/ngt-rs/pytmp/main.py", line 8, in get_number_of_open_files_by_pid
    stream = os.popen(cmd)
  File "/usr/local/Cellar/python@3.9/3.9.17/Frameworks/Python.framework/Versions/3.9/lib/python3.9/os.py", line 983, in popen
    proc = subprocess.Popen(cmd,
  File "/usr/local/Cellar/python@3.9/3.9.17/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/Cellar/python@3.9/3.9.17/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 1736, in _execute_child
    errpipe_read, errpipe_write = os.pipe()
OSError: [Errno 24] Too many open files
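
Errno 24 (EMFILE) means the process has reached its per-process limit on open file descriptors. As a minimal sketch, assuming a Unix-like system where Python's standard resource module is available, the limit can be inspected like this (the helper name fd_limits is made up for illustration and is not part of ngtpy):

```python
import errno
import os
import resource


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = fd_limits()
print("soft limit:", soft)
print("hard limit:", hard)

# The symbolic name and message behind the error shown in the traceback.
print("EMFILE:", errno.EMFILE, os.strerror(errno.EMFILE))
```

The soft limit is the one that triggers the error. Note that lsof also lists entries that are not descriptors (current directory, mapped libraries), so its line count can be higher than the number of descriptors actually in use.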

This can be reproduced with the following script:

import ngtpy
import random
import os
import sys

def get_number_of_open_files_by_pid(pid):
    cmd = "lsof -p " + str(pid) + " | wc -l"
    stream = os.popen(cmd)
    output = stream.read()
    return int(output)

dim = 10
nb = 1_000
vectors = [[random.random() for _ in range(dim)] for _ in range(nb)]

ngtpy.create(b"tmp", dim)
index = ngtpy.Index(b"tmp")

# print current process pid
pid = os.getpid()
print("pid: " + str(pid))

for i in range(0, nb):
    # do insert build_index save
    index.insert(vectors[i])
    index.build_index()
    index.save()

    # print number of open files
    num_files_open = get_number_of_open_files_by_pid(pid)
    print("\ninsert:\t\t" + str(i))
    print("num_files_open:\t" + str(num_files_open))

This can also be reproduced in ngt-rs: https://github.com/lerouxrgd/ngt-rs/pull/12

I believe this is a file descriptor leak because running lsof after each build_index call shows a growing number of open files. On further inspection, these files are all /dev/null; however, I am not sure where the file is opened or why it is not closed.
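
The lsof check can also be done from inside the process, without spawning a subprocess (which itself needs free descriptors, and is where the traceback above fails). The following is only a sketch under stated assumptions: a Unix-like system, descriptors below an arbitrary probe limit of 4096, and helper names (open_fds, count_dev_null_fds) invented for illustration. It probes descriptors with os.fstat and counts those that refer to the same character device as /dev/null.

```python
import os
import stat


def open_fds(limit=4096):
    """Return the descriptors below `limit` that are currently open."""
    fds = []
    for fd in range(limit):
        try:
            os.fstat(fd)  # raises OSError (EBADF) if fd is not open
        except OSError:
            continue
        fds.append(fd)
    return fds


def count_dev_null_fds(limit=4096):
    """Count open descriptors that point at the null device."""
    null = os.stat(os.devnull)
    count = 0
    for fd in open_fds(limit):
        st = os.fstat(fd)
        # Same character device as /dev/null means the fd refers to it.
        if stat.S_ISCHR(st.st_mode) and st.st_rdev == null.st_rdev:
            count += 1
    return count


print("open descriptors:", len(open_fds()))
print("pointing at null device:", count_dev_null_fds())
```

Calling count_dev_null_fds() after each build_index in the reproduction loop would show whether the count grows by a fixed amount per call, which is the pattern the lsof output suggests.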

Please let me know if I can provide any more information! Thank you for the awesome project!

masajiro commented 1 year ago

I have released v2.0.14 which fixes this issue. I really appreciate your sample source code, which made it easy for me to find the cause of this issue.

BTW, I know this sample source code is meant to reproduce the issue, but just to be sure: there is no need to call build_index and save after every insertion. Both functions can be called just once, after all insertions are done, as follows.

for i in range(0, nb):
    index.insert(vectors[i])
index.build_index()
index.save()

drbh commented 1 year ago

@masajiro wow thank you so much for the quick response and fix, the new release completely resolved my issue 🙏

Thanks again for the great project!