harsha-simhadri / big-ann-benchmarks

Framework for evaluating ANNS algorithms on billion scale datasets.
https://big-ann-benchmarks.com
MIT License

Unable to run filter track #290

Closed yudhiesh closed 2 months ago

yudhiesh commented 2 months ago

I am trying to leverage this framework to benchmark other Vector Databases for my own understanding. I am trying to run the filter track in order to understand the data flow for new custom databases but I am running into issues.

Steps to reproduce:

  1. python create_dataset.py --dataset yfcc-10M
  2. python install.py --neurips23track filter --algorithm faiss
  3. python run.py --algorithm faiss --neurips23track filter --dataset yfcc-10M
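
(For reference, the cropped base file produced by step 1 can be sanity-checked before running. The snippet below is only a sketch and assumes the standard big-ann .u8bin layout: a little-endian uint32 point count and uint32 dimension header, followed by the uint8 vectors.)

import numpy as np

# Read only the 8-byte header of the cropped yfcc-10M base file.
path = "data/yfcc100M/base.10M.u8bin.crop_nb_10000000"
with open(path, "rb") as f:
    n_points, n_dims = np.fromfile(f, dtype=np.uint32, count=2)
print(n_points, n_dims)  # expect n_points == 10_000_000 for yfcc-10M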

It fails at step 3 with the following error logs:

yfcc-10M
file data/yfcc100M/query.public.100K.u8bin already exists
file data/yfcc100M/GT.public.ibin already exists
file data/yfcc100M/query.private.2727415019.100K.u8bin already exists
file data/yfcc100M/GT.private.2727415019.ibin already exists
file data/yfcc100M/base.10M.u8bin.crop_nb_10000000 already exists
file data/yfcc100M/base.metadata.10M.spmat already exists
file data/yfcc100M/query.metadata.public.100K.spmat already exists
file data/yfcc100M/query.metadata.private.2727415019.100K.spmat already exists
2024-04-26 11:13:28,209 - annb - INFO - running only faiss
2024-04-26 11:13:28,387 - annb - INFO - Order: [Definition(algorithm='faiss', constructor='FAISS', module='neurips23.filter.faiss.faiss', docker_tag='neurips23-filter-faiss', docker_volumes=[], arguments=['euclidean', {'indexkey': 'IVF16384,SQ8', 'binarysig': True, 'threads': 16}], query_argument_groups=[[{'nprobe': 1, 'mt_threshold': 0.0003}], [{'nprobe': 4, 'mt_threshold': 0.0003}], [{'nprobe': 16, 'mt_threshold': 0.0003}], [{'nprobe': 32, 'mt_threshold': 0.0003}], [{'nprobe': 64, 'mt_threshold': 0.0003}], [{'nprobe': 96, 'mt_threshold': 0.0003}], [{'nprobe': 1, 'mt_threshold': 0.0001}], [{'nprobe': 4, 'mt_threshold': 0.0001}], [{'nprobe': 16, 'mt_threshold': 0.0001}], [{'nprobe': 32, 'mt_threshold': 0.0001}], [{'nprobe': 64, 'mt_threshold': 0.0001}], [{'nprobe': 96, 'mt_threshold': 0.0001}], [{'nprobe': 1, 'mt_threshold': 0.01}], [{'nprobe': 4, 'mt_threshold': 0.01}], [{'nprobe': 16, 'mt_threshold': 0.01}], [{'nprobe': 32, 'mt_threshold': 0.01}], [{'nprobe': 64, 'mt_threshold': 0.01}], [{'nprobe': 96, 'mt_threshold': 0.01}]], disabled=False)]
RW Namespace(dataset='yfcc-10M', count=10, definitions='algos-2021.yaml', algorithm='faiss', docker_tag=None, list_algorithms=False, force=False, rebuild=False, runs=5, timeout=43200, max_n_algorithms=-1, power_capture='', t3=False, nodocker=False, upload_index=False, download_index=False, blob_prefix=None, sas_string=None, private_query=False, neurips23track='filter', runbook_path='neurips23/streaming/simple_runbook.yaml')
Setting container wait timeout to 43200 seconds
2024-04-26 11:13:30,375 - annb.aa3f2151e8d1 - INFO - Created container aa3f2151e8d1: CPU limit 0-11, mem limit 5662968576, timeout 43200, command ['--dataset', 'yfcc-10M', '--algorithm', 'faiss', '--module', 'neurips23.filter.faiss.faiss', '--constructor', 'FAISS', '--runs', '5', '--count', '10', '--neurips23track', 'filter', '["euclidean", {"indexkey": "IVF16384,SQ8", "binarysig": true, "threads": 16}]', '[{"nprobe": 1, "mt_threshold": 0.0003}]', '[{"nprobe": 4, "mt_threshold": 0.0003}]', '[{"nprobe": 16, "mt_threshold": 0.0003}]', '[{"nprobe": 32, "mt_threshold": 0.0003}]', '[{"nprobe": 64, "mt_threshold": 0.0003}]', '[{"nprobe": 96, "mt_threshold": 0.0003}]', '[{"nprobe": 1, "mt_threshold": 0.0001}]', '[{"nprobe": 4, "mt_threshold": 0.0001}]', '[{"nprobe": 16, "mt_threshold": 0.0001}]', '[{"nprobe": 32, "mt_threshold": 0.0001}]', '[{"nprobe": 64, "mt_threshold": 0.0001}]', '[{"nprobe": 96, "mt_threshold": 0.0001}]', '[{"nprobe": 1, "mt_threshold": 0.01}]', '[{"nprobe": 4, "mt_threshold": 0.01}]', '[{"nprobe": 16, "mt_threshold": 0.01}]', '[{"nprobe": 32, "mt_threshold": 0.01}]', '[{"nprobe": 64, "mt_threshold": 0.01}]', '[{"nprobe": 96, "mt_threshold": 0.01}]']
2024-04-26 11:13:38,102 - annb.aa3f2151e8d1 - INFO - ['euclidean', {'indexkey': 'IVF16384,SQ8', 'binarysig': True, 'threads': 16}]
2024-04-26 11:13:38,103 - annb.aa3f2151e8d1 - INFO - Trying to instantiate neurips23.filter.faiss.faiss.FAISS(['euclidean', {'indexkey': 'IVF16384,SQ8', 'binarysig': True, 'threads': 16}])
2024-04-26 11:13:38,485 - annb.aa3f2151e8d1 - INFO - {'indexkey': 'IVF16384,SQ8', 'binarysig': True, 'threads': 16}
2024-04-26 11:13:38,485 - annb.aa3f2151e8d1 - INFO - Running faiss on yfcc-10M
2024-04-26 11:13:38,485 - annb.aa3f2151e8d1 - INFO - preparing binary signatures
2024-04-26 11:14:29,903 - annb.aa3f2151e8d1 - INFO - writing to data/yfcc-10M.IVF16384,SQ8.binarysig
2024-04-26 11:14:33,352 - annb.aa3f2151e8d1 - INFO - train
2024-04-26 11:14:37,448 - annb.aa3f2151e8d1 - ERROR - ['euclidean', {'indexkey': 'IVF16384,SQ8', 'binarysig': True, 'threads': 16}]
Trying to instantiate neurips23.filter.faiss.faiss.FAISS(['euclidean', {'indexkey': 'IVF16384,SQ8', 'binarysig': True, 'threads': 16}])
{'indexkey': 'IVF16384,SQ8', 'binarysig': True, 'threads': 16}
Running faiss on yfcc-10M
preparing binary signatures
writing to data/yfcc-10M.IVF16384,SQ8.binarysig
train

2024-04-26 11:14:37,448 - annb.aa3f2151e8d1 - ERROR - Child process for container aa3f2151e8d1 returned exit code 137 with message

Any idea what I am missing here?

yudhiesh commented 2 months ago

Exit code 137 indicates the process was killed for running out of memory. I tried increasing the memory available to Docker to 16 GB and 12 CPUs, which still throws the same error. So, just to get things running, I tried reducing the number of vectors loaded for training from 10M to 100K by changing the code here:

# Train on only the first 100K vectors instead of the full 10M
xb = ds.get_dataset()[:100_000, :]
print(xb.shape)  # confirm the slice actually took effect
print("train")
index.train(xb)

But I noticed that this change neither fixes the error nor prints the shape of the dataset, which would confirm the edit is being picked up. Diving into the Dockerfile for faiss, I do not see the faiss.py file being copied anywhere; the base neurips23 Docker image does not copy the code over either.
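
As a side note, one way to double-check that exit code 137 really is an out-of-memory kill (137 = 128 + SIGKILL) is to inspect the exited container with the Docker Python SDK. This is only a sketch and assumes the exited container has not been removed yet:

import docker
from docker.errors import NotFound

client = docker.from_env()
try:
    # Container ID taken from the run logs above.
    container = client.containers.get("aa3f2151e8d1")
    state = container.attrs.get("State", {})
    print("OOMKilled:", state.get("OOMKilled"), "ExitCode:", state.get("ExitCode"))
except NotFound:
    print("container already removed")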

yudhiesh commented 2 months ago

I was able to get it to work within a VM instance on AWS. To make sure my local changes were baked into the new Docker image, I reran python install.py --neurips23track filter --algorithm faiss (which rebuilds the image) before running the benchmark.