A distributed approximate nearest neighbor (ANN) search library that provides high-quality vector index build, search, and distributed online serving toolkits for large-scale vector search scenarios.
My machine has only 64 GB of memory, so I cannot build an index for a 128 GB vector data set. The BalancedDataPartition program could solve this, since it would let me split the 128 GB of data into several parts and build them on multiple machines. However, BalancedDataPartition also reads the entire data set into memory before clustering, so 64 GB is still not enough. Could BalancedDataPartition support reading the data in multiple batches, so that clustering can complete with a small amount of memory? Thank you very much!
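
To make the request concrete, here is a rough sketch of the kind of batched clustering I have in mind. It is not SPTAG code and does not use any SPTAG API. It is plain mini-batch k-means in Python with NumPy, and it does not reproduce the balancing step of BalancedDataPartition. The file layout (a headerless, row-major float32 file), the function names, and the parameters `dim`, `k`, `batch_rows`, and `epochs` are all assumptions made for illustration. It also assumes the first batch holds at least `k` rows.

```python
import numpy as np


def iter_batches(path, dim, batch_rows, dtype=np.float32):
    """Yield (start_row, batch) from a raw row-major vector file (assumed layout)."""
    # np.memmap maps the file lazily; only the slice copied below is held in RAM.
    data = np.memmap(path, dtype=dtype, mode="r").reshape(-1, dim)
    for start in range(0, data.shape[0], batch_rows):
        yield start, np.array(data[start:start + batch_rows])


def nearest_center(batch, centers):
    """Return the index of the closest center for each row of the batch."""
    # Squared Euclidean distance: ||x||^2 - 2 x.c + ||c||^2
    d2 = ((batch ** 2).sum(axis=1)[:, None]
          - 2.0 * batch @ centers.T
          + (centers ** 2).sum(axis=1)[None, :])
    return d2.argmin(axis=1)


def minibatch_kmeans(path, dim, k, batch_rows, epochs=5):
    """Cluster a file that does not fit in memory, one batch at a time."""
    # Assumption: seed the centers with the first k rows of the first batch.
    _, first = next(iter_batches(path, dim, batch_rows))
    centers = first[:k].astype(np.float64)
    counts = np.zeros(k, dtype=np.int64)
    for _ in range(epochs):
        for _, batch in iter_batches(path, dim, batch_rows):
            labels = nearest_center(batch, centers)
            for j in np.unique(labels):
                members = batch[labels == j]
                counts[j] += len(members)
                # Running-mean update: the center becomes the mean of all
                # points assigned to it so far, without storing those points.
                centers[j] += (members.sum(axis=0) - len(members) * centers[j]) / counts[j]
    return centers


def assign_partitions(path, dim, centers, batch_rows):
    """Second streaming pass: yield (start_row, labels) for each batch."""
    for start, batch in iter_batches(path, dim, batch_rows):
        yield start, nearest_center(batch, centers)
```

With something like this, peak memory would be roughly one batch plus the `k` centers, and the labels from `assign_partitions` could be used to write each vector to the partition file for its cluster, so that each partition can be built on a separate machine.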