How to use distribute server?

microsoft / SPTAG

A distributed approximate nearest neighborhood search (ANN) library which provides a high quality vector index build, search and distributed online serving toolkits for large scale vector search scenario.

MIT License

How to use distribute server? #401

Closed: suppersam1 closed this issue 9 months ago

suppersam1 commented 9 months ago

I see that the aggregator can aggregate the indexes from multiple machines, but there seems to be no method for distributing the indexes across machines. Should the user partition the data, build an index on each partition, place each index on a different machine, and then combine them through the aggregator? Or should one index be built over all the data and then distributed across the machines?

suppersam1 commented 9 months ago

I resolved it.

1. ./balanceddatapartition -d 10 -v float -i test_index_input.txt -f TXT -c 3
2. ./balanceddatapartition -d 10 -v float -i test_index_input.txt -f TXT -c 3 -g LocalPartition -o test_partition
3. Move the files from these partitions to different machines.
4. Use indexbuilder on each machine to index its partitioned data.
5. Start the server on each machine and load the index files.
6. Launch the aggregator and connect it to the servers on the various machines.
7. Use the client to connect to the aggregator and start querying.
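
For readers who want to see why this layout returns correct results, here is a minimal Python sketch of the partition, per-machine search, and aggregate flow. It is not SPTAG code and does not call any SPTAG API. The function names (partition, search_shard, aggregate) are hypothetical, round-robin splitting stands in for balanceddatapartition, and brute-force search stands in for each machine's index server. The values 3 and 10 echo -c 3 and -d 10 above, on the assumption that those flags mean partition count and vector dimension.

```python
import heapq
import math
import random


def partition(vectors, num_parts):
    """Split (id, vector) pairs into num_parts shards.

    Round-robin is only a stand-in for a real balanced partitioner.
    """
    shards = [[] for _ in range(num_parts)]
    for i, item in enumerate(vectors):
        shards[i % num_parts].append(item)
    return shards


def search_shard(shard, query, k):
    """Brute-force top-k on one shard (stand-in for one machine's server)."""
    scored = [(math.dist(vec, query), vec_id) for vec_id, vec in shard]
    return heapq.nsmallest(k, scored)


def aggregate(per_shard_results, k):
    """Merge the per-shard top-k lists into one global top-k list."""
    all_hits = (hit for hits in per_shard_results for hit in hits)
    return heapq.nsmallest(k, all_hits)


if __name__ == "__main__":
    random.seed(0)
    # 300 random 10-dimensional vectors with integer ids (illustrative data).
    data = [(i, [random.random() for _ in range(10)]) for i in range(300)]
    query = [0.5] * 10

    shards = partition(data, 3)
    per_shard = [search_shard(shard, query, 5) for shard in shards]
    for dist, vec_id in aggregate(per_shard, 5):
        print(vec_id, round(dist, 4))
```

Each shard returns its own k best hits, so the union of those lists always contains the global k best. In this exact-search sketch, merging them gives the same answer as searching all the data at once; with approximate indexes the merged result is only as good as each shard's recall.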

Funlxy commented 3 months ago

Hello, I have the same problem. What is the difference between step 1 and step 2?