microsoft / SPTAG

A distributed approximate nearest neighborhood search (ANN) library which provides a high quality vector index build, search and distributed online serving toolkits for large scale vector search scenario.
MIT License

Segmentation fault when building SIFT1B #416

Open GaryLee1101 opened 3 months ago

GaryLee1101 commented 3 months ago

When I build the index for SIFT1B, the process consistently crashes with a segmentation fault after approximately 20 hours of running. [image]

I suspect the recurring segmentation fault might be related to memory constraints, but I have allocated 512 GB of memory and 32 cores to this container in PVE. Interestingly, the same crash occurs even with the learning subset of SIFT1B, which contains only 100 million vectors.
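For scale, here is a rough estimate of the raw vector payload alone, as a minimal sketch. It assumes 128-dimensional UInt8 vectors (one byte per component) and deliberately ignores the trees, graph, and posting lists that the build creates, whose footprint I have not measured. The helper name `raw_vector_bytes` is only illustrative.

```python
# Back-of-the-envelope size of the raw vectors only.
# Assumption: UInt8 components (1 byte each) and 128 dimensions.
# Index structures built on top of the vectors are NOT included.

BYTES_PER_COMPONENT = 1
DIM = 128


def raw_vector_bytes(num_vectors: int,
                     dim: int = DIM,
                     bytes_per_component: int = BYTES_PER_COMPONENT) -> int:
    """Return the number of bytes needed to hold the raw vectors."""
    return num_vectors * dim * bytes_per_component


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    datasets = [
        ("learning subset (100M vectors)", 100_000_000),
        ("full SIFT1B (1B vectors)", 1_000_000_000),
    ]
    for name, count in datasets:
        print(f"{name}: {to_gib(raw_vector_bytes(count)):.1f} GiB")
```

The raw learning subset comes to roughly 12 GiB and the full set to roughly 119 GiB, so the vectors by themselves are well below 512 GB. Peak usage during the build could still be much higher than this figure.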

Here is my build configuration:

[Base]
ValueType=UInt8
DistCalcMethod=L2
IndexAlgoType=BKT
Dim=128
VectorPath=/home/gary/Code/DiskANN/build/data/bigann/bigann_learn.bbin
VectorType=DEFAULT
QueryPath==/home/gary/Code/DiskANN/build/data/bigann/bigann_query.bbin
QueryType=DEFAULT
WarmupPath==/home/gary/Code/DiskANN/build/data/bigann/bigann_query.bbin
WarmupType=DEFAULT
TruthPath==/home/gary/Code/DiskANN/build/data/bigann/bigann_gt.ibin
TruthType=DEFAULT
IndexDirectory=/home/gary/Code/SPTAG/gary_test/sift1b_index_learn

[SelectHead]
isExecute=true
TreeNumber=1
BKTKmeansK=32
BKTLeafSize=8
SamplesNumber=1000
SaveBKT=false
SelectThreshold=10
SplitFactor=6
SplitThreshold=25
Ratio=0.12
NumberOfThreads=32
BKTLambdaFactor=1.0

[BuildHead]
isExecute=true
NeighborhoodSize=32
TPTNumber=32
TPTLeafSize=2000
MaxCheck=16324
MaxCheckForRefineGraph=16324
RefineIterations=3
NumberOfThreads=45
BKTLambdaFactor=-1.0

[BuildSSDIndex]
isExecute=true
BuildSsdIndex=true
InternalResultNum=64
ReplicaCount=8
PostingPageLimit=3
NumberOfThreads=45
MaxCheck=16324
TmpDir=/tmp/

[SearchSSDIndex]
isExecute=true
BuildSsdIndex=false
InternalResultNum=96
NumberOfThreads=1
HashTableExponent=4
ResultNum=10
MaxCheck=1024
MaxDistRatio=8.0
SearchPostingPageLimit=3
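
Two things in the configuration above look worth double-checking: `QueryPath`, `WarmupPath`, and `TruthPath` are written with a doubled `=`, and `NumberOfThreads=45` is higher than the 32 cores given to the container. I do not know whether either is related to the crash, and I have not verified how SPTAG's own parser treats a doubled `=`. The following is a minimal sketch of a standalone sanity check; `parse_ini` and `check_config` are hypothetical helpers, not part of SPTAG, and the sample path is a placeholder.

```python
# Standalone sanity check for an INI-style build configuration.
# parse_ini and check_config are illustrative helpers, not SPTAG APIs.


def parse_ini(text: str) -> dict:
    """Parse [Section] / key=value lines, splitting on the first '=' only."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith((";", "#")):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections[current] = {}
            continue
        if current is None or "=" not in line:
            continue
        key, value = line.split("=", 1)
        sections[current][key.strip()] = value.strip()
    return sections


def check_config(sections: dict, available_cores: int) -> list:
    """Return human-readable warnings for suspicious entries."""
    warnings = []
    for section, params in sections.items():
        for key, value in params.items():
            # A value that begins with '=' usually means 'key==value' was typed.
            if value.startswith("="):
                warnings.append(
                    f"[{section}] {key}: value starts with '=' (doubled '='?)")
        threads = params.get("NumberOfThreads")
        if threads is not None and threads.isdigit():
            if int(threads) > available_cores:
                warnings.append(
                    f"[{section}] NumberOfThreads={threads} exceeds "
                    f"{available_cores} available cores")
    return warnings


if __name__ == "__main__":
    # Shortened sample; the path is a placeholder, not a real file.
    sample = """[Base]
QueryPath==/path/to/query.bbin
[BuildHead]
NumberOfThreads=45
"""
    for warning in check_config(parse_ini(sample), available_cores=32):
        print(warning)
```

Running it against the full configuration with `available_cores=32` would flag the three doubled `=` entries in `[Base]` and the two sections that request 45 threads.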