Open JonasTriki opened 3 years ago
Hi, have you figured it out in the meantime? I have the same question..
Nope, I have just left it as `250000` for the time being!
If we are building an index offline, shouldn't it cover all the index samples?
Hi there!
How should one go about selecting `training_sample_size` for the `tree()` and `score_ah()` methods of the `ScannBuilder` class? The hyperparameter is not mentioned in the algorithms section. Should one leave it as default (e.g. `100000`) or stick to a value similar to the one from the example notebook (e.g. `250000`)? Does it depend on the dataset? In my case, I would like to build a ScaNN index on word embeddings with ~4M rows and 300 features.
Thanks in advance.
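For concreteness, here is a minimal sketch of where the parameter would be set for a dataset of this size. The helper `scann_training_kwargs` is hypothetical (not part of ScaNN), and so is its policy of clamping the sample to the number of rows. The `num_leaves` value follows the square-root rule of thumb from the ScaNN algorithms notes. `num_leaves_to_search`, `dimensions_per_block=2`, and the threshold `0.2` are illustrative assumptions, not recommendations.

```python
import math


def scann_training_kwargs(n_rows, sample_size=250_000):
    """Hypothetical helper: build keyword arguments for tree() and score_ah()."""
    # Assumption: a training sample larger than the dataset is pointless,
    # so clamp the requested size to the number of rows.
    sample = min(sample_size, n_rows)
    # Rule of thumb: num_leaves is roughly the square root of the dataset size.
    num_leaves = max(1, round(math.sqrt(n_rows)))
    tree_kwargs = {
        "num_leaves": num_leaves,
        # Illustrative choice: search about 5% of the leaves.
        "num_leaves_to_search": max(1, num_leaves // 20),
        "training_sample_size": sample,
    }
    ah_kwargs = {
        "dimensions_per_block": 2,
        "anisotropic_quantization_threshold": 0.2,
        "training_sample_size": sample,
    }
    return tree_kwargs, ah_kwargs


# ~4M word embeddings, as in the question above.
tree_kwargs, ah_kwargs = scann_training_kwargs(n_rows=4_000_000)

# With ScaNN installed and `dataset` holding the embeddings, the arguments
# would be passed to the builder like this (not executed here):
#   import scann
#   searcher = (
#       scann.scann_ops_pybind.builder(dataset, 10, "dot_product")
#       .tree(**tree_kwargs)
#       .score_ah(**ah_kwargs)
#       .reorder(100)
#       .build()
#   )
```

This only shows how the same `training_sample_size` reaches both methods. It does not answer which value is best for a given dataset.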