Closed nicokuzak closed 6 months ago
Interesting! Among other potential reasons, this might be because of the new (experimental) implementation of k-means without faiss. If you index with the use_faiss=True argument passed, do you get deterministic scores?
@bclavie the two notebooks are different even when using use_faiss=True
Ok, I figured it out. There was a set( call in my preprocessing of the text data 🤦. I'll check whether use_faiss actually makes a difference here, but it was on me - sorry!
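
For anyone landing here with the same symptom, here is a minimal sketch of the pitfall, assuming the preprocessing deduplicates strings with set(). The document list and helper names below are hypothetical and not taken from the notebooks in this thread. Python randomizes string hashes per process unless PYTHONHASHSEED is fixed, so the iteration order of a set of strings can differ between two notebook kernels even when random seeds are set, which would change the order in which documents reach the index.

```python
# Hypothetical sample documents; any list of strings with duplicates works.
docs = ["alpha", "beta", "gamma", "alpha", "delta", "beta"]


def dedupe_unordered(items):
    # list(set(...)) removes duplicates, but the resulting order depends on
    # string hashes, which can vary from one Python process to the next.
    return list(set(items))


def dedupe_ordered(items):
    # dict keys keep insertion order (Python 3.7+), so this removes
    # duplicates while preserving first-seen order in every run.
    return list(dict.fromkeys(items))


if __name__ == "__main__":
    print(dedupe_unordered(docs))  # order may differ between processes
    print(dedupe_ordered(docs))    # ['alpha', 'beta', 'gamma', 'delta']
```

If the set() has to stay, sorting its output (for example sorted(set(items))) is another way to get a stable order, at the cost of losing the original document order.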
Hello! I have a notebook where I am creating an index and then searching it a bunch of different times for different queries to evaluate colbertv2 on my data. I duplicated the notebook, seeded it, yet I get different scores and different results in the two notebooks. Is this a known limitation? Is there anything I need to do for determinism in the models?