Closed: suhaasa closed this 3 years ago
Hey Suhaas!
SAM by default uses the cosine distance metric between cells for calculating nearest neighbors, and that metric is handled by a hyper-fast kNN solver implemented by hnswlib. Unfortunately, hnswlib is not seedable, so using the same seed in SAM may result in slightly different kNN graphs. All other distance metrics are implemented by PyNNDescent, which does allow seeding.
If you try using sam.run(distance='correlation', seed=0), do you get reproducible UMAPs?
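One rough way to check this is to run SAM twice with the same seed and compare the resulting coordinates. The sketch below makes a few assumptions: `make_sam` is a hypothetical helper you would write yourself that returns a fresh, preprocessed SAM object, and the UMAP coordinates are assumed to be stored in `sam.adata.obsm['X_umap']`. Adjust both to match your setup.

```python
import numpy as np


def embeddings_match(a, b, atol=1e-6):
    """Return True if two embeddings have the same shape and coordinates."""
    a, b = np.asarray(a), np.asarray(b)
    return bool(a.shape == b.shape and np.allclose(a, b, atol=atol))


def run_twice_and_compare(make_sam, distance="correlation", seed=0):
    """Run SAM twice with identical settings and compare the UMAPs.

    make_sam is a hypothetical zero-argument factory that returns a fresh,
    preprocessed SAM object each time it is called.
    """
    coords = []
    for _ in range(2):
        sam = make_sam()
        sam.run(distance=distance, seed=seed)
        # Assumption: the projection is stored under obsm['X_umap'].
        coords.append(np.array(sam.adata.obsm["X_umap"]))
    return embeddings_match(coords[0], coords[1])
```

If this returns True with distance='correlation' but False with distance='cosine', that would be consistent with the unseeded hnswlib step being the source of the variation.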
Let me know! Alec
Thanks for the quick response! That did the trick. I did not know hnswlib was not seedable. I may stick with the cosine distance metric since I think it produces better projections visually for our data. Excited to fully apply your method!
Hey Alec,
My lab is interested in using your algorithm (and maybe SAMap in the future). I'm trying to set a seed to get the same UMAP projection every time, but even with the seed set in the run function I still get a different scatter plot on each run. I think the clustering is also different, but I haven't done extensive testing there.
Thoughts?
Suhaas