Closed camelest closed 3 years ago
I've got the same question. This results in different doublets:
Hi @camelest and @f6v -- thanks for reaching out.
Yes, each run of DoubletFinder will differ slightly because of the randomness in artificial doublet generation and downstream neighborhood detection. When I see bimodal BCmvn distributions, I usually interrogate each threshold and use my knowledge of the dataset to choose the correct one. I'll note from your plots above that while the amplitude of the peaks differs between the runs, the actual locations of the peaks are the same (e.g., 0.09 and 0.26). So I would try DoubletFinder with each of these two pK values and then look closely at the data to choose the right one (it should be fairly obvious, e.g., if one of the pK values results in a lot of 'known' singlets being called as doublets).
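To make the point about peak locations concrete, here is a minimal sketch in plain Python (not DoubletFinder code). The BCmvn numbers are made up for illustration; only the two peak locations, 0.09 and 0.26, come from the plots discussed in this thread, and `local_peaks` is a hypothetical helper. It shows how the single highest peak can flip between runs while the set of candidate pK values stays the same.

```python
def local_peaks(pk_values, bcmvn):
    """Return the pK values whose BCmvn is strictly higher than both neighbours."""
    peaks = []
    for i in range(1, len(bcmvn) - 1):
        if bcmvn[i] > bcmvn[i - 1] and bcmvn[i] > bcmvn[i + 1]:
            peaks.append(pk_values[i])
    return peaks


# Hypothetical BCmvn curves from two runs on the same object (illustrative values only).
pk = [0.01, 0.05, 0.09, 0.13, 0.17, 0.22, 0.26, 0.30]
run_a = [100, 400, 1500, 600, 300, 500, 900, 200]
run_b = [120, 350, 800, 500, 280, 700, 1400, 250]

peaks_a = local_peaks(pk, run_a)
peaks_b = local_peaks(pk, run_b)

# Candidate pK values that appear as peaks in both runs.
shared = sorted(set(peaks_a) & set(peaks_b))

# The single "optimal" pK (global maximum) can differ between runs.
global_a = pk[run_a.index(max(run_a))]
global_b = pk[run_b.index(max(run_b))]

print("shared peaks:", shared)
print("global max, run A:", global_a, "| run B:", global_b)
```

With curves like these, picking only the global maximum would give a different pK on each run, whereas treating every shared peak as a candidate and checking each against the data gives the same shortlist both times.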
Chris
Hi, Chris @chris-mcginnis-ucsf
Thank you so much for your reply. I just want to confirm one thing: I understand that find.pK involves some randomness. Do we get identical doublet calls if we use the same pK (and the same pN and expected doublet rate), or is there randomness in that step as well? Thank you for your kind help.
Hi, first of all, thank you so much for maintaining this wonderful tool.
I have a question regarding the reproducibility of the optimal pK identification.
I have a single-cell dataset of 9,500 cells and I followed your tutorial. When I ran DoubletFinder, it gave me the curve in the first figure. Two pK peaks did not seem typical to me, so to confirm I repeated the analysis and got the second figure. I didn't change the Seurat object itself, and I was wondering whether DoubletFinder can give different pK peaks on different runs.
My questions are:
Thank you so much in advance for your help.
Best