welch-lab / liger

R package for integrating and analyzing multiple single-cell datasets
GNU General Public License v3.0

KL divergence is not plateauing #229

Closed by nfancy 3 years ago

nfancy commented 3 years ago

Hi

Thanks for developing this great package. For one of my datasets, I tried suggestK, and even with K up to 100 the KL divergence curve does not plateau. Is this expected? My dataset is very large: 350k nuclei, 87 samples. The plot is attached.

[Attached plot: liger_k_5_100]

cgao90 commented 3 years ago

Hi,

suggestK is a heuristic, and there may be no obvious plateau (just as there may be no clear "elbow" in a scree plot for PCA). You could continue the analysis with a few choices of K (based on prior knowledge of the complexity of the data), compare the integration results, and then decide on a suitable K.
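
To make the "no obvious plateau" case concrete, here is a minimal, language-agnostic sketch in Python of one way to check a K-versus-KL-divergence curve for flattening. It is not part of liger and does not reproduce what suggestK does internally. The function name `find_plateau`, the `rel_tol` threshold, and the numbers in the example are all hypothetical, and the sketch assumes the curve rises with K.

```python
def find_plateau(ks, values, rel_tol=0.1):
    """Return the first K at which the per-unit-K gain drops below
    rel_tol times the initial gain, or None if the curve never flattens.

    Hypothetical helper for illustration; not a liger function.
    """
    if len(ks) != len(values) or len(ks) < 3:
        raise ValueError("need at least three (K, value) pairs of equal length")

    # Slope of each segment between consecutive tested K values.
    slopes = [
        (values[i + 1] - values[i]) / (ks[i + 1] - ks[i])
        for i in range(len(ks) - 1)
    ]

    first = slopes[0]
    if first <= 0:
        # The curve is not rising at the start, so the heuristic does not apply.
        return None

    # Report the K where the curve's gain first becomes small
    # relative to its initial gain.
    for i, slope in enumerate(slopes[1:], start=1):
        if slope < rel_tol * first:
            return ks[i]
    return None


# Made-up values, for illustration only (not taken from the attached plot).
ks = [5, 10, 15, 20, 25, 30]
flattening = [1.0, 2.0, 2.6, 2.9, 2.95, 2.97]
steady = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]

print(find_plateau(ks, flattening))  # 20
print(find_plateau(ks, steady))      # None
```

A result of None corresponds to the situation described in this issue: the curve keeps climbing over the tested range, so K has to be chosen by comparing integration results instead.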

nfancy commented 3 years ago

Hi, Thank you very much. Do you have a suggested range for K?

Thanks.

cgao90 commented 3 years ago

Hi, 20-50 would work for many cases.