welch-lab / liger

R package for integrating and analyzing multiple single-cell datasets
GNU General Public License v3.0

How to improve similarity between snRNA-seq and snATAC-seq nuclei by tuning parameters like 'k' and 'lambda'? #297

Open wangmeijiao opened 10 months ago

wangmeijiao commented 10 months ago

Hi, I found it easy to integrate snRNA and snATAC nuclei with satisfactory consistency between data modalities (which is what we expected) when fewer than 10,000 nuclei were fed in. However, with a large multiomic dataset (>100,000 nuclei), it was difficult to get a good result: the snATAC and snRNA nuclei tend to split far apart with little overlap.

My question is: am I missing something important in any of these steps (variable gene selection, normalization, scale_not_center, online_iNMF/optimize_ALS, quantile norm and UMAP)? Could you please suggest some starting points for tuning? I have struggled for days, learned a lot about the parameters, and tried tuning each of them.

I have almost lost my mind, please help!

P.S. I opened the same issue in pyliger, forgive me!

Meijiao

wangmeijiao commented 10 months ago

Oh, and I selected genes from the snRNA variable genes only, as recommended.

jw156605 commented 10 months ago

One thing to check is the reference dataset. By default liger uses the dataset with the largest number of cells as the reference during quantile normalization. But when aligning RNA and ATAC data, you should use the RNA dataset as the reference regardless of size.
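
The rule above can be sketched independently of liger. What follows is a minimal Python illustration, not liger's or pyliger's actual code: the function `pick_reference`, the dataset names, and the cell counts are all hypothetical, chosen only to show how a largest-dataset default differs from an explicit RNA override.

```python
def pick_reference(cell_counts, override=None):
    """Return the name of the dataset to use as the quantile-normalization reference.

    cell_counts: dict mapping dataset name -> number of cells (hypothetical input).
    override: optional dataset name that takes priority over the size-based default.
    """
    if override is not None:
        if override not in cell_counts:
            raise KeyError(f"unknown dataset: {override!r}")
        return override
    # Default described in the thread: the dataset with the most cells.
    return max(cell_counts, key=cell_counts.get)


# Hypothetical sizes: more ATAC nuclei than RNA nuclei.
counts = {"snRNA": 40_000, "snATAC": 70_000}

default_ref = pick_reference(counts)                   # size-based default
forced_ref = pick_reference(counts, override="snRNA")  # RNA regardless of size
print(default_ref, forced_ref)
```

With these made-up counts the size-based default lands on the ATAC dataset, which is exactly the situation the advice warns about. In the package itself the override is passed to the quantile normalization step; the exact argument name depends on the installed version, so check that function's help page.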


wangmeijiao commented 10 months ago

Wow! That is definitely a good point! Thanks for the message. I will tell 'quantile_norm' to use my snRNA data as the reference.

mvfki commented 7 months ago

I've addressed Josh's recommendation: the reference will now be chosen automatically from an RNA dataset if the modalities are properly set when creating the liger object, and a warning will be issued when the chosen reference is of another modality. This is now available on the master branch and will hopefully be on CRAN next week.
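
The behavior described here can be pictured with a small Python sketch. It is not the package's implementation: the function `pick_reference_by_modality` and the `(modality, n_cells)` tuples are hypothetical, and two details are assumptions of this sketch rather than statements from the thread, namely that the largest RNA dataset is preferred when several exist, and that the largest dataset overall is used when no RNA dataset is present.

```python
import warnings


def pick_reference_by_modality(datasets, requested=None):
    """Choose a reference dataset, preferring RNA, and warn on a non-RNA choice.

    datasets: dict mapping name -> (modality, n_cells); a hypothetical structure.
    requested: optional dataset name supplied explicitly by the user.
    """
    if requested is None:
        rna = {name: n for name, (modality, n) in datasets.items() if modality == "rna"}
        # Assumption: fall back to all datasets when none is tagged as RNA.
        pool = rna if rna else {name: n for name, (_, n) in datasets.items()}
        chosen = max(pool, key=pool.get)
    else:
        chosen = requested
    if datasets[chosen][0] != "rna":
        warnings.warn(f"reference {chosen!r} is not an RNA dataset")
    return chosen


# Hypothetical datasets: the ATAC set is larger than the RNA set.
datasets = {"snRNA": ("rna", 40_000), "snATAC": ("atac", 70_000)}

auto_ref = pick_reference_by_modality(datasets)  # RNA chosen despite being smaller

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    manual_ref = pick_reference_by_modality(datasets, requested="snATAC")
warned = len(caught) == 1  # the explicit non-RNA choice triggered one warning
```

The point of the design is that a correctly tagged object needs no manual reference setting, while an explicit non-RNA choice still goes through but is flagged.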