Closed by grst 5 years ago
Have you seen this one? https://www.ncbi.nlm.nih.gov/pubmed/29608177
yes, scanorama (https://www.biorxiv.org/content/early/2018/07/17/371179) is actually a generalization of that approach, from what I understood. The limitations of MNN are that it
- depends on the order in which the datasets are integrated
- does not work well unless at least one cell population exists across all datasets.
ok, great.
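For readers unfamiliar with the mutual-nearest-neighbours (MNN) idea discussed above, here is a minimal sketch of the pair-finding step only. It is not the published MNN correction and not scanorama: the function names, the toy 2-D points, and the use of Euclidean distance are all assumptions made for illustration.

```python
import math


def nearest_neighbors(queries, references, k):
    """For each query point, return the set of indices of its k nearest reference points."""
    result = []
    for q in queries:
        order = sorted(range(len(references)), key=lambda j: math.dist(q, references[j]))
        result.append(set(order[:k]))
    return result


def mutual_nearest_neighbors(batch_a, batch_b, k=1):
    """Return (i, j) pairs where a[i] and b[j] are in each other's k nearest neighbours."""
    a_to_b = nearest_neighbors(batch_a, batch_b, k)
    b_to_a = nearest_neighbors(batch_b, batch_a, k)
    return [
        (i, j)
        for i in range(len(batch_a))
        for j in sorted(a_to_b[i])
        if i in b_to_a[j]
    ]


# Hypothetical toy "cells" in two batches (2-D instead of expression space).
batch_a = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
batch_b = [(0.5, 0.2), (5.2, 4.9), (20.0, 20.0)]

pairs = mutual_nearest_neighbors(batch_a, batch_b, k=1)
print(pairs)
```

In this toy data, the last point of each batch has no counterpart in the other batch and ends up in no pair. That is the intuition behind the second limitation: the approach needs a population that is shared between the datasets.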
This just came out. Seems interesting: https://www.biorxiv.org/content/early/2018/10/31/457879
Claims to be even better and faster than scanorama: Harmony
https://www.biorxiv.org/content/biorxiv/early/2018/11/05/461954.full.pdf?%3Fcollection=
It probably makes sense to merge the datasets at an earlier stage:
Will update the overview at the top of this issue shortly.
I approve of the early merging. This might help us with filtering downstream.
0. gene expression quantification (#8)
   where no counts are provided, we need to do the preprocessing from FASTQ files ourselves.
1. consistent format (`01_process_counts`)
2. data cleaning (`02_data_cleaning`) (#5)
   filter each dataset individually (min/max genes, percent_mito, ...)
3. data merging and confounder removal (#5)
4. batch effect removal (#7)
5. clustering, cell type identification, ...
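
Steps 2 and 3 of the overview (filter each dataset individually, then merge) could look roughly like the sketch below. This is not the repository's code: the `Cell` and `Thresholds` classes, the dataset names, and all cutoff values are hypothetical and chosen only to illustrate the order of operations.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    barcode: str
    n_genes: int         # number of detected genes
    percent_mito: float  # fraction of counts from mitochondrial genes (0..1)


@dataclass
class Thresholds:
    min_genes: int
    max_genes: int
    max_percent_mito: float


def filter_dataset(cells, thr):
    """Keep only the cells that pass this dataset's own QC thresholds."""
    return [
        c for c in cells
        if thr.min_genes <= c.n_genes <= thr.max_genes
        and c.percent_mito <= thr.max_percent_mito
    ]


def merge_datasets(datasets):
    """Concatenate datasets, keeping the dataset name as a per-cell label."""
    return [(name, cell) for name, cells in datasets.items() for cell in cells]


# Hypothetical toy input: two datasets with dataset-specific cutoffs.
raw = {
    "dataset_a": [Cell("a1", 1500, 0.03), Cell("a2", 150, 0.02), Cell("a3", 2000, 0.40)],
    "dataset_b": [Cell("b1", 800, 0.08), Cell("b2", 9000, 0.01)],
}
thresholds = {
    "dataset_a": Thresholds(min_genes=200, max_genes=6000, max_percent_mito=0.10),
    "dataset_b": Thresholds(min_genes=500, max_genes=5000, max_percent_mito=0.10),
}

# Step 2: clean each dataset on its own. Step 3: merge the cleaned datasets.
filtered = {name: filter_dataset(cells, thresholds[name]) for name, cells in raw.items()}
merged = merge_datasets(filtered)
print([(name, cell.barcode) for name, cell in merged])
```

Keeping the dataset name attached to every cell after merging means it is still available as a covariate for the later confounder and batch effect removal steps.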