The goal is to divide the full atlas into subsets for detailed annotation. We decided to go for a hierarchical approach to achieve higher resolution on the immune cell compartment.
Steps
1. Cluster the full atlas at high resolution (res=1.5) and label each cluster with the most abundant annotated cell type
2. Group clusters into data "splits"
3. Save log-normalised data for each split (i.e. before scaling, feature selection, ridge regression, etc.): saved as /nfs/team205/ed6/data/Fetal_immune/PAN.A01.v01.entire_data_normalised_log.wGut.batchCorrected_20210118.SUBSETNAME.h5ad
4. Preprocess and batch-correct each data subset [script]: output is saved as /nfs/team205/ed6/data/Fetal_immune/PAN.A01.v01.entire_data_normalised_log.wGut.batchCorrected_20210118.SUBSETNAME.batchCorrected.h5ad
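The labelling and grouping steps above can be sketched roughly as follows. This is a minimal illustration on toy data using only the Python standard library, not the actual script: the function names, the toy cluster IDs and annotations, and the split names in `label_to_split` are all hypothetical. Only the path template comes from the steps above.

```python
from collections import Counter, defaultdict

# Path template copied from the steps above; SUBSETNAME is the placeholder.
PATH_TEMPLATE = (
    "/nfs/team205/ed6/data/Fetal_immune/"
    "PAN.A01.v01.entire_data_normalised_log.wGut.batchCorrected_20210118."
    "SUBSETNAME.h5ad"
)


def majority_label(clusters, annotations):
    """Map each cluster ID to its most abundant annotated cell type."""
    counts = defaultdict(Counter)
    for cluster, annotation in zip(clusters, annotations):
        counts[cluster][annotation] += 1
    return {cluster: c.most_common(1)[0][0] for cluster, c in counts.items()}


def assign_splits(cluster_labels, label_to_split):
    """Group clusters into splits using a label -> split lookup (hypothetical)."""
    splits = defaultdict(list)
    for cluster, label in cluster_labels.items():
        splits[label_to_split[label]].append(cluster)
    return dict(splits)


def split_path(subset_name):
    """Fill the SUBSETNAME placeholder in the output path."""
    return PATH_TEMPLATE.replace("SUBSETNAME", subset_name)


# Toy input: one cluster ID and one prior annotation per cell (made up).
clusters = ["0", "0", "0", "1", "1", "1", "2", "2"]
annotations = ["B cell", "B cell", "NK", "NK", "NK", "T cell", "Monocyte", "Monocyte"]

# Hypothetical split names; the real splits are described in the slides.
label_to_split = {"B cell": "B", "NK": "NKT", "T cell": "NKT", "Monocyte": "MYELOID"}

cluster_labels = majority_label(clusters, annotations)
splits = assign_splits(cluster_labels, label_to_split)

for name, members in splits.items():
    print(name, members, split_path(name))
```

Note that cluster "0" is labelled as B cell even though it contains an NK cell, so that cell would travel with the B split. Majority labelling always carries minority cells along in this way.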
What are the splits
See slides illustrating splitting and output
Outstanding problems
Even when subsetting based on clustering, there is "spill-over" between splits: e.g. in the B cell split there are still some NK/T cells, which then cluster separately post-integration. Is it OK to remove these cells from a subset post hoc?
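One possible way to quantify the spill-over before deciding whether to remove anything is to flag post-integration clusters that are dominated by cell types not expected in the split. The sketch below is only an illustration on toy data: `flag_spillover`, the `min_fraction` threshold, and the example labels are assumptions, not part of the existing pipeline.

```python
from collections import Counter, defaultdict


def flag_spillover(post_clusters, annotations, expected_labels, min_fraction=0.5):
    """Return {cluster: off-target fraction} for clusters above min_fraction."""
    counts = defaultdict(Counter)
    for cluster, annotation in zip(post_clusters, annotations):
        counts[cluster][annotation] += 1

    flagged = {}
    for cluster, c in counts.items():
        total = sum(c.values())
        off_target = sum(n for label, n in c.items() if label not in expected_labels)
        fraction = off_target / total
        if fraction > min_fraction:
            flagged[cluster] = fraction
    return flagged


# Toy B cell split: cluster IDs after re-integration plus prior annotations.
post_clusters = ["0", "0", "0", "0", "1", "1", "1"]
annotations = ["B cell", "B cell", "B cell", "NK", "NK", "T cell", "NK"]

flagged = flag_spillover(post_clusters, annotations, expected_labels={"B cell"})

# Boolean mask of cells to keep if the flagged clusters were dropped.
keep = [cluster not in flagged for cluster in post_clusters]
print(flagged, keep)
```

Filtering at the cluster level rather than per cell leaves isolated off-target cells inside otherwise on-target clusters (the NK cell in cluster "0" here), so it only removes the populations that separate out after integration.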
@suochenqu @Issacgoh let me know what you think of the results
Notebook: notebooks/PFI_subset_EDA