If a gene is excluded in one sample, it will be excluded in the integrated dataset (which uses common genes).
This is particularly problematic because it prevents the detection of genes that are, for example, not expressed in control samples but highly expressed in test samples.
This issue arises because we create a `SeuratObject` for each sample, removing features that are not expressed in a minimum of 3 cells: https://github.com/hms-dbmi-cellenics/pipeline/blob/bffe4d6b64a9482fcc97a171da5573f40ed4a9c2/pipeline-runner/R/gem2s-5-create_seurat.R#L47-L56
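
To make the failure mode concrete, here is a minimal sketch in plain Python. It is not the pipeline's R code: the toy counts, `MIN_CELLS`, and `genes_passing_filter` are hypothetical names made up for illustration, and the filter is assumed to keep a gene when it has non-zero counts in at least 3 cells of a sample.

```python
# Hypothetical toy data: sample -> gene -> raw counts per cell.
MIN_CELLS = 3  # assumed per-sample threshold, mirroring "a minimum of 3 cells"

samples = {
    "control": {"GeneA": [5, 3, 0, 2], "GeneB": [0, 0, 0, 0]},
    "test": {"GeneA": [4, 0, 6, 1], "GeneB": [9, 7, 8, 0]},
}


def genes_passing_filter(counts, min_cells=MIN_CELLS):
    """Return the genes with non-zero counts in at least `min_cells` cells."""
    return {
        gene
        for gene, values in counts.items()
        if sum(value > 0 for value in values) >= min_cells
    }


# Filter each sample independently, as happens when one object is built per sample.
kept_per_sample = {
    name: genes_passing_filter(counts) for name, counts in samples.items()
}

# Integration on common genes keeps only the intersection across samples.
common_genes = set.intersection(*kept_per_sample.values())

print(kept_per_sample)
print(common_genes)
```

In this toy example GeneB passes the filter in the test sample but is absent from the control sample, so it drops out of the common gene set even though it is the gene that differs most between conditions.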