I'm using Seurat with Harmony for batch correction in my scRNA-seq analysis, and I have a question regarding the regression of multiple covariates.
Background:
I want to regress out three covariates from my data:
- Library
- SampleType
- CellCyclePhase
Initially, I attempted to regress out all three covariates in parallel by concatenating the corresponding metadata columns, splitting the merged object on the concatenated column, and providing that to Harmony. The split fails because some of the resulting categories are too small or empty:
Splitting ‘counts’, ‘data’ layers. Not splitting ‘scale.data’. If you would like to split other layers, set in `layers` argument.
Error in validObject(object = object) :
invalid class “Assay5” object: Layers must be two-dimensional objects
I understand that small categories will also be a problem for correction, even if I fix the failing data split.
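To make the sparsity concrete, here is a minimal sketch of how the combined categories can be tabulated before any splitting. It uses base R only. The metadata is simulated (an assumption standing in for the object's `meta.data`), and the 30-cell cutoff is an arbitrary choice of mine, not a Harmony or Seurat rule.

```r
# Simulated metadata standing in for obj@meta.data; the column names follow
# this post, the values are made up for illustration.
set.seed(1)
n_cells <- 500
meta <- data.frame(
  Library        = factor(sample(paste0("Lib", 1:25), n_cells, replace = TRUE),
                          levels = paste0("Lib", 1:25)),
  SampleType     = factor(sample(c("ST1", "ST2"), n_cells, replace = TRUE),
                          levels = c("ST1", "ST2")),
  CellCyclePhase = factor(sample(c("G1", "S", "G2M"), n_cells, replace = TRUE),
                          levels = c("G1", "S", "G2M"))
)

# Cross the three covariates; drop = FALSE keeps empty combinations as levels.
combo <- interaction(meta$Library, meta$SampleType, meta$CellCyclePhase,
                     sep = "_", drop = FALSE)
combo_sizes <- table(combo)

min_cells <- 30  # arbitrary cutoff (assumption)
cat("combined categories:", length(combo_sizes), "\n")
cat("empty:", sum(combo_sizes == 0), "\n")
cat("non-empty but below cutoff:",
    sum(combo_sizes > 0 & combo_sizes < min_cells), "\n")
```

With 25 x 2 x 3 = 150 possible combinations, even a few hundred cells per library leave many combinations small or empty, which matches what I see at the split step.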
I'm not sure how to solve this. The options I see:
- Ignore some covariates.
- Subset to SampleType 1 and keep the covariates Library and CellCyclePhase, then repeat for SampleType 2. Suboptimal.
- Regress out cell cycle scores in ScaleData(), and provide the covariates SampleType and Library to Harmony (or variations thereof). One issue is that regression in ScaleData() removes differences much less effectively than Harmony does (related to #262).
- Iterative / sequential / serial Harmony corrections. I recall that the Harmony authors discussed a "serial Harmony" approach, where covariates are corrected sequentially rather than in parallel, but I haven't been able to find that discussion again.
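For reference, this is roughly what I mean by the ScaleData() plus Harmony option above. It is only a sketch under assumptions: `regress_cc_then_harmony` is a hypothetical helper of mine (not part of Seurat or harmony), `obj` is assumed to be a merged, normalized object with joined layers and variable features already selected, and the metadata columns are assumed to be named as in this post. As far as I understand the harmony documentation, `RunHarmony()` takes the covariates as metadata column names via `group.by.vars`, so no split of the object is involved in this route.

```r
library(Seurat)
library(harmony)

# Hypothetical helper (not part of Seurat or harmony).
# obj:        merged, normalized Seurat object (assumed: joined layers,
#             variable features selected)
# s_genes,
# g2m_genes:  character vectors of S and G2/M marker genes
# batch_vars: metadata columns handed to Harmony
regress_cc_then_harmony <- function(obj, s_genes, g2m_genes,
                                    batch_vars = c("SampleType", "Library"),
                                    npcs = 30) {
  # Adds S.Score, G2M.Score and Phase to the metadata.
  obj <- CellCycleScoring(obj, s.features = s_genes, g2m.features = g2m_genes)
  # Linear regression of the cell cycle scores during scaling.
  obj <- ScaleData(obj, vars.to.regress = c("S.Score", "G2M.Score"))
  obj <- RunPCA(obj, npcs = npcs, verbose = FALSE)
  # Harmony reads the covariates from the metadata; the object is not split.
  RunHarmony(obj, group.by.vars = batch_vars)
}
```

For human data, I would call it as `obj <- regress_cc_then_harmony(obj, cc.genes.updated.2019$s.genes, cc.genes.updated.2019$g2m.genes)`, using the gene lists shipped with Seurat.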
My Questions:
- Is there a recommended practice for situations where there are (or concatenating covariates leads to) too many sparse categories, other than "don't do it"?
- Can I legitimately overcome the small-categories problem with sequential Harmony, and should it give results equivalent to parallel regression in Harmony (assuming both are possible)?
- Could sequential regression help mitigate issues arising from sparse category combinations?
- How can I implement sequential regression of covariates in Harmony within Seurat?
- Should I feed the "harmony" reduction into RunHarmony() instead of "pca" for the 2nd and 3rd variable?
- Are there recommended workflows or code examples for applying Harmony multiple times, each time correcting for a single covariate?
- Do I need to adjust Harmony parameters? For example, Library has 25 categories, while Phase has 3.
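To make the sequential idea concrete, this is the kind of loop I have in mind. It is a sketch, not a validated workflow: `run_harmony_sequentially` is a hypothetical helper of mine, and I am assuming the `reduction.use` and `reduction.save` argument names of recent harmony releases (1.0 or later, as far as I can tell; older versions seem to name the input argument `reduction`).

```r
library(Seurat)
library(harmony)

# Hypothetical helper (not part of harmony). Runs Harmony once per covariate,
# each run starting from the embedding saved by the previous run.
run_harmony_sequentially <- function(obj, covariates, start_reduction = "pca") {
  current <- start_reduction
  for (i in seq_along(covariates)) {
    target <- paste0("harmony", i)  # saved as harmony1, harmony2, ...
    obj <- RunHarmony(obj,
                      group.by.vars  = covariates[i],
                      reduction.use  = current,
                      reduction.save = target)
    current <- target
  }
  obj
}
```

I would call it as `obj <- run_harmony_sequentially(obj, c("Library", "SampleType", "CellCyclePhase"))` and then use `reduction = "harmony3"` downstream. Whether this is equivalent to a single run with `group.by.vars = c("Library", "SampleType", "CellCyclePhase")`, and whether the order of covariates matters, is exactly what I am unsure about. For the single-run variant, my reading of the harmony documentation is that `theta` can be given as one value per covariate, which might be the place to account for 25 versus 3 categories.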
Thank you for your time.