HelenaLC / CATALYST

Cytometry dATa anALYsis Tools

Using HARMONY Batch-corrected data for downstream analysis on CATALYST #364

Closed satsumayo closed 1 year ago

satsumayo commented 1 year ago

Hello, I have used Harmony to batch-correct spectral cytometry data stored in a SingleCellExperiment object created with CATALYST's prepData(). There is no common reference between the batches, so I have not used CytoNorm. I would now like to continue with clustering and visualisation of the batch-corrected data using CATALYST.

The PCA and Harmony batch-corrected embeddings are stored in reducedDims. I would be grateful for some advice on what I have to specify to be able to use the Harmony-corrected data downstream. For example, plotExprHeatmap() on the SingleCellExperiment still uses exprs rather than the batch-corrected values, and assayNames() still returns "counts" and "exprs". Should I expect "HARMONY" to appear under assayNames() too?

Any advice would be appreciated. Thanks so much!

HelenaLC commented 1 year ago

I believe this depends on how you run harmony. By default, HarmonyMatrix() will only return a corrected embedding (which correctly goes into reducedDims). To also obtain corrected expression values, you'll have to specify return_object = TRUE, extract the corresponding data, and assign it into the object (see ?HarmonyMatrix for details). E.g., one approach would be to keep raw data as counts, uncorrected expression values as exprs0, and integrated data as exprs, such that plots will default to the latter. Overall, this is not a CATALYST-related issue, so please refer to harmony's and SingleCellExperiment's documentation in case you require further details on data handling.
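
The assay bookkeeping described above could look roughly like the sketch below. It only illustrates the SingleCellExperiment side; it does not call Harmony. The toy object, the marker names, and the per-batch mean-centring that produces `corrected` are all placeholders (assumptions for illustration, not Harmony's method); in practice `corrected` would be the features-by-cells matrix of integrated values extracted from your Harmony run.

```r
library(SingleCellExperiment)

# Toy stand-in for a prepData() result: 3 markers x 6 cells, two batches.
# Marker names, cell names and the cofactor of 5 are illustrative only.
set.seed(1)
counts <- matrix(rpois(18, lambda = 50), nrow = 3,
    dimnames = list(c("CD3", "CD4", "CD8"), paste0("cell", 1:6)))
sce <- SingleCellExperiment(
    assays = list(counts = counts, exprs = asinh(counts / 5)),
    colData = DataFrame(batch = rep(c("A", "B"), each = 3)))

# PLACEHOLDER for the integrated values (features x cells). This is NOT
# Harmony: each batch is simply mean-centred per marker so that the
# example runs without Harmony installed.
uncorrected <- assay(sce, "exprs")
corrected <- uncorrected
for (b in unique(sce$batch)) {
    i <- sce$batch == b
    corrected[, i] <- uncorrected[, i] - rowMeans(uncorrected[, i])
}

# Keep the uncorrected values as "exprs0" and make the integrated
# values the default "exprs" assay.
assay(sce, "exprs0") <- uncorrected
assay(sce, "exprs") <- corrected

assayNames(sce)
```

With this layout, anything that reads the "exprs" assay picks up the integrated values, while the uncorrected values remain available under "exprs0" for comparison. Note that the corrected matrix must have the same dimensions and dimnames as the object it is assigned into.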