CyToF Data Compensation using CATALYST

rana228 commented 1 year ago

Hi Helena,

Thank you for such a nice package.

I have used it for IMC data, and now I would like to compensate CyTOF data. I have already debarcoded and normalized the data (not with CATALYST) and now want to compensate it using CATALYST. Since the data is already debarcoded and normalized, I tried generating the spillover matrix while skipping the debarcoding step, as shown below:

```r
# Read FCS file
fcs_path <- "Unstimulated.fcs"
fcsfile <- flowCore::read.FCS(fcs_path, transformation = FALSE, truncate_max_range = FALSE)

# Prepare data
sce <- prepData(fcsfile)

# Estimate spillover matrix
sce <- computeSpillmat(sce)
Error in computeSpillmat(sce) : !is.null(metadata(x)$bc_key) is not TRUE
```

I couldn't find a solution to this error.

Is it mandatory to run the steps below before estimating spillover, even though my data doesn't require them?

```r
sce <- assignPrelim(sce, bc_ms)
sce <- applyCutoffs(estCutoffs(sce))
Error in optim(startVec, opfct, hessian = TRUE, method = optMethod, control = list(maxit = maxIt, :
initial value in 'vmmin' is not finite
```

I have attached a sample file.

Looking forward to your reply.

Thank you!!

Sample_FCS.txt

HelenaLC commented 1 year ago

Preprocessing of the multiplexed sample (including debarcoding) is independent of spillover estimation and compensation.
The latter depends on single-stained reference beads. These are (and need to be) debarcoded to assign beads to positive (stained for) channels, and to estimate the artifactual signal observed in negative (not stained for) channels.
So, briefly put, you need to run the beads through the debarcoding pipeline (assignPrelim, applyCutoffs) -- independent of whether or not your primary data has already been processed -- and then apply the estimated spillover matrix to your data using compCytof.
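As a rough sketch of how the two parts fit together, here is the sequence run on the example data shipped with CATALYST (ss_exp for single-stained beads, mp_cells for multiplexed cells). The variable names sce_beads and sce_cells are chosen for illustration only, and the bc_ms values match the example beads, not necessarily your panel:

```r
library(CATALYST)

# (1) single-stained beads: debarcode, then estimate spillover
data(ss_exp)                        # example bead data from CATALYST
bc_ms <- c(139, 141:156, 158:176)   # masses stained for in the example beads
sce_beads <- prepData(ss_exp)
sce_beads <- assignPrelim(sce_beads, bc_ms, verbose = FALSE)
sce_beads <- applyCutoffs(estCutoffs(sce_beads))
sce_beads <- computeSpillmat(sce_beads)
sm <- S4Vectors::metadata(sce_beads)$spillover_matrix

# (2) primary data: a separate object, compensated with the matrix from (1)
data(mp_cells)                      # example multiplexed cells from CATALYST
sce_cells <- prepData(mp_cells)
# overwrite = FALSE keeps the uncompensated assays alongside the compensated ones
sce_cells <- compCytof(sce_cells, sm, method = "nnls", overwrite = FALSE)
```

Note that the beads and the primary data live in two separate SingleCellExperiment objects; only the spillover matrix sm is passed from one to the other.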

rana228 commented 1 year ago

Hi Helena,

Thank you so much for your quick response and clarifying my doubts.

Following your suggestion, I now get this error at the following step:

```r
sce <- applyCutoffs(estCutoffs(sce))
Error in optim(startVec, opfct, hessian = TRUE, method = optMethod, control = list(maxit = maxIt, :
initial value in 'vmmin' is not finite
```

How can I resolve this?

Thank you!!

HelenaLC commented 1 year ago

I am confused now... is this sce from your IMC data and the same as the one in the original issue? Or was it generated from separate FCS files? Just to emphasise again: there should be two independent datasets in the pipeline when used in full, one for the multiplexed data (here, IMC) and one for single-stained beads (to estimate spillover), and hence two SingleCellExperiment objects.

rana228 commented 1 year ago

I have CyTOF data, and this sce is from the same CyTOF data as in the original issue. This is not IMC data; it is from single-stained beads.

HelenaLC commented 1 year ago

Okay. Then, in general, please try to post more information, e.g., run the code line by line and include relevant intermediate outputs (print the SCE in the first place, show how it was constructed, etc.); otherwise it's very difficult to follow and provide support. Also: please enclose code in designated code-blocks (three back-ticks) to aid readability.

rana228 commented 1 year ago

Sure, here is the code which I ran.

I used a single-stained control sample as input, which was already debarcoded and normalized outside CATALYST.

```r
# get single-stained control samples
fcs_path <- "Unstimulated.fcs"
# read FCS file
fcsfile <- flowCore::read.FCS(fcs_path, transformation = FALSE, truncate_max_range = FALSE)

# specify mass channels stained for & debarcode
bc_ms <- c(80, 89, 103:106, 108, 127, 138:156, 158:176, 191, 208, 209)
sce <- prepData(fcsfile)
sce
class: SingleCellExperiment
dim: 64 58370
metadata(1): experiment_info
assays(2): counts exprs
rownames(64): 80ArAr 190BCKG ... PD-1 pCREB
rowData names(4): channel_name marker_name marker_class use_channel
colnames: NULL
colData names(1): sample_id
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):

sce <- assignPrelim(sce, bc_ms)
Debarcoding data...
 o ordering
 o classifying events
Normalizing...
Computing deltas...

> sce
class: SingleCellExperiment
dim: 64 58370
metadata(2): experiment_info bc_key
assays(3): counts exprs scaled
rownames(64): 80ArAr 190BCKG ... PD-1 pCREB
rowData names(5): channel_name marker_name marker_class use_channel is_bc
colnames: NULL
colData names(3): sample_id bc_id delta
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):

sce <- applyCutoffs(estCutoffs(sce))
Error in optim(startVec, opfct, hessian = TRUE, method = optMethod, control = list(maxit = maxIt, :
initial value in 'vmmin' is not finite
```

I hope it's clear now.

Thank you!!

HelenaLC commented 1 year ago

Have you looked at the package documentation? There's quite a comprehensive description of the step-by-step workflow... e.g., section on compensation here... fwiw, I believe it's more helpful to have a look there first rather than trying out random fixes that get past one issue only to lead to another.

Also, the vignette clearly distinguishes between beads and sample data by using different variable names. From your code, I am still not sure what is what and which step you ran on which data etc. First thing that sticks out: comparing your code to the vignette, you cannot estimate spillover (computeSpillmat(sce)) before debarcoding (applyCutoffs(estCutoffs(sce))), because spillover estimation relies on previously identified positive and negative populations... So, please have a look at the vignette first, perhaps try to understand a little of what is happening before copying code, and I'll be happy to help if you still have issues; thank you.

rana228 commented 1 year ago

Yes, I am following the manual for compensation. Sorry, I didn't realize that I had mixed up the order of the steps while posting. I have edited my previous comment. Please have a look.

I will go through the manual again and try to re-run all steps.

Thank you for your time and patience.

HelenaLC commented 1 year ago

Okay, this looks more detailed, thanks. Could you maybe split the estCutoffs and applyCutoffs step, and run traceback() once the error occurs? That might show which part of the code exactly is failing -- thanks!!
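Something along these lines, sketched here on the example beads shipped with CATALYST (ss_exp) so that it runs as is; with your own data, sce and the barcode masses would be the ones from your comment above:

```r
library(CATALYST)

data(ss_exp)   # example single-stained beads from CATALYST
sce <- prepData(ss_exp)
sce <- assignPrelim(sce, c(139, 141:156, 158:176), verbose = FALSE)

# step 1: estimate separation cutoffs only
sce <- estCutoffs(sce)
# inspect the estimates before applying them;
# NA or implausible values may hint at which population is problematic
print(S4Vectors::metadata(sce)$sep_cutoffs)

# step 2: apply the estimated cutoffs
sce <- applyCutoffs(sce)

# if either step throws an error, call this right afterwards
# to see the chain of function calls that led to it:
# traceback()
```

If the estimation keeps failing, applyCutoffs also accepts manually specified cutoffs via its sep_cutoffs argument, which might serve as a fallback; whether that is appropriate depends on your data.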

rana228 commented 1 year ago

I did that already, and somehow it worked without any error when I split these steps; I don't know the exact reason. Now I have the spillover matrix. But I have a few more questions:

1. I have >50 FCS files to compensate. Is there a way to compensate them all in one go?
2. How do I save the compensated data to FCS?
3. Is it OK to generate the spillover matrix from one control sample and use it to compensate the rest of the samples/FCS files?

Thank you!!

HelenaLC commented 1 year ago

1. I don't see why not; then again, you might run into memory issues depending on how many cells there are, and there is no harm in writing a loop and processing the files one by one.
2. Again, please see the vignette for how to write FCS files from an SCE.
3. Not sure what you mean by "control sample", but yes, typically we use one set of beads to compensate a set of files, provided that the beads were measured in temporal proximity to the samples; otherwise, technical differences between runs that are far apart in time could make the compensation increasingly inaccurate.
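For the one-by-one route, a minimal sketch could look like the following. The helper compensate_fcs, the "comp_" file prefix, and the directory paths in the usage comment are all made up for illustration (they are not part of CATALYST), and sm is assumed to be the spillover matrix already estimated from the beads:

```r
library(CATALYST)
library(flowCore)

# hypothetical helper (not part of CATALYST): compensate a single FCS file
# with a previously estimated spillover matrix 'sm' and write it to 'out_dir'
compensate_fcs <- function(fcs_path, sm, out_dir) {
    ff <- read.FCS(fcs_path, transformation = FALSE, truncate_max_range = FALSE)
    sce <- prepData(ff)
    # overwrite = TRUE replaces the "counts" assay with compensated counts
    sce <- compCytof(sce, sm, method = "nnls", overwrite = TRUE)
    # without 'split_by', sce2fcs returns a single flowFrame
    ff_comp <- sce2fcs(sce, assay = "counts")
    out_path <- file.path(out_dir, paste0("comp_", basename(fcs_path)))
    write.FCS(ff_comp, out_path)
    out_path
}

# usage, assuming 'sm' exists and the placeholder directories are real:
# fcs_files <- list.files("path/to/fcs", pattern = "\\.fcs$", full.names = TRUE)
# for (f in fcs_files) compensate_fcs(f, sm, out_dir = "path/to/compensated")
```

Only one file is held in memory at a time this way, which sidesteps the memory concern from point 1.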

HelenaLC / CATALYST

CyToF Data Compensation using CATALYST #333