cells annotated with the same cell type may be separated in different clusters.
That would indicate that either the batch correction went wrong or that the different clusters capture a somewhat profound difference between cells of the same cell type.
@friedue thank you for your quick response! But I wonder: what if the expression is indeed affected by a batch effect? Why wouldn't we apply the corrected data in this situation?
What's the input of SingleR (or other automatic cell annotation tools) if data were batch-corrected?
Counts, counts, counts. (Or monotonic transformations thereof.)
Don't give SingleR() batch-corrected values; there is no guarantee that the correction preserves the ordering of expression values within each cell. In fact, there is no guarantee that the corrected expression values retain relevant biological information; see my comments here.
To me, the only purpose for corrected expression values is to (i) generate a common set of clusters across all batches for easier interpretation and (ii) make a pretty t-SNE plot. All other analyses that can use the original counts should do so. For example, DE analyses should use the original counts (or log-expression values) and block on the batch to account for the batch effect.
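To make that division of labour concrete, here is a minimal sketch; the object name obj and the use of the built-in ImmGenData() reference are assumptions borrowed from later in this thread:

library(SingleR)

# annotate from the raw counts (or a monotonic transformation thereof),
# not from batch-corrected values
ref  <- SingleR::ImmGenData()
pred <- SingleR(test = obj[["RNA"]]@counts, ref = ref, labels = ref$label.main)

# batch-corrected values are kept only for clustering and for t-SNE/UMAP plots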
But I wonder: what if the expression is indeed affected by a batch effect? Why wouldn't we apply the corrected data in this situation?
Now, that's the thing. SingleR is already correcting for a batch effect - the difference between your single-cell dataset and the references! These are big technical differences: if you're using the in-built references, then we're comparing single-cell data with microarrays. That's ancient history right there. I mean, that's a technology almost as old as me (or more, depending on how you count it).
If SingleR can deal with these massive technological differences, your little between-batch effects are nothing in comparison. So just let SingleR figure it out. As mentioned above, giving it corrected data may actually make the situation worse because the correction imposes many assumptions about which subpopulations should match up across batches.
The batch correction only influences the tSNE/UMAP clustering result
I have to chip in here. Clustering is an independent process from visualization by t-SNE/UMAP. It is generally not a good idea to cluster on the t-SNE/UMAP coordinates; see the arguments here.
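For instance, a rough Seurat-style sketch of that separation of concerns (the dims and resolution values are arbitrary placeholders, not recommendations from this thread):

obj <- FindNeighbors(obj, reduction = "pca", dims = 1:30)  # cluster in PCA space
obj <- FindClusters(obj, resolution = 0.5)
obj <- RunUMAP(obj, dims = 1:30)                           # UMAP is for visualization only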
and thus cells annotated with the same cell type may be separated in different clusters.
Happens on occasion. For example, the annotation separates T cells into CD4+ and CD8+, but the clustering might partition them into a naive/stimulated grouping because that's the bigger axis of variation. No one's doing anything wrong here; both procedures are doing their job, but the annotation has access to prior biological knowledge (through the sweat and tears of whoever generated and annotated the reference dataset) and the clustering does not.
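A quick way to see this mismatch for yourself, assuming pred holds the SingleR results and the cluster assignments live in obj$seurat_clusters (both names are assumptions about your objects):

table(label = pred$labels, cluster = obj$seurat_clusters)
# one label spread across several clusters, or several labels in one cluster,
# is exactly the situation described above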
Personally, I think it's more interesting when the two don't match up, as this tells me that there is something novel in the dataset that isn't captured by the existing annotation. If it's exactly the same... well, congratulations, you've just recapitulated known biology.
Thank you so much! It's clearer to me now.
SingleR is already correcting for a batch effect
For some reason, I'm still using R 3.6.0 with SingleR v1.0.1. I saw that the newest version of SingleR compatible with R 3.6.0 is v1.0.6. Does that make any difference in how the batch effect is handled? (I mean, were any changes to the logic of the code made?)
Nothing changed in SingleR(), if that's what you're worried about.
Just to summarize Aaron's eloquent reply:
But I wonder: what if the expression is indeed affected by a batch effect?
You may need to take a step back to think about the conceptual differences between the different methods you're mentioning, including dimensionality reductions, DE tests, clustering, and SingleR.
SingleR looks at every single cell individually and compares it to the reference data. You could run it with just one cell in your matrix and still get a result. It does not care whether the expression values in, say, two T cells of your sample differ slightly, since the comparison of interest for SingleR is the one between a single cell from your data set and the reference. This is why Aaron pointed out that SingleR has already proven itself capable of extracting meaningful signal despite the technical differences that will definitely exist between that one cell from your sample and the reference data that you're comparing it to.
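As a small illustration of the "one cell is enough" point (a sketch reusing the ref and obj objects that appear later in this thread):

one.cell <- obj[["RNA"]]@counts[, 1, drop = FALSE]  # drop = FALSE keeps it a matrix
SingleR(test = one.cell, ref = ref, labels = ref$label.main)
# still returns a DataFrame with scores and a label for that single cell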
Why wouldn't we apply the corrected data in this situation?
Because any type of correction makes certain assumptions and inevitably introduces noise/artifacts. Since there are constantly new kids on (for) the block in terms of normalization etc., it is much safer to develop tools that will know how to deal with the least processed data, i.e. counts.
Got it. Thank you all for the clear explanation, which is really really helpful.
You could run it with just one cell in your matrix and still get a result.
Following up on this, what I would like to ask further is: if I want to use the scores after the fine-tuning step, where can I find them? It seems that the scores slot of SingleR only saves the pre-fine-tuning scores (https://github.com/LTLA/SingleR/issues/26#issuecomment-525008846). How can I convince myself that a certain cell was labeled with a promising reference after fine-tuning?
The Annotation Diagnostics section of the vignette has some methods for determining if your labels generally make sense.
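For example, two of the diagnostics described there (a sketch, assuming pred is the SingleR result shown below):

SingleR::plotScoreHeatmap(pred)     # per-cell scores across all reference labels
summary(is.na(pred$pruned.labels))  # how many calls were flagged as low-confidence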
To be a bit more specific, there are more slots in the resultant DataFrame object than I had described in the comment linked above. One is the $tuning.scores slot, and this is where you can find the best ("first") and second-best ("second") scores from the fine-tuning steps. The first score will be the score associated with the assigned cell type (the labels for that cell).

To prune calls based on these scores, you can use pruneScores() and provide a minimum difference to the min.diff.next input. But note that applying a cutoff based on these values is not always recommended: when there are very similar cell types in the ref (like memory T cells versus central memory T cells), it makes sense for the scores of such labels to be similar.
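A minimal sketch of that pruning step; the 0.05 cutoff is an arbitrary example, not a recommended value:

margin   <- pred$tuning.scores$first - pred$tuning.scores$second  # gap between best and runner-up
to.prune <- pruneScores(pred, min.diff.next = 0.05)               # TRUE = call considered unreliable
summary(to.prune)                                                 # how many cells would be discarded
pred$labels[to.prune] <- NA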
Thank you all again. Not sure if I understand it correctly. Here's what I see:
ref <- SingleR::ImmGenData()
pred <- SingleR(test=obj[["RNA"]]@counts, ref=ref, labels=ref$label.main)
names(pred)
[1] "scores" "first.labels" "tuning.scores" "labels"
[5] "pruned.labels"
This is what you mentioned in https://github.com/LTLA/SingleR/issues/26#issuecomment-525008846, where the first cell was accordingly labeled as "Tgd".
The first.labels column holds the labels with the top scores from before fine-tuning was run.
pred$scores[1, which.max(pred$scores[1,])]
Tgd
0.6406436
pred$first.labels[1]
[1] "Tgd"
What about the fine-tuning result? Does pred$tuning.scores[1,1] (0.35866) correspond to the score after fine-tuning for pred$labels[1] ("T cells")? If so, does the "second" column of pred$tuning.scores show how close a cell was to being given another label, e.g. does pred$tuning.scores[3, ] indicate the possibility of cell 3 being labeled as the other cell type in the fine-tuning step?
pred$labels[1]
[1] "T cells"
head(pred$tuning.scores)
DataFrame with 6 rows and 2 columns
first second
<numeric> <numeric>
1 0.35866196429129 0.278526797377246
2 0.348669685182785 0.245577057489904
3 0.385737579718929 0.350935673578111
4 0.351770799684903 0.293423697297476
5 0.426556808282135 0.357083131625234
6 0.404420979309843 0.337577509812081
Does pred$tuning.scores[1,1] (0.35866) correspond to the score after fine-tuning for pred$labels[1] ("T cells")?
Yes. The tuning.scores reflect the scores from the final round of fine-tuning. The "first" column is the score for the cell type that will be in $labels, and the "second" column is the score for whatever the next-best reference cell type was. Rows, as elsewhere, are your different cells.

(Neither column is guaranteed to reflect a score for the first.labels cell type; that cell type might not be reselected at all after multiple rounds of fine-tuning.)
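A quick way to see how often fine-tuning changed the initial call (a sketch using the pred object above):

mean(pred$labels == pred$first.labels)  # fraction of cells whose pre-fine-tuning label survived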
Got it. Thank you all so much and thanks for maintaining the SingleR package.
Hi all, My question is: What's the input of SingleR (or other automatic cell annotation tools) if data were batch-corrected? If the recommended input of SingleR is the count matrix (e.g. SeuratObject[["RNA"]]@counts), it seems that the result of SingleR will still be the same even after batch correction. The batch correction only influences the tSNE/UMAP clustering result, and thus cells annotated with the same cell type may be separated in different clusters. It's quite confusing. Any ideas would be appreciated. Thanks in advance!