satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Regressing out nCountRNA in scRNA before integrating with scATAC #3308

Closed RegnerM2015 closed 4 years ago

RegnerM2015 commented 4 years ago

This is a follow-up to issue #2443.

I am working with matched scRNA-seq/scATAC-seq data. I want to regress out the nCountRNA variable in the scRNA data only, before running FindTransferAnchors(), which is the first step in transferring the scRNA labels onto the scATAC cells. However, in the source code for FindTransferAnchors(), the internal ScaleData() calls do not pass vars.to.regress, so no variable is regressed out:

## find anchors using CCA
  if (reduction == 'cca') {
    if (normalization.method == "LogNormalize") {
      reference <- ScaleData(object = reference, features = features, verbose = FALSE)
      query <- ScaleData(object = query, features = features, verbose = FALSE)
    }
    combined.ob <- RunCCA(
      object1 = reference,
      object2 = query,
      features = features,
      num.cc = max(dims),
      renormalize = FALSE,
      rescale = FALSE,
      verbose = verbose
    )
  }

Would it be valid to modify FindTransferAnchors() so that the reference (scRNA) object is scaled with this regression, as follows?

## find anchors using CCA
  if (reduction == 'cca') {
    if (normalization.method == "LogNormalize") {
      reference <- ScaleData(object = reference, features = features, verbose = FALSE, vars.to.regress = "nCountRNA")
      query <- ScaleData(object = query, features = features, verbose = FALSE)
    }
    combined.ob <- RunCCA(
      object1 = reference,
      object2 = query,
      features = features,
      num.cc = max(dims),
      renormalize = FALSE,
      rescale = FALSE,
      verbose = verbose
    )
  }

Thanks in advance!

timoast commented 4 years ago

I tested this in a couple of cases and it tends to do slightly worse than the standard approach, so I wouldn't really recommend it. You can certainly try with your own data, but we probably won't add this as an option in the Seurat codebase.
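
For readers weighing whether to try this on their own data, here is a rough base R sketch of what regressing out a per-cell covariate before scaling does conceptually: fit a linear model per gene, keep the residuals, then center and scale them. It is not Seurat's implementation. The toy matrix, the `n_count` vector (a stand-in for the total-count variable discussed above), and the helper `regress_and_scale()` are all hypothetical names made up for illustration.

```r
# Toy illustration in base R only; this does not call Seurat.
set.seed(1)
n_cells <- 200

# Hypothetical stand-in for total RNA counts per cell
n_count <- rpois(n_cells, lambda = 5000)

# Two toy "genes": one driven by sequencing depth, one independent of it
expr <- rbind(
  gene_depth = 0.01 * n_count + rnorm(n_cells, sd = 0.5),
  gene_other = rnorm(n_cells, mean = 2, sd = 1)
)

# Hypothetical helper: per gene, regress expression on the covariate,
# then center and scale the residuals
regress_and_scale <- function(mat, covariate) {
  out <- t(apply(mat, 1, function(y) {
    res <- residuals(lm(y ~ covariate))
    as.numeric(scale(res))
  }))
  dimnames(out) <- dimnames(mat)
  out
}

scaled <- regress_and_scale(expr, n_count)

# Correlation with depth before and after regression
cor(expr["gene_depth", ], n_count)
cor(scaled["gene_depth", ], n_count)
```

After regression, the residuals are uncorrelated with `n_count` by construction, so any structure in the reference that tracks sequencing depth is removed before anchors are found. Because only the reference would be treated this way in the modification above, the reference and query would end up scaled differently, so it is worth comparing transfer results with and without the regression on your own data.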