satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

How to use external variable in a predicate expression for WhichCells() ? #7358

Closed timothevanmeter closed 1 year ago

timothevanmeter commented 1 year ago

Hello,

I am trying to subset cells that express any pair of genetic markers belonging to two distinct cell macro-types.

For each cellular macro-type, I have a list of genetic markers that identify it. What I want is to exclude all cells that express markers identifying two or more distinct macro-types.

I wanted to use WhichCells() to subset these cells, as I have the names of the markers in a list which I can access in the following way:

str(markers.nathans)
markers.nathans[[1]][1]

With the following output:

List of 11
 $ amacrine_cells            : chr [1:8] "Gad1" "Slc6a9" "Stx1b" "Calb2" ...
 $ astrocytes                : chr [1:4] "Pax2" "Gfap" "Vim" "Glul"
 $ cone_bipolar_cells        : chr [1:9] "Scgn" "Grik1" "Vsx1" "Lhx4" ...
 $ cones                     : chr [1:5] "Opn1sw" "Opn1mw" "Gnat2" "Arr3" ...
 $ horizontal_cells          : chr [1:4] "Lhx1" "Calb1" "Gja10" "Onecut1"
 $ muller_glia               : chr [1:9] "Slc1a3" "Apoe" "Dkk3" "Gpr37" ...
 $ perivascular_cells        : chr [1:6] "Myl9" "Cspg4" "Pdgfrb" "Myh11" ...
 $ retinal_ganglion_cells    : chr [1:4] "Sncg" "Pou4f1" "Nrn1" "Slc17a6"
 $ rod_bipolar_cells         : chr [1:7] "Trpm1" "Grm6" "Sebox" "Prkca" ...
 $ rods                      : chr [1:5] "Rho" "Gnat1" "Cnga1" "Nrl" ...
 $ vascular_endothelial_cells: chr [1:8] "Cldn5" "Cdh5" "Pecam1" "Vwf" ...

'Gad1'

However, the following command:

length(WhichCells(rfull.combined, expression = markers.nathans[[1]][1] > 0, slot = 'counts'))

returns:

Error in FetchData.Seurat(object = object, vars = unique(x = expr.char[vars.use]), : None of the requested variables were found:

In contrast, the same command with the marker's name written directly, without quotes:

length(WhichCells(rfull.combined, expression = Gad1 > 0, slot = 'counts'))

correctly returns the number of cells expressing Gad1:

2371

I tried a couple of things, hoping to match the format expected by expression = in WhichCells(), but without any luck so far. Is there any way to make this work?

I have a workaround that is quite lengthy:

length(
  rfull.combined@assays$RNA@data[
    rownames(rfull.combined@assays$RNA@data) == markers.nathans[[1]][1],
    rfull.combined@assays$RNA@data[rownames(rfull.combined@assays$RNA@data) == markers.nathans[[1]][1], ] > 0
  ]
)

I would really prefer to solve the WhichCells() problem rather than use the ugly line above. Thank you in advance for any help or advice you can provide.

mhkowalski commented 1 year ago

Hi,

You can use something like this:

> length(WhichCells(obj, expression = Klf1 > 0))
[1] 515
> t = "Klf1"
> sum(GetAssayData(obj, slot="data", assay="RNA")[t,]>0)
[1] 515
> # to get barcodes (wrapped in length() here only to show the count)
> length(Cells(obj)[GetAssayData(obj, slot="data", assay="RNA")[t,]>0])
[1] 515
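
The same row-indexing idea can be extended to the original goal of excluding cells that express markers from two or more macro-types. The block below is only a minimal sketch under stated assumptions: `counts` is a small made-up dense matrix standing in for what `GetAssayData(obj, slot = "counts", assay = "RNA")` would return, `markers` is a hypothetical two-entry subset of `markers.nathans`, and a macro-type is treated as "expressed" in a cell when any one of its markers has a non-zero count.

```r
# Toy stand-in for a genes x cells count matrix (values are invented).
# With a Seurat object this would presumably come from
# GetAssayData(obj, slot = "counts", assay = "RNA").
counts <- matrix(
  c(3, 0, 0, 1,   # Gad1
    0, 0, 2, 0,   # Slc6a9
    0, 5, 0, 2,   # Rho
    0, 0, 0, 0),  # Gnat1
  nrow = 4, byrow = TRUE,
  dimnames = list(c("Gad1", "Slc6a9", "Rho", "Gnat1"),
                  c("cell1", "cell2", "cell3", "cell4"))
)

# Hypothetical two-entry subset of the marker list from the question.
markers <- list(
  amacrine_cells = c("Gad1", "Slc6a9"),
  rods           = c("Rho", "Gnat1")
)

# For each macro-type, flag cells with a non-zero count for any of its markers.
# Markers absent from the matrix are dropped instead of raising an error.
type_hits <- sapply(markers, function(genes) {
  genes <- intersect(genes, rownames(counts))
  colSums(counts[genes, , drop = FALSE] > 0) > 0
})

# type_hits is a cells x macro-types logical matrix; count macro-types per cell.
n_types <- rowSums(type_hits)

ambiguous_cells <- names(n_types)[n_types >= 2]  # markers from 2+ macro-types
cells_to_keep   <- names(n_types)[n_types < 2]
```

In this toy data only `cell4` has counts for both an amacrine marker and a rod marker, so it is the one flagged. Because the gene names are ordinary character vectors here, the marker list can be used directly as a variable. On a real sparse matrix, `colSums()` from the Matrix package would likely be needed in place of the base version, and the "any marker > 0" rule is an assumption that may be too permissive for noisy data.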