dviraran / SingleR

SingleR: Single-cell RNA-seq cell types Recognition (legacy version)
GNU General Public License v3.0

Functions in SingleR from bioconductor #111

Open hfberg opened 4 years ago

hfberg commented 4 years ago

Hi! I just upgraded to SingleR from Bioconductor. The new code is very efficient, well done! However, I'm missing some parts from the old SingleR version, and I haven't been able to find information about them in the vignette or among the functions in the new package.

> class(pred.main)
[1] "DFrame"
attr(,"package")
[1] "S4Vectors"

I think that's all for now, thanks again.

dviraran commented 4 years ago

Hi,

Yes, currently the browser doesn't work with the new object. I don't have the time to fix that right now (if anyone wants to take it on, that would be great). SingleR now works as an add-on to any of your favorite scRNA-seq analytics and visualization tools (Seurat, scater, etc.).

Regarding creating tSNE plots - @dtm2451, can you share some code using your package?

Best, Dvir

dtm2451 commented 4 years ago

Hello @hfberg,

Here is some code that should get you the tSNE plot you would like:

# Install dittoSeq and load:
devtools::install_github("dtm2451/dittoSeq@development")
library(dittoSeq)

## If you ran SingleR with this code:
# If you had a Seurat object (as.SingleCellExperiment() is Seurat's converter)
pred <- SingleR(test = as.SingleCellExperiment(mydata), ...)
# If you had a SingleCellExperiment
pred <- SingleR(test = mydata, ...)

# Either way, with dittoSeq, you can make a tSNE plot with your data and labels using:
dittoDimPlot(pred$labels, object = mydata)

# Bonus: for showing pruned labels, I recommend using:
dittoDimPlot(pred$labels, object = mydata, cells.use = !is.na(pred$pruned.labels))

During the Bioconductor update planning process, we decided that rather than creating an internal tSNE function, which would need to be maintained separately, we would outsource that to another package. Namely, we plan to utilize my own upcoming dittoSeq package, which is called in the code above.

Hope this helps! Dan

hfberg commented 4 years ago

OK, thanks guys, I'll have a look at the dittoSeq package. :) Would there be any reason to refrain from adding the "labels" or "pruned.labels" from SingleR to active.ident in Seurat? The tSNE can then be plotted with the new identities using DimPlot. I got the plot using this code, but will there be any issues if, for example, the number of pruned labels is smaller than the number of cells in the Seurat object?

Thanks again for the help! :)

seurat@active.ident <- as.factor(singler.main@listData[["pruned.labels"]])
names(seurat@active.ident) <- colnames(seurat@assays[["RNA"]]@data)
head(seurat@active.ident)
TCTACCCCATCTGTAATG TCTACCACACCCGCCCTC CTCGCAAACCTAAAAGTT CCATCTAACCTAAAAACG
        Mast cells        Fibroblasts        Macrophages        Fibroblasts
TCTACCCAAAGTTAGCAT TCTACCATCTCTCTTCTG
       Macrophages            T cells
14 Levels: Basophils DC Epithelial cells Fibroblasts ILC Macrophages ... Tgd

DimPlot(object = seurat, reduction = "tsne")

dtm2451 commented 4 years ago

I recommend adding them as metadata instead, because then it's almost as easy to plot and you can store as many label sets as you like:

# Store both label sets as metadata columns on the Seurat object
seurat$main.pruned.labels <- singler.main$pruned.labels
seurat$main.labels <- singler.main$labels

# Color the tSNE by a metadata column via group.by
DimPlot(seurat, reduction = "tsne", group.by = "main.pruned.labels")
DimPlot(seurat, reduction = "tsne", group.by = "main.labels")

Also, I'm not sure how Seurat's plotters handle NAs. (That's what gets placed instead of the label for any pruned calls.) So I can't say for sure, but you can run this to find out if those cells just don't get plotted:

# Assumes Idents(seurat) currently holds the pruned labels, as in the code above
noNAs <- as.character(Idents(seurat))
noNAs[is.na(noNAs)] <- "pruned"
seurat$main.pruned.labels.noNAs <- as.factor(noNAs)
DimPlot(seurat, reduction = "tsne", group.by = "main.pruned.labels.noNAs")

^^^ If you see a new group called "pruned" that didn't show up before, then Seurat leaves NA cells out of the plot and doesn't throw a warning about it either.
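On the earlier question about pruned labels being "fewer" than the cells: since pruned calls are replaced by NA rather than removed, the pruned vector should line up one-to-one with the cells. Here is a minimal base-R sketch of that idea and of the NA recoding trick. The two vectors are made-up stand-ins for singler.main$labels and singler.main$pruned.labels, not real output:

```r
# Hypothetical stand-ins for singler.main$labels and singler.main$pruned.labels
labels        <- c("Mast cells", "Fibroblasts", "Macrophages", "T cells")
pruned.labels <- c("Mast cells", NA,            "Macrophages", "T cells")

# Pruned calls become NA, so both vectors keep one entry per cell
same_length <- length(labels) == length(pruned.labels)

# Recode NA as an explicit "pruned" level so every cell belongs to a group
no_nas <- pruned.labels
no_nas[is.na(no_nas)] <- "pruned"
no_nas <- factor(no_nas)

# Count cells per group, including the new "pruned" group
table(no_nas)
```

Because the lengths match, assigning either vector to a metadata column should not misalign cells; the only open question is how a given plotter treats the NA entries.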

Also, one extra suggestion while I'm here. I want to commend you for picking up the raw structures of these objects, but you're making many direct structural calls here when Idents(seurat) <- as.factor(singler.main$pruned.labels) would get the job done AND is less likely to break in the future. Future compatibility is the reason that using getter and setter functions is a highly recommended best practice. Seurat is out of the norm in that its developers don't seem to care about keeping things working the same way between version updates. S4Vectors, however, is not like that, and the same goes for the BioC SingleR. Using singler.main$pruned.labels is more likely to give the same result in all future versions than your direct access method.
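To illustrate the getter-versus-slot point in isolation, here is a small sketch using a toy S4 class. The class ToyResult, its store slot, and the prunedLabels() accessor are all hypothetical names invented for this example; they are not part of Seurat, S4Vectors, or SingleR:

```r
library(methods)

# Hypothetical toy class; NOT the real Seurat or DFrame structure
setClass("ToyResult", representation(store = "list"))

# Getter: the single place that knows where the labels live internally
setGeneric("prunedLabels", function(x) standardGeneric("prunedLabels"))
setMethod("prunedLabels", "ToyResult", function(x) x@store[["pruned.labels"]])

res <- new("ToyResult", store = list(pruned.labels = c("T cells", NA)))

# Direct access depends on the internal layout; it breaks if 'store' is renamed
direct <- res@store[["pruned.labels"]]

# The accessor keeps working as long as the maintainers update its method
via_getter <- prunedLabels(res)

identical(direct, via_getter)
```

Both routes return the same vector today; the difference is that only the accessor is something package maintainers typically promise to keep stable.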

On the dittoSeq side:

I may be biased lol, but I think my plots are prettier by default and easier to modify. They definitely throw a warning (but keep going) for NAs/missing data. Also, they default to using "tsne", and they grab the clustering from Seurat objects when you give "ident", just like, I think, Seurat's plotters do. So, to make your life easy when you do try dittoSeq: your code would be dittoDimPlot("ident", seurat) if you keep the data stored in the clustering slot. And if you used the raw pruned.labels with NAs but wanted the NA cells to show up as grayed-out background dots, you could use dittoDimPlot("ident", seurat, cells.use = !is.na(meta("ident", seurat))).