carmonalab / ProjecTILs

Interpretation of cell states using reference single-cell maps
GNU General Public License v3.0

predictTilState ERROR #4

Closed france-hub closed 3 years ago

france-hub commented 3 years ago

Hi!

Thanks for your package.

I am trying to follow this case study, https://carmonalab.github.io/ProjecTILs_CaseStudies/SadeFeldman_ortho.html, using my own scRNA-seq data (CD3+ T cells).

However, when I run make.projection on my Seurat object I get the following error:

Error in predictTilState(sce, human = human) : Too many genes not found
In addition: Warning message:
In predictTilState(sce, human = human) : The following genes were not found in the dataset provided CD2,CD3D,CD3E,CD247,LCK,CD8B,CD8A,CD4,...

When I run:

c("CD3D", "CD3E", "CD247", "LCK") %in% row.names(sobj)

I get:

TRUE TRUE TRUE TRUE

Here, sobj is my Seurat object.

I noticed that in query_example_seurat the gene names are capitalized differently, with only the first letter in upper case (e.g. "Cd3d"). However, even if I convert my gene names to that format, I get the same error.

Could you please help me?

Thanks a lot

Francesco
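
A note for readers hitting the same error: the check above covers only four genes, while the warning lists more. The sketch below, in base R, computes what fraction of a marker list is present among a set of row names. The `query_genes` vector is a hypothetical stand-in for row.names(sobj), and `markers` holds only the genes visible in the truncated warning, so this illustrates the check rather than reproducing what predictTilState does internally.

```r
# Hypothetical stand-in for row.names(sobj); replace with your own gene names.
query_genes <- c("CD3D", "CD3E", "CD247", "LCK", "GAPDH")

# Genes visible in the (truncated) warning message; the full list is longer.
markers <- c("CD2", "CD3D", "CD3E", "CD247", "LCK", "CD8B", "CD8A", "CD4")

# Which markers are present, which are missing, and the overall fraction found.
found <- markers %in% query_genes
missing <- markers[!found]
frac_found <- mean(found)

print(missing)
print(frac_found)
```

Checking the whole list at once, rather than a handful of genes, shows whether the overlap is low enough to explain a "too many genes not found" failure.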

mass-a commented 3 years ago

Hello Francesco,

From your gene names, I guess you have human data. Can you confirm that you are running make.projection with the human.ortho=TRUE option?

If that is the case, in which assay of the Seurat object are the data stored? And do you have a "data" or "counts" slot for that assay?

france-hub commented 3 years ago

Thank you for your reply.

Yes, I am running:

query.projected <- make.projection(sobj, ref = ref, human.ortho = T)

The assay is RNA and I have "data" and "counts" slots in that assay.

Thank you again,

Francesco

mass-a commented 3 years ago

It seems that TILPRED doesn't get the data in the right format, for some reason.

Are the data correctly converted to the sce format if you run the following commands?

dim(sobj@assays$RNA@counts)
dim(sobj@assays$RNA@data)
sce <- as.SingleCellExperiment(sobj)
dim(sce)
c("CD3D", "CD3E", "CD247", "LCK") %in% row.names(sce)

and then

sce.pred <- predictTilState(sce, human=TRUE)

What version of TILPRED do you have?

packageVersion("TILPRED")

france-hub commented 3 years ago

This is the output:

> dim(sobj@assays$RNA@counts)
[1] 8781 26592
> dim(sobj@assays$RNA@data)
[1] 8781 26592
> sce <- as.SingleCellExperiment(sobj)
> dim(sce)
[1] 8781 26592
> c("CD3D", "CD3E", "CD247", "LCK") %in% row.names(sce)
[1] TRUE TRUE TRUE TRUE
> sce.pred <- predictTilState(sce, human=TRUE)
Error in predictTilState(sce, human = TRUE) : Too many genes not found
In addition: Warning message:
In predictTilState(sce, human = TRUE) : The following genes were not found in the dataset provided CD19,NAPSB,CD22,NCF1C,IGLL5,VPREB3,ADAM28,CIITA,FCRL1,HLA-DOB,NCF1B,LY86,ELK2AP,CR2,PKIG,CYBASC3,IGJ,MFAP5,FBLN1,COL1A2,BGN,DCN,COL1A1,COL3A1,SPARC,C1S,LUM,MMP2,THY1,PCOLCE,PMP22,SFRP4,MGP,FSTL1,C3,C1R,SERPING1,EFEMP1,FN1,NNMT,FBLN2,CTSK,DPT,SFRP2,CXCL12,PPIC,CXCL14,RARRES2,MFAP4,WISP2,COMP,CTHRC1,CTGF,PLA2G2A,COL8A1,TAGLN,POSTN,PTRF,GNG11,CALD1,S100A16,CNN3,TM4SF1,A2M,EGFL7,HYAL2,RAMP2,TFPI,CYR61,RAMP3,AQP1,SPARCL1,C10orf10,RNASE1,FABP4,CCL14,DARC,CCL21,CD14,FAM26F,TMEM176B,CPVL,GPX1,CTSL1,C1QA,C1QB,C1QC,VSIG4,APOC1,SPP1,APOE,IL8,SEPP1,CXCL10,S100A13,PLP1,MLANA,PRAME,TYR,APOD,PMEL,MIA,QPCT,GDF15,DCT,APOC2,SERPINA3,TYRP1 . Doesn't look too bad but prediction performance might be affected.
> packageVersion("TILPRED")
[1] ‘1.0.1’

mass-a commented 3 years ago

Ok I see now. Too many genes are missing and TILPRED fails to score its signatures.

I think you have two options:

1) Include more genes from the expression matrix (currently you have 8781 genes; I guess some were filtered out?).

2) Or, if you are sure that you have only T cells (you mentioned your data are CD3+), simply run ProjecTILs without the TILPRED signature filter:

query.projected <- make.projection(sobj, ref = ref, human.ortho = T, filter.cells=FALSE)

I hope this helps.

-massimo
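
A note on the first option: one plausible reason for missing signature genes, assuming the matrix was filtered by a minimum-expression rule, is that genes for non-T-cell lineages are never detected in a pure T-cell dataset and so get dropped. The base R sketch below illustrates this with a toy count matrix; the `min_cells` threshold, the cell names, and the counts are all hypothetical and not taken from the thread.

```r
# Toy count matrix (genes x cells); values are made up for illustration.
counts <- matrix(
  0L, nrow = 4, ncol = 6,
  dimnames = list(c("CD3D", "CD3E", "CD19", "COL1A1"), paste0("cell", 1:6))
)
counts["CD3D", ] <- c(3L, 5L, 2L, 4L, 1L, 6L)
counts["CD3E", ] <- c(1L, 0L, 2L, 3L, 0L, 1L)
# CD19 and COL1A1 stay at zero, as might happen in sorted T cells.

# Hypothetical filter: keep genes detected in at least `min_cells` cells.
min_cells <- 3
n_expressing <- rowSums(counts > 0)
kept <- rownames(counts)[n_expressing >= min_cells]

# Signature genes that did not survive the filter.
signature <- c("CD3D", "CD19", "COL1A1")
dropped <- setdiff(signature, kept)

print(kept)
print(dropped)
```

If this is what happened, keeping all genes in the expression matrix (even those with zero counts) would let the signature genes be found again.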

france-hub commented 3 years ago

Thank you! I have tried your second option, but it seems that still too many genes are missing:

query.projected <- make.projection(sobj, ref = ref, human.ortho = T, filter.cells = FALSE, skip.normalize = TRUE)
[1] "Using assay integrated for query"
[1] "Transforming expression matrix into space of mouse orthologs"
Error in make.projection(sobj, ref = ref, human.ortho = T, filter.cells = FALSE,  : 
  Too many genes missing. Check input object format

Thank you for your help and patience,

Francesco

mass-a commented 3 years ago

From your output, I see you are using the "integrated" assay for the query.

Is that what you want? If so, how many genes do you have in that assay?

dim(sobj@assays$integrated@data)

Otherwise you should switch to the "RNA" assay, as seemed to be your intention from a previous comment.
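
To make the assay question concrete, here is a minimal base R sketch that compares how well each assay's gene list covers a set of reference genes. The `assay_genes` list is a hypothetical stand-in for the row names of each assay in a Seurat object, and `ref_genes` is a made-up reference gene set already expressed as human symbols; neither comes from the thread.

```r
# Hypothetical gene names per assay; an integrated assay often holds far fewer genes.
assay_genes <- list(
  RNA = c("CD3D", "CD3E", "CD247", "LCK", "CD8A", "CD4"),
  integrated = c("CD3D", "LCK")
)

# Hypothetical reference genes the projection would need.
ref_genes <- c("CD3D", "CD3E", "CD247", "LCK", "CD8A")

# Fraction of reference genes present in each assay.
coverage <- sapply(assay_genes, function(genes) mean(ref_genes %in% genes))
best_assay <- names(which.max(coverage))

print(coverage)
print(best_assay)
```

With a real Seurat object, the active assay can be inspected with DefaultAssay(sobj) and changed with DefaultAssay(sobj) <- "RNA" before calling make.projection.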

france-hub commented 3 years ago

Thank you very much for your suggestions. Apologies for the delayed reply.

Francesco