soumelis-lab / ICELLNET


About using 10X Visium data #10

Closed Pedramto89 closed 1 year ago

Pedramto89 commented 1 year ago

I am trying to use Visium data. However, it seems ICELLNET does not support Visium RDS data, as I get an error when I run this code: average.clean = sc.data.cleaning(object = seurat, db = db, filter.perc = filter.perc, save_file = T, path="path/", force.file = F)

This is the error: "Error: Cannot find 'RNA' in this Seurat object"

The RDS file saved from the processed Visium Seurat object contains two assays, Spatial and SCT, and no RNA assay. Do you have any solution?

lmassenet-regad commented 1 year ago

Dear Pedramto89, thank you for raising this issue. We will correct the package in the coming days to make it compatible with spatial transcriptomics Seurat objects. In the meantime, you can adapt the sc.data.cleaning function below by replacing the 'RNA' assay with 'SCT' or 'Spatial' (line 3).


sc.data.cleaning <- function (object=object, db = db, filter.perc=NULL, save_file=T, path=NULL, force.file=FALSE) {
  if (!file.exists(paste0(path, "scRNAseq_statsInfo_for_ICELLNET.csv")) | force.file==T){
    data=object[['SCT']]@data
    int = dplyr::intersect(as.matrix(rownames(data)), as.matrix(db[, 1:5]))
    # restrict only to gene restricted to matrix
    data=data[which(rownames(data)%in%int),]
    data.int=as.data.frame(expand.grid(rownames(data), unique(Idents(object)))) #create intermediate data frame
    colnames(data.int)=c("Symbol","Cell_ID")
    data.int$Perc_posCell=NA
    data.int$Mean_exp=NA
    #fill in the intermediate dataframe
    print("Filling in intermediate table: percentage of expressing cell per cluster per gene, and mean of expression")
    for (i in seq(1,dim(data.int)[1])){
      data.int[i,3:4]=Perc_exp_infos(object = object, gene=data.int[i,1], cell_id = data.int[i,2])
    }
    #Save matrix and histogram of Perc_pos_cell
    if (save_file==TRUE){
      if (is.null(path)){
        path=getwd()
        print("Intermediate table were saved in the working directory as scRNAseq_statsInfo_for_ICELLNET.csv. Set path parameter to change the directory to save files.")
      }
      write.csv(data.int, file = paste0(path, "scRNAseq_statsInfo_for_ICELLNET.csv"))
      hist(data.int[,"Perc_posCell"])
      pdf(file = paste0(path, "scRNAseq_statsInfo_Perc_posCell_", Sys.Date(),".pdf"), width = 5, height = 5)
      hist(data.int[,"Perc_posCell"], 100)
      print("Intermediate table were saved as scRNAseq_statsInfo_for_ICELLNET.csv.")
      dev.off()
    }
  }else{
    note(paste0("Following file used as intermediate statistics table: ", path, "scRNAseq_statsInfo_for_ICELLNET.csv. Use force.file=T to regenerate this file"))
    data.int=utils::read.csv(paste0(path,"scRNAseq_statsInfo_for_ICELLNET.csv" ), header = T)
    data.int=data.int[,-1]
  }
  # Filter expression above cell percentage value
  if (!is.null(filter.perc)){
    data.int=dplyr::filter(data.int, Perc_posCell > filter.perc/100)
    print("Filtering done")
  }
  average.cluster=reshape2::dcast(data.int[,-3], formula = Symbol~ Cell_ID, value.var = "Mean_exp", drop = T)
  rownames(average.cluster)=average.cluster$Symbol
  average.cluster[is.na(average.cluster)]<-0
  return (average.cluster)
}
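One detail of the function above is worth keeping in mind: file names are built with paste0(path, ...), so path has to end with a separator (as in path="path/"). When path is NULL and falls back to getwd(), which returns no trailing slash, the file name is glued directly onto the directory name. The following is a minimal base-R sketch of a more defensive construction using file.path. The helper name stats_file is hypothetical and is not part of ICELLNET.

```r
# Hypothetical helper (not part of ICELLNET): build the path of the
# intermediate statistics table without relying on a trailing slash.
stats_file <- function(path = NULL,
                       name = "scRNAseq_statsInfo_for_ICELLNET.csv") {
  if (is.null(path)) {
    path <- getwd()
  }
  # Drop trailing slashes so "path/" and "path" give the same result.
  path <- sub("/+$", "", path)
  file.path(path, name)
}

# Both calls resolve to the same file inside the "path" directory.
stats_file("path/")
stats_file("path")
```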

Pedramto89 commented 1 year ago

Thank you for your prompt response. So, regarding the scripts you sent, I have some questions especially about the parameters:

In data=seurat[['SCT']]@data, which part of the Seurat object does data refer to? My Seurat object is named "seurat", so I replaced object with seurat in the script you sent. But I still don't know what data is.

This is the function I put:

sc.data.cleaning <- function (object=seurat, db = db, filter.perc=NULL, save_file=T, path=NULL, force.file=FALSE) {
  if (!file.exists(paste0(path, "scRNAseq_statsInfo_for_ICELLNET.csv")) | force.file==T){
    data=seurat[['SCT']]@data
    int = dplyr::intersect(as.matrix(rownames(data)), as.matrix(db[, 1:5]))
    # restrict only to gene restricted to matrix
    data=data[which(rownames(data)%in%int),]
    data.int=as.data.frame(expand.grid(rownames(data), unique(Idents(seurat)))) #create intermediate data frame
    colnames(data.int)=c("Symbol","Cell_ID")
    data.int$Perc_posCell=NA
    data.int$Mean_exp=NA
    #fill in the intermediate dataframe
    print("Filling in intermediate table: percentage of expressing cell per cluster per gene, and mean of expression")
    for (i in seq(1,dim(data.int)[1])){
      data.int[i,3:4]=Perc_exp_infos(object = seurat, gene=data.int[i,1], cell_id = data.int[i,2])
    }
    #Save matrix and histogram of Perc_pos_cell
    if (save_file==TRUE){
      if (is.null(path)){
        path=getwd()
        print("Intermediate table were saved in the working directory as scRNAseq_statsInfo_for_ICELLNET.csv. Set path parameter to change the directory to save files.")
      }
      write.csv(data.int, file = paste0(path, "scRNAseq_statsInfo_for_ICELLNET.csv"))
      hist(data.int[,"Perc_posCell"])
      pdf(file = paste0(path, "scRNAseq_statsInfo_Perc_posCell_", Sys.Date(),".pdf"), width = 5, height = 5)
      hist(data.int[,"Perc_posCell"], 100)
      print("Intermediate table were saved as scRNAseq_statsInfo_for_ICELLNET.csv.")
      dev.off()
    }
  }else{
    note(paste0("Following file used as intermediate statistics table: ", path, "scRNAseq_statsInfo_for_ICELLNET.csv. Use force.file=T to regenerate this file"))
    data.int=utils::read.csv(paste0(path,"scRNAseq_statsInfo_for_ICELLNET.csv" ), header = T)
    data.int=data.int[,-1]
  }
  # Filter expression above cell percentage value
  if (!is.null(filter.perc)){
    data.int=dplyr::filter(data.int, Perc_posCell > filter.perc/100)
    print("Filtering done")
  }
  average.cluster=reshape2::dcast(data.int[,-3], formula = Symbol~ Cell_ID, value.var = "Mean_exp", drop = T)
  rownames(average.cluster)=average.cluster$Symbol
  average.cluster[is.na(average.cluster)]<-0
  return (average.cluster)
}

And I get this error:

> average.clean = sc.data.cleaning(object = seurat, db = db, filter.perc = filter.perc, save_file = T, path="path/", force.file = F)
[1] "Filling in intermediate table: percentage of expressing cell per cluster per gene, and mean of expression"
Error: Cannot find 'RNA' in this Seurat object
Called from: [[.Seurat(object, "RNA")
Browse[1]>
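The console output above hints at where the error comes from. The "Filling in intermediate table" message is printed before the error, so the line edited to use 'SCT' has already run, and the call shown is a lookup of "RNA" on the object. The only other call inside the loop that receives the object is Perc_exp_infos, so that helper most likely reads the 'RNA' assay internally, which would explain why changing line 3 alone is not enough. The sketch below shows what an assay-aware version of such a helper could look like. It is an illustration under stated assumptions, not ICELLNET's actual code: the name perc_exp_infos_assay, its idents argument, and the definition of an "expressing" cell as one with a value above 0 are all hypothetical.

```r
# Hypothetical assay-aware helper; NOT the ICELLNET implementation.
# Returns the fraction of cells in cluster `cell_id` that express `gene`,
# and the mean expression of `gene` in that cluster.
# Assumes object[[assay]] has a `data` slot, as in the thread's own code.
perc_exp_infos_assay <- function(object, assay, gene, cell_id,
                                 idents = Seurat::Idents(object)) {
  data <- object[[assay]]@data  # normalised expression, genes x cells
  # Compare as character: gene and cell_id may arrive as factors.
  cells <- names(idents)[as.character(idents) == as.character(cell_id)]
  values <- data[as.character(gene), cells]
  c(Perc_posCell = mean(values > 0),  # assumed definition: value above 0
    Mean_exp = mean(values))
}
```

Inside the loop, such a helper would be called with assay = "SCT" (or "Spatial") in place of Perc_exp_infos.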

lmassenet-regad commented 1 year ago

Dear Pedramto89,

I upgraded the package to allow the SCT or Spatial assay to be used with ICELLNET (I would recommend SCT, to use the normalised data). What you need to do:
1- reinstall the latest version of the package
2- specify assay="SCT" in the sc.data.cleaning function, as follows:

average.clean= sc.data.cleaning(object = seurat, assay="SCT", db = db, filter.perc = filter.perc, save_file = T, path="path/", force.file = F)

Let me know if you have other issues. Best,

Pedramto89 commented 1 year ago

Thank you @lmassenet-regad, it worked! Now, to interpret the results, I am wondering what the score and/or value on the heatmap means. Also, on the dot plot, only ECM and growth factor are shown. Can we add more families to this dot plot? Related to this, the stacked bar chart shows only a limited number of molecular families, including "Notch family" and "ECM". How can we add more? Is there a list of the families available in the ICELLNET database that can be added to the bar chart?

lmassenet-regad commented 1 year ago

You can find this information in the vignette or in the publication. Briefly: