Neurosurgery-Brain-Tumor-Center-DiazLab / CONICS

CONICS: COpy-Number analysis In single-Cell RNA-Sequencing

Questions about the input expression matrix and threshold used in plotChromosomeHeatmap #13

Open YiweiNiu opened 5 years ago

YiweiNiu commented 5 years ago

Hi,

Thank you for this cool tool!

I have three questions about using CONICSmat.

  1. Must the input matrix be log2(CPM/10+1)-normalized expression counts? I tried both the normalized data from Seurat and log2(CPM/10+1) counts, and got different results.
  2. In the three tutorials provided, chromosome_full_positions_grch38.txt is used in the first two, while chromosome_arm_positions_grch38.txt is used in the third. I assume the choice of region file is not critical and that CONICSmat can estimate on any regions. Is that true? I saw that you discussed this in this thread, so I am confused.
  3. The third question relates to the thresh parameter of plotChromosomeHeatmap. I tried different values and got distinct figures. What does this parameter mean, and how should it be set?

Looking forward to your reply. Any comments or suggestions would be highly appreciated.

Bests, Yiwei Niu

biobug16 commented 4 years ago

Hi YiweiNiu, I also want to use the count matrix generated by Seurat, which is actually on a natural-log scale, but I do not know how to convert it to log2(CPM/10+1) for downstream analysis with CONICSmat. Could you please explain how you did this for your data?

YiweiNiu commented 4 years ago

Hi @biobug16

I used the following code to get log2(CPM/10+1) counts from a Seurat object:

expr = CONICSmat::normMat(as.matrix(Seurat::GetAssayData(seurat_obj, assay = "RNA", slot = "counts")))

This extracts the raw counts matrix from the Seurat object and then applies normMat from CONICSmat. You can also do this yourself, since the method behind normMat is rather simple. I paste the code of this function below.

#' Normalize an expression matrix of raw counts to log2(CPM/10+1) values
#'
#' This function normalizes a matrix of raw gene counts to log2(CPM/10+1).
#' @param expmat A genes X samples expression matrix of raw (single cell) RNA-seq counts.
#' @keywords Normalize
#' @export
#' @examples
#' normMat(suva_exp)

normMat = function (expmat){
  sdepth=colSums(expmat)
  rmat=t( t(expmat) / sdepth*1000000 )
  rmat=log2(rmat/10+1)
  return(rmat)
}
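
As a quick sanity check of the function above, here is a minimal sketch on a made-up count matrix. The matrix `toy` and its values are hypothetical and not taken from any dataset. normMat is repeated so the snippet runs without CONICSmat installed.

```r
# Same definition as normMat above, repeated so this sketch runs standalone.
normMat <- function(expmat) {
  sdepth <- colSums(expmat)                  # total counts per cell
  rmat <- t(t(expmat) / sdepth * 1000000)    # counts per million (CPM)
  log2(rmat / 10 + 1)
}

# Hypothetical toy counts: 3 genes (rows) x 2 cells (columns).
# matrix() fills column by column, so cell1 = (10, 0, 90) and cell2 = (5, 5, 10).
toy <- matrix(c(10, 0, 90,
                5, 5, 10),
              nrow = 3,
              dimnames = list(c("geneA", "geneB", "geneC"),
                              c("cell1", "cell2")))

norm_toy <- normMat(toy)

# Manual check for geneA in cell1:
# 10 of 100 counts -> 100000 CPM -> log2(100000 / 10 + 1)
manual <- log2((10 / 100) * 1e6 / 10 + 1)

print(norm_toy)
print(all.equal(unname(norm_toy["geneA", "cell1"]), manual))
```

Zero counts stay at zero after the transformation, because log2(0/10 + 1) = 0.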

Bests

biobug16 commented 4 years ago

Thanks YiweiNiu for your response. It worked.
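
For anyone who only has Seurat's natural-log values and not the raw counts, the sketch below shows one possible back-conversion. It assumes the data came from Seurat's NormalizeData with the default LogNormalize method, i.e. log1p of counts per cell scaled by scale.factor, with the default scale.factor of 10000. If other settings were used, the result will not match normMat. The helper name `lognorm_to_conics` and the toy matrix are hypothetical.

```r
# Hypothetical helper: convert LogNormalize-style values, log1p(counts / depth * scale_factor),
# to log2(CPM/10+1). Assumes scale_factor matches the one used during normalization.
lognorm_to_conics <- function(lognorm, scale_factor = 10000) {
  per_scale <- expm1(lognorm)               # undo log1p: counts per scale_factor
  cpm <- per_scale * (1e6 / scale_factor)   # rescale to counts per million
  log2(cpm / 10 + 1)
}

# Made-up raw counts: 3 genes x 2 cells.
counts <- matrix(c(10, 0, 90,
                   5, 5, 10), nrow = 3)
frac <- t(t(counts) / colSums(counts))      # per-cell fraction of counts

# Mimic the assumed LogNormalize output, then compare with the direct computation.
lognorm <- log1p(frac * 10000)
direct <- log2(frac * 1e6 / 10 + 1)

print(all.equal(lognorm_to_conics(lognorm), direct))
```

When the raw counts are still stored in the Seurat object, extracting them and calling normMat as shown earlier in the thread avoids relying on these assumptions.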