maqin2001 / qubic-r-package

Step 2.2 sc_cell (MCL clustering and cell classification) #7

Open PegasusAM opened 6 years ago

PegasusAM commented 6 years ago
  1. Input the graph matrix and compare its column names with all cell names; the "all cell names" can be the colnames of the expression file.
  2. Save the uncovered cells in a list. My code: mis_cell <- unique(cell_names[! cell_names %in% mat_cell])
  3. Run MCL. For now, let's set inflation=100, max.iter=100, addLoops=FALSE, and leave the others at their defaults. The output includes (1) K, the number of clusters, (2) n.iterations, and (3) the cluster tags.
  4. Extract the cluster tags and assign one to each cell.
  5. Append any uncovered cells after the result and assign them the tag "Ungrouped".
  6. Export the data frame.

I've written the code below. Dr. Zhang, please check whether it contains any logical errors. Thanks!

library(MCL)  # provides mcl(); assumed to be the CRAN MCL package

sc_cell <- function(res, allcell = NULL) {
  # MCL clustering on the cell-cell graph matrix
  clust <- mcl(res, addLoops = FALSE, inflation = 100, max.iter = 100)
  # One row per cell in the graph matrix, tagged with its MCL cluster
  cell_type <- data.frame(Cell_name = colnames(res), Cluster = clust$Cluster)

  if (!is.null(allcell)) {
    # Cells listed in allcell but absent from the graph matrix
    # (the original referenced undefined cell_names and mat_cell here)
    mis_cell <- unique(allcell[!allcell %in% colnames(res)])
    # Assign a tag to the uncovered cells
    uncover <- data.frame(Cell_name = mis_cell,
                          Cluster = rep("Ungrouped", length(mis_cell)))
    cell_type <- rbind(cell_type, uncover)
  }
  return(cell_type)
}

PegasusAM commented 6 years ago

If we use a hash set, it can replace steps 1 and 2 for collecting the missing cells.
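As a rough illustration of the hash-set idea, here is a minimal sketch in base R. The cell names are made-up placeholders, and using an environment as the hash set is only one possible choice, not necessarily what was intended in this comment.

```r
# Hypothetical example data; these names do not come from the thread.
all_cells   <- c("cell1", "cell2", "cell3", "cell4", "cell5")  # e.g. colnames of the expression file
graph_cells <- c("cell2", "cell4")                             # e.g. colnames of the graph matrix

# An environment with hash = TRUE serves as a hash set keyed by cell name.
covered <- new.env(hash = TRUE)
for (nm in graph_cells) assign(nm, TRUE, envir = covered)

# A cell is missing when its name is not a key in the set.
is_covered <- vapply(all_cells, exists, logical(1), envir = covered, inherits = FALSE)
mis_cell <- unique(all_cells[!is_covered])

# setdiff() expresses the same lookup in a single call.
stopifnot(identical(mis_cell, setdiff(all_cells, graph_cells)))
```

In practice `setdiff(all_cells, graph_cells)` is probably enough on its own; the explicit environment is shown only to make the set lookup visible.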

PegasusAM commented 6 years ago

The output file is a two-column matrix with the columns Cell_names and Cell_clusters.
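A minimal sketch of what such an output could look like when written to disk. The values, the tab-separated format, and the temporary file path are assumptions for illustration; the thread does not specify a file format.

```r
# Hypothetical result table using the two column names mentioned above.
cell_type <- data.frame(
  Cell_names    = c("cell1", "cell2", "cell3"),
  Cell_clusters = c("1", "1", "Ungrouped"),
  stringsAsFactors = FALSE
)

# Write a tab-separated file without row names or quotes.
out_file <- tempfile(fileext = ".txt")
write.table(cell_type, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Read it back to confirm the two-column layout survives the round trip.
read_back <- read.delim(out_file, stringsAsFactors = FALSE)
```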

PegasusAM commented 6 years ago

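MCL test. Target expression file: 1000 genes × 50 cells. Biclustering: q=0.06, f=0.8, c=0.9, o=5, k=13. MCL: i=30. This gives 4 biclusters covering 37 cells. MCL_1 takes the scgraph matrix directly as input, i.e. the 50×50 matrix, in which the 17 uncovered cells are all-zero rows and columns. MCL_2 takes the matrix with the all-zero rows and columns removed, i.e. 37×37.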

image

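The results are shown above. Both runs produce 6 clusters, but the MCL_2 result still needs the 17 uncovered cells added, which makes 7 clusters. The two results are therefore inconsistent.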
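For reference, the reduced input used for MCL_2 (all-zero rows and columns removed) could be prepared roughly as follows. This is a sketch on a made-up 5×5 matrix, not the 50×50 test data, and the variable names are illustrative.

```r
# Toy symmetric graph matrix with row and column names (hypothetical data).
cells <- paste0("cell", 1:5)
g <- matrix(0, nrow = 5, ncol = 5, dimnames = list(cells, cells))
g["cell1", "cell2"] <- g["cell2", "cell1"] <- 1
g["cell2", "cell4"] <- g["cell4", "cell2"] <- 1

# Keep a cell if its row or its column contains any non-zero entry.
keep <- rowSums(g != 0) > 0 | colSums(g != 0) > 0

g_sub     <- g[keep, keep, drop = FALSE]  # matrix that would be passed to MCL
uncovered <- cells[!keep]                 # cells to add back afterwards
```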

PegasusAM commented 6 years ago
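  1. So we need to remove the all-zero rows and columns before running MCL and add them back after MCL finishes.

  2. For the output, we need to extract clust$Cluster as the first column and the colnames of the MCL input as the second column (this requires the matrix passed to MCL to carry row and column names). Then append the names of the uncovered cells (second column) and assign them a label (first column).

  3. Note that the cluster labels output by MCL are not consecutive integers starting from 0, so we can find the maximum existing label and add 1 to get the label for the uncovered cells appended at the end.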
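The three points above can be sketched as one function. This is only an illustration under stated assumptions: `sc_cell_sketch` and `cluster_fn` are hypothetical names, and a stub stands in for MCL so that the example runs without the MCL package installed.

```r
# Hypothetical helper. `cluster_fn` is any function that takes a matrix and
# returns one numeric label per column.
sc_cell_sketch <- function(res, cluster_fn) {
  stopifnot(!is.null(colnames(res)))  # names are needed to report cells
  # Point 1: drop cells whose row and column are entirely zero.
  keep <- rowSums(res != 0) > 0 | colSums(res != 0) > 0
  sub <- res[keep, keep, drop = FALSE]
  labels <- cluster_fn(sub)
  # Point 2: cluster label first, cell name second.
  covered <- data.frame(Cluster = labels, Cell_name = colnames(sub),
                        stringsAsFactors = FALSE)
  # Point 3: uncovered cells get the label max(existing) + 1.
  extra <- colnames(res)[!keep]
  uncovered <- data.frame(Cluster = rep(max(labels) + 1, length(extra)),
                          Cell_name = extra, stringsAsFactors = FALSE)
  rbind(covered, uncovered)
}

# Toy input (made-up data) and a stub clustering function standing in for MCL.
cells <- paste0("cell", 1:5)
g <- matrix(0, nrow = 5, ncol = 5, dimnames = list(cells, cells))
g["cell1", "cell2"] <- g["cell2", "cell1"] <- 1
g["cell2", "cell4"] <- g["cell4", "cell2"] <- 1
stub_cluster <- function(m) rep(c(0, 2), length.out = ncol(m))

result <- sc_cell_sketch(g, stub_cluster)
```

With the MCL package, `cluster_fn` might be something like `function(m) MCL::mcl(m, addLoops = FALSE, inflation = 100, max.iter = 100)$Cluster`, though that call is untested here. The sketch does not handle the case where every cell is uncovered, in which `max(labels)` has nothing to work on.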

maqin2001 commented 6 years ago

OK. In that case we should first remove the all-zero cells and add them back after MCL. Zhang Yu, please handle this.

Qin Ma, Ph.D. Assistant Professor Department of Plant Science Department of Mathematics and Statistics 254D Northern Plains Biostress lab (SNP) South Dakota State University Brookings, SD, 57007 Lab: http://bmbl.sdstate.edu

2018-05-06 8:18 GMT-05:00 Anjun Ma notifications@github.com:
