Danko-Lab / BayesPrism

A Fully Bayesian Inference of Tumor Microenvironment composition and gene expression

Similar error while running `plot.cor.phi` and `plot.scRNA.outlier` functions ('x' must be an array of at least two dimensions) #49

Closed iamakhilverma closed 1 year ago

iamakhilverma commented 1 year ago

Hello @tinyi and Team, I'm getting similar errors while running plot.cor.phi and plot.scRNA.outlier functions. I've followed the vignette (Tutorial: bulk RNA-seq deconvolution using BayesPrism by Tinyi Chu) step-by-step to generate the input files correctly. Please find the commands, respective errors, and some cells giving information about the input files that I've prepared for BayesPrism.

Can you please look into this and help me resolve it?

plot.cor.phi(input = sc.dat,
            input.labels = cell.state.labels,
            title = "cell state correlation",
            # specify pdf.prefix if need to output to pdf
            # pdf.prefix = "BayesPrism.crc.cor.cs", 
            cexRow = 0.2, cexCol = 0.2, min.exp = 3,
            margins = c(2,2))

Error in h(simpleError(msg, call)): error in evaluating the argument 'j' in selecting a method for function '[': 'x' must be an array of at least two dimensions

Traceback:

  1. plot.cor.phi(input = sc.dat, input.labels = cell.state.labels, title = "cell state correlation", cexRow = 0.2, cexCol = 0.2, min.exp = 3, margins = c(2, 2))
  2. input[, colSums(input) >= min.exp]
  3. colSums(input)
  4. stop("'x' must be an array of at least two dimensions")
  5. .handleSimpleError(function (cond) .Internal(C_tryCatchHelper(addr, 1L, cond)), "'x' must be an array of at least two dimensions", base::quote(colSums(input)))
  6. h(simpleError(msg, call))

sc.stat <- plot.scRNA.outlier(
  input = sc.dat, #make sure the colnames are gene symbol or ENSEMBL ID 
  cell.type.labels = cell.type.labels,
  species = "hs", #currently only human(hs) and mouse(mm) annotations are supported
  return.raw = TRUE #return the data used for plotting. 
  # pdf.prefix = "BayesPrism.crc.sc.stat" # specify pdf.prefix if need to output to pdf
)

Error in colSums(ref[labels == label.i, , drop = F]): 'x' must be an array of at least two dimensions

Traceback:

  1. plot.scRNA.outlier(input = sc.dat, cell.type.labels = cell.type.labels, species = "hs", return.raw = TRUE)
  2. collapse(ref = input, labels = cell.type.labels)
  3. do.call(rbind, lapply(labels.uniq, function(label.i) colSums(ref[labels == label.i, , drop = F])))
  4. lapply(labels.uniq, function(label.i) colSums(ref[labels == label.i, , drop = F]))
  5. FUN(X[[i]], ...)
  6. colSums(ref[labels == label.i, , drop = F])
  7. stop("'x' must be an array of at least two dimensions")

For reference:

class(bk.dat)
class(sc.dat)
class(cell.type.labels)
class(cell.state.labels)

'matrix' 'array'
'dgCMatrix'
'character'
'character'

dim(bk.dat)
dim(sc.dat)
length(cell.type.labels)
length(cell.state.labels)

592 18184
610150 18184
610150
610150

head(bk.dat)
head(sc.dat)
head(cell.type.labels)
head(cell.state.labels)

  | RNU12-2P | EFCAB8 | TRIM75P | GTPBP6 | EFCAB12 | A1BG | A1CF | A2M | A2ML1 | A4GALT | ... | ZWILCH | ZWINT | ZXDA | ZXDB | ZXDC | ZYG11A | ZYG11B | ZYX | ZZEF1 | ZZZ3
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
TCGA.3L.AA1B.01 | 1.9342 | 2.4178 | 0.4836 | 1033.850 | 0.0000 | 22.1470 | 220.987 | 15911.50 | 0.4836 | 118.9560 | ... | 403.641 | 629.594 | 71.0832 | 461.315 | 1105.42 | 3.3849 | 543.037 | 6259.19 | 1358.32 | 798.356
TCGA.4N.A93T.01 | 0.4838 | 2.4190 | 0.0000 | 1817.610 | 1.4514 | 171.2680 | 100.629 | 1494.33 | 0.4838 | 22.2545 | ... | 186.686 | 442.187 | 39.6710 | 366.715 | 1149.49 | 0.4838 | 290.760 | 4653.12 | 1220.13 | 333.817
TCGA.4T.AA8H.01 | 2.9245 | 2.9245 | 0.0000 | 719.430 | 0.7311 | 20.9980 | 174.008 | 1333.57 | 36.5564 | 16.0848 | ... | 520.782 | 1033.080 | 31.4385 | 349.479 | 1083.53 | 0.0000 | 669.713 | 4460.61 | 3002.01 | 530.068
TCGA.5M.AAT4.01 | 2.1515 | 2.1515 | 0.8606 | 879.948 | 1.7212 | 6.4587 | 151.463 | 2424.26 | 6.8847 | 75.7315 | ... | 468.408 | 1629.090 | 54.6472 | 542.169 | 1374.35 | 0.4303 | 445.353 | 4190.19 | 1093.37 | 574.441
TCGA.5M.AAT5.01 | 0.9892 | 8.9030 | 0.0000 | 934.819 | 1.4838 | 14.8384 | 255.715 | 2398.34 | 0.9892 | 41.5475 | ... | 663.533 | 838.864 | 29.1822 | 428.335 | 1240.98 | 3.4623 | 550.504 | 3878.26 | 1016.43 | 413.002
TCGA.5M.AAT6.01 | 1.3125 | 4.5937 | 0.0000 | 605.049 | 3.9374 | 49.8017 | 0.000 | 7231.65 | 2.6249 | 161.4340 | ... | 600.771 | 1338.720 | 45.9365 | 335.337 | 1056.54 | 13.7810 | 492.833 | 6165.99 | 1390.56 | 717.266

  [[ suppressing 34 column names 'OR4F5', 'OR4F29', 'FAM41C' ... ]]

6 x 18184 sparse Matrix of class "dgCMatrix"

cell1 . . . . . . . . 2 . . . .  1 3 1 . . . . . . . . . 4 1
cell2 . . . . . . . . 2 1 . . .  . . . . . . . . 1 . . . 1 .
cell3 . . . . . . . . 1 1 . . .  . . . . 1 . . . . . . . . .
cell4 . . . . 1 1 . 2 9 2 . . . 13 5 . . . . . . . . . . 3 1
cell5 . . . . . . . . . . . . .  . . . . . . . . . . . . . 1
cell6 . . . . . . . . 2 . . . .  1 2 1 . 2 . . . . . 1 1 5 1

cell7 2 . . . . . . ......
cell8 . . . 4 . . . ......
cell9 . . . 1 . . . ......
cell10 2 . . 8 1 . . ......
cell11 2 . . . . . . ......
cell12 . . . 4 . . . ......

 .....suppressing 18150 columns in show(); maybe adjust 'options(max.print= *, width = *)'
 ..............................
  1. 'Endothelial'
  2. 'Endothelial'
  3. 'Endothelial'
  4. 'Endothelial'
  5. 'Endothelial'
  6. 'Endothelial'

  1. 'Endothelial'
  2. 'Endothelial'
  3. 'Endothelial'
  4. 'Endothelial'
  5. 'Endothelial'
  6. 'Endothelial'

iamakhilverma commented 1 year ago

So, my labmate figured it out. It turns out that these BayesPrism functions need `sc.dat` to be a regular dense matrix, not the compressed sparse one (`dgCMatrix`). Running `sc.dat <- as.matrix(sc.dat)` before calling them solved it for me.
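
For anyone who lands here with the same error, here is a minimal sketch of the problem and the fix on toy data. It assumes the failure comes from base R's `colSums()` being handed a sparse `Matrix` object, which is what the tracebacks above suggest but which I have not confirmed in the BayesPrism source. `sc.toy`, `sc.dense`, `base.err` and `gene.totals` are made-up names for illustration only.

```r
library(Matrix)  # provides the dgCMatrix class

# Hypothetical toy stand-in for sc.dat: 3 cells x 4 genes, stored sparse.
sc.toy <- Matrix(c(0, 2, 0, 1,
                   0, 0, 3, 0,
                   1, 0, 0, 0),
                 nrow = 3, byrow = TRUE, sparse = TRUE,
                 dimnames = list(paste0("cell", 1:3), paste0("gene", 1:4)))

# base::colSums() only accepts base arrays, so it rejects the S4 sparse object
# and we capture the error message instead of stopping.
base.err <- tryCatch(base::colSums(sc.toy),
                     error = function(e) conditionMessage(e))

# Convert to a dense base matrix only when the input is not one already.
sc.dense <- if (is.matrix(sc.toy)) sc.toy else as.matrix(sc.toy)

# The same call now works on the dense copy.
gene.totals <- base::colSums(sc.dense)
```

One caveat: a dense copy stores every zero. For a matrix of 610150 x 18184 doubles, that is roughly 610150 * 18184 * 8 bytes, or about 89 GB, so it may be worth subsetting cells or genes before converting.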

tinyi commented 1 year ago

Thanks for updating this.