xuranw / MuSiC

Multi-subject Single Cell Deconvolution
https://github.com/xuranw/MuSiC
GNU General Public License v3.0
220 stars 83 forks source link

music_prop_toast needs a 'markers' argument #108

Open jaclynbeck-sage opened 1 year ago

jaclynbeck-sage commented 1 year ago

In the function music_prop_toast, it calls music_prop several times with argument markers = markers. However markers is never defined in the main music_prop_toast function and it isn't an input argument to the function. So calling music_prop_toast results in the following error:

Error in music_prop(bulk.mtx = bulk.control, sc.sce = sc.sce, clusters = clusters,  : 
  object 'markers' not found
jaclynbeck-sage commented 1 year ago

Ah, same issue applies for cell_size, ct.cov, centered, and normalize. They are sent as arguments to music_prop but are not defined or allowed as arguments to the main music2_prop_toast function.

Sofieagerbaek commented 1 year ago

I am experiencing the same issue, is there a solution? (other than using music_prop() instead)

jaclynbeck-sage commented 1 year ago

Judging by the number of unanswered issues in this repository, I don't think the lab is interested in fixing anything or maintaining this tool. I think the only solution right now is to download the source code and fix it locally. I've stopped trying because of the number of issues that would need fixing before it was functional.

Sofieagerbaek commented 1 year ago

Alright thanks! I did find it hard to navigate this tools using my own data as it is so incoherent - fixing source code seems like too much effort, I'll probably just switch to another deconvolution tool.

shaojunyu commented 1 year ago

Here is my temporary solution, FYI:

Copy the source code of the function from the package and modify it in two places:

  1. argument list, line 498 original code:

    music2_prop_toast = function(bulk.control.mtx, bulk.case.mtx, sc.sce, clusters, samples, select.ct, expr_low=20, prop_r=0.1, 
                        eps_c=0.05, eps_r=0.01, cutoff_c=10^(-3), cutoff_r=10^(-3), cap=0.3, maxiter = 200){

    new code (add markers, ct.cov, cell_size , centered , normalize to the argument list):

    music2_prop_toast <- function (bulk.control.mtx, bulk.case.mtx, sc.sce, clusters, 
          samples, select.ct, expr_low = 20, prop_r = 0.1, eps_c = 0.05, 
          eps_r = 0.01, cutoff_c = 10^(-3), cutoff_r = 10^(-3), cap = 0.3, 
          maxiter = 200, markers = NULL, ct.cov = FALSE, cell_size = NULL,
          centered = FALSE, normalize = FALSE) {
  2. csTest, line 554 original code:

    res_table<- csTest(fitted_model, coef = "group", verbose = F)

    new code (change group to condition, see line 505):

    res_table<- csTest(fitted_model, coef = "condition", verbose = F)

Define a new function in my R script and use the new function:

my_music2_prop_toast <- function (bulk.control.mtx, bulk.case.mtx, sc.sce, clusters, 
          samples, select.ct, expr_low = 20, prop_r = 0.1, eps_c = 0.05, 
          eps_r = 0.01, cutoff_c = 10^(-3), cutoff_r = 10^(-3), cap = 0.3, 
          maxiter = 200, markers = NULL, ct.cov = FALSE, cell_size = NULL,
          centered = FALSE, normalize = FALSE) {
  gene.bulk = intersect(rownames(bulk.control.mtx), rownames(bulk.case.mtx))
  if (length(gene.bulk) < 0.1 * min(nrow(bulk.control.mtx), 
                                    nrow(bulk.case.mtx))) {
    stop("Not enough genes for bulk data! Please check gene annotations.")
  }
  bulk.mtx = cbind(bulk.control.mtx[gene.bulk, ], bulk.case.mtx[gene.bulk, 
  ])
  Pheno = data.frame(condition = factor(c(rep("control", ncol(bulk.control.mtx)), 
                                          rep("case", ncol(bulk.case.mtx))), levels = c("control", 
                                                                                        "case")))
  rownames(Pheno) = colnames(bulk.mtx)
  gene_all = intersect(gene.bulk, rownames(sc.sce))
  if (length(gene_all) < 0.2 * min(length(gene.bulk), nrow(sc.sce))) {
    stop("Not enough genes between bulk and single-cell data! Please check gene annotations.")
  }
  bulk.mtx = bulk.mtx[gene_all, ]
  sc.iter.sce = sc.sce[gene_all, ]
  expr = apply(bulk.mtx, 1, mean)
  exp_genel = names(expr[expr >= expr_low])
  bulk.control = bulk.mtx[, colnames(bulk.control.mtx)]
  bulk.case = bulk.mtx[, colnames(bulk.case.mtx)]
  prop_control = music_prop(bulk.mtx = bulk.control, sc.sce = sc.sce, 
                            clusters = clusters, samples = samples, select.ct = select.ct, 
                            markers = markers, cell_size = cell_size, ct.cov = ct.cov, 
                            iter.max = 1000, nu = 1e-04, eps = 0.01, centered = centered, 
                            normalize = normalize, verbose = F)$Est.prop.weighted
  prop_case_fix = NULL
  prop_case_ini = music_prop(bulk.mtx = bulk.case, sc.sce = sc.sce, 
                             clusters = clusters, samples = samples, select.ct = select.ct, 
                             markers = markers, cell_size = cell_size, ct.cov = ct.cov, 
                             iter.max = 1000, nu = 1e-04, eps = 0.01, centered = centered, 
                             normalize = normalize, verbose = F)$Est.prop.weighted
  prop_CASE = prop_case_ini
  prop_all = rbind(prop_control, prop_CASE)
  iter = 1
  ncell = length(select.ct)
  id_conv = NULL
  while (iter <= maxiter) {
    Y_raw = log1p(bulk.mtx)
    design = Pheno
    Prop <- prop_all[rownames(Pheno), ]
    Design_out <- makeDesign(design, Prop)
    fitted_model <- fitModel(Design_out, Y_raw)
    res_table <- csTest(fitted_model, coef = "condition", verbose = F)
    mex = apply(prop_all, 2, mean)
    lr = NULL
    for (celltype in select.ct) {
      m = mex[celltype]
      DE = res_table[[celltype]]
      pval = DE$fdr
      names(pval) = rownames(DE)
      pval = pval[names(pval) %in% exp_genel]
      if (m >= prop_r) {
        lr = c(lr, names(pval[pval <= cutoff_c & pval <= 
                                quantile(pval, prob = cap)]))
      }
      else {
        lr = c(lr, names(pval[pval <= cutoff_r & pval <= 
                                quantile(pval, prob = cap)]))
      }
    }
    lr = unique(lr)
    l = setdiff(gene_all, lr)
    sc.iter.sce = sc.sce[l, ]
    if (length(id_conv) > 0) {
      case_sample = bulk.case[, !colnames(bulk.case) %in% 
                                id_conv]
    }
    else {
      case_sample = bulk.case
    }
    prop_case = music_prop(bulk.mtx = case_sample, sc.sce = sc.iter.sce, 
                           clusters = clusters, samples = samples, select.ct = select.ct, 
                           verbose = F)$Est.prop.weighted
    prop_CASE = rbind(prop_case, prop_case_fix)
    if (length(id_conv) == 1) {
      rownames(prop_CASE) = c(rownames(prop_case), id_conv)
    }
    prop_all = rbind(prop_control, prop_CASE)
    prop_case = prop_case[rownames(prop_case_ini), ]
    pc = abs(prop_case - prop_case_ini)
    conv = pc
    conv[, ] = 1
    conv[prop_case_ini <= prop_r] = ifelse(pc[prop_case_ini <= 
                                                prop_r] < eps_r, 0, 1)
    pc[prop_case_ini > prop_r] = pc[prop_case_ini > prop_r]/prop_case_ini[prop_case_ini > 
                                                                            prop_r]
    conv[prop_case_ini > prop_r] = ifelse(pc[prop_case_ini > 
                                               prop_r] < eps_c, 0, 1)
    convf = apply(conv, 1, function(x) {
      all(x == 0)
    })
    all_converge = FALSE
    id_conv = c(id_conv, names(convf[convf == TRUE]))
    prop_case_ini = prop_CASE[!rownames(prop_CASE) %in% 
                                id_conv, ]
    prop_case_fix = prop_CASE[rownames(prop_CASE) %in% id_conv, 
    ]
    if (is.vector(prop_case_ini)) {
      all_converge = TRUE
      break
    }
    else if (nrow(prop_case_ini) == 0) {
      all_converge = TRUE
      break
    }
    iter = iter + 1
  }
  if (all_converge) {
    return(list(Est.prop = prop_all, convergence = TRUE, 
                n.iter = iter, DE.genes = lr))
  }
  else {
    return(list(Est.prop = prop_all, convergence = FALSE, 
                id.not.converge = rownames(prop_case_ini)))
  }
}

my_music2_prop_toast(
  bulk.control.mtx = bulk.control.mtx,
  bulk.case.mtx = bulk.case.mtx,
  sc.sce = seger.sce,
  clusters = 'cellType',
  samples = 'sampleID',
  select.ct = c('acinar','alpha','beta','delta','ductal','gamma')
) -> est

R sessionInfo:


R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Rocky Linux 8.7 (Green Obsidian)

Matrix products: default BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.15.so

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] stats4 stats graphics grDevices utils datasets methods base

other attached packages: [1] SingleCellExperiment_1.20.0 SummarizedExperiment_1.28.0 GenomicRanges_1.50.1 GenomeInfoDb_1.34.3
[5] IRanges_2.32.0 S4Vectors_0.36.0 MatrixGenerics_1.10.0 matrixStats_1.0.0
[9] Biobase_2.58.0 BiocGenerics_0.44.0 MuSiC_1.0.0 TOAST_1.10.1
[13] quadprog_1.5-8 limma_3.54.0 EpiDISH_2.12.0 ggplot2_3.4.2
[17] nnls_1.4

loaded via a namespace (and not attached): [1] Rcpp_1.0.11 lattice_0.21-8 tidyr_1.3.0 corpcor_1.6.10 class_7.3-22
[6] foreach_1.5.2 utf8_1.2.3 R6_2.5.1 plyr_1.8.8 MatrixModels_0.5-2
[11] coda_0.19-4 e1071_1.7-13 pillar_1.9.0 zlibbioc_1.44.0 rlang_1.1.1
[16] rstudioapi_0.15.0 SparseM_1.81 Matrix_1.6-0 splines_4.2.0 stringr_1.5.0
[21] RCurl_1.98-1.12 munsell_0.5.0 proxy_0.4-27 DelayedArray_0.24.0 compiler_4.2.0
[26] pkgconfig_2.0.3 mcmc_0.9-7 tidyselect_1.2.0 tibble_3.2.1 GenomeInfoDbData_1.2.9 [31] codetools_0.2-19 reshape_0.8.9 fansi_1.0.4 withr_2.5.0 dplyr_1.1.2
[36] MASS_7.3-60 bitops_1.0-7 grid_4.2.0 GGally_2.1.2 gtable_0.3.3
[41] lifecycle_1.0.3 magrittr_2.0.3 scales_1.2.1 cli_3.6.1 stringi_1.7.12
[46] XVector_0.38.0 doParallel_1.0.17 generics_0.1.3 vctrs_0.6.3 locfdr_1.1-8
[51] RColorBrewer_1.1-3 iterators_1.0.14 tools_4.2.0 glue_1.6.2 purrr_1.0.1
[56] parallel_4.2.0 survival_3.5-5 colorspace_2.1-0 quantreg_5.96 MCMCpack_1.6-3