statOmics / tradeSeq

TRAjectory-based Differential Expression analysis for SEQuencing data

Error: value for ‘L’ not found #201

Closed AnjaliC4 closed 1 year ago

AnjaliC4 commented 2 years ago

Hi, thanks for this awesome tool!

I have one lineage and want to run associationTest on 4 conditions (maleCase, femaleCase, maleControl, femaleControl). I don't get this error with just two conditions, but with 4 I do. I have tried increasing the number of genes in fitGAM and setting no logFC threshold in associationTest, yet the issue persists. Is it not possible to test 4 conditions in associationTest or conditionTest? I basically want to test whether sex*disease interactions exist along the lineage.

colData(se)$group <- paste(colData(se)$sex, colData(se)$condition, sep = ".")
sce <- fitGAM(
  counts = counts,
  pseudotime = pseudotime[, 1],
  cellWeights = rep(1, nrow(pseudotime)),
  conditions = factor(colData(se)$group),
  U = U_model,
  nknots = 5,
  genes = 1:100
)
PTassocRes <- associationTest(sce, lineages = TRUE)

Thanks for your help!
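A quick check on the combined factor from the snippet above can rule out one possible cause before refitting: a condition level with no cells. This is only an assumption about what might matter, not a confirmed cause of the error. The toy `meta` data frame below stands in for `colData(se)`, and its values are invented for illustration.

```r
# Toy metadata standing in for colData(se); the values are invented.
meta <- data.frame(
  sex = c("male", "female", "male", "female", "male", "female"),
  condition = c("Case", "Case", "Control", "Control", "Case", "Control")
)

# Same construction as in the report: one factor with a level per sex/condition pair.
group <- factor(paste(meta$sex, meta$condition, sep = "."))

# Count cells per level and list any level that ended up empty.
cells_per_group <- table(group)
print(cells_per_group)
empty_levels <- names(cells_per_group)[cells_per_group == 0]
```

If `empty_levels` is non-empty, calling `droplevels()` on the factor before passing it to `conditions` would remove the unused levels.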

koenvandenberge commented 2 years ago

Hi @AnjaliC4,

Thank you for reporting. I have not worked with 4 conditions yet, although I am sure the conditionTest works with three conditions since we did that in the condiments paper, so I would think that 4 conditions would work just as well. Can you confirm that the conditionTest works as expected?

For the associationTest, are you getting the same error when setting lineages=FALSE?
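The checks asked for above could be run in one go with something like the following sketch. `capture_error` is a hypothetical helper, not part of tradeSeq, and `sce` is assumed to be the object returned by fitGAM in the original report; the tradeSeq calls are skipped if either is unavailable.

```r
# Hypothetical helper: call f(...) and return the error message,
# or NA if the call completes without an error.
capture_error <- function(f, ...) {
  tryCatch({
    f(...)
    NA_character_
  }, error = function(e) conditionMessage(e))
}

# Only runs when a fitted object named `sce` and the tradeSeq package are present.
if (exists("sce") && requireNamespace("tradeSeq", quietly = TRUE)) {
  outcome <- c(
    conditionTest  = capture_error(tradeSeq::conditionTest, sce),
    assoc_global   = capture_error(tradeSeq::associationTest, sce, lineages = FALSE),
    assoc_lineages = capture_error(tradeSeq::associationTest, sce, lineages = TRUE)
  )
  print(outcome)  # NA means the call succeeded
}
```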

koenvandenberge commented 2 years ago

I just tried this on a dataset and can confirm that fitGAM, conditionTest and associationTest should work with four conditions, also when setting lineages=TRUE in conditionTest and associationTest, so I am unable to reproduce this issue.

koenvandenberge commented 1 year ago

Closing due to inactivity, feel free to reopen if needed.

cgoneill commented 1 year ago

I ran into the same error with an example dataset (data from Satpathy and Granja et al., 2019; preprocessed according to this vignette) that I was using to test a new pipeline; the subset of the dataset I'm working with only has one condition. I ran the following code to generate trajectories and pseudotime values, deviating from that vignette after preprocessing:

erythroid.sce <- as.SingleCellExperiment(erythroid, assay = "ATAC")

erythroid.sce %<>% slingshot(
  reducedDim = "UMAP", 
  clusterLabels = erythroid.sce$seurat_clusters, 
  start.clus = "1", 
  extend = "n"
)

erythroid.sce <- fitGAM(erythroid.sce, parallel = TRUE)

The fitGAM() step took about 14 hours to run with 512 GB of memory and 16 CPUs allocated on an HPC cluster, so I saved the object as an .rds file immediately afterwards and loaded it the following day with the same memory, time, lscratch, and CPU allocations. When I then ran associationTest() on the reloaded object, I got the following:

> erythroid_ATres <- associationTest(erythroid.sce)
Error: value for ‘L’ not found

For multithreading, I called register(MulticoreParam(workers = future::availableCores()), default = TRUE) at the top of the RMarkdown file I'm testing with, rather than passing a BPPARAM object. Session info:

> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/local/intel/compilers_and_libraries_2020.2.254/linux/mkl/lib/intel64_lin/libmkl_rt.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tradeSeq_1.10.0             Signac_1.8.0                SingleCellExperiment_1.20.0 SummarizedExperiment_1.28.0 Biobase_2.58.0              GenomicRanges_1.50.1       
 [7] GenomeInfoDb_1.34.3         IRanges_2.32.0              S4Vectors_0.36.0            BiocGenerics_0.44.0         MatrixGenerics_1.10.0       matrixStats_0.63.0         
[13] SeuratObject_4.1.3          sp_1.5-1                   

loaded via a namespace (and not attached):
 [1] viridis_0.6.2          tidyr_1.2.1            edgeR_3.40.0           viridisLite_0.4.1      splines_4.2.0          assertthat_0.2.1       GenomeInfoDbData_1.2.9 Rsamtools_2.12.0      
 [9] globals_0.16.1         pillar_1.8.1           lattice_0.20-45        glue_1.6.2             limma_3.54.0           digest_0.6.30          RColorBrewer_1.1-3     XVector_0.38.0        
[17] colorspace_2.0-3       Matrix_1.5-1           pkgconfig_2.0.3        listenv_0.8.0          zlibbioc_1.44.0        purrr_0.3.5            patchwork_1.1.2        scales_1.2.1          
[25] BiocParallel_1.32.1    tibble_3.1.8           mgcv_1.8-40            generics_0.1.3         ggplot2_3.4.0          pbapply_1.6-0          cli_3.4.1              crayon_1.5.2          
[33] magrittr_2.0.3         slingshot_2.5.2        future_1.29.0          fansi_1.0.3            parallelly_1.32.1      nlme_3.1-157           progressr_0.11.0       tools_4.2.0           
[41] data.table_1.14.6      lifecycle_1.0.3        munsell_0.5.0          locfit_1.5-9.6         irlba_2.3.5.1          DelayedArray_0.24.0    Biostrings_2.64.1      compiler_4.2.0        
[49] RcppRoll_0.3.0         rlang_1.0.6            grid_4.2.0             RCurl_1.98-1.9         TrajectoryUtils_1.6.0  rstudioapi_0.14        igraph_1.3.5           bitops_1.0-7          
[57] gtable_0.3.1           codetools_0.2-18       DBI_1.1.3              R6_2.5.1               gridExtra_2.3          dplyr_1.0.10           future.apply_1.10.0    utf8_1.2.2            
[65] fastmatch_1.1-3        princurve_2.1.6        stringi_1.7.8          parallel_4.2.0         Rcpp_1.0.9             vctrs_0.5.1            tidyselect_1.2.0
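
Because the full fit took about 14 hours, a smaller reproduction may be easier to iterate on. The sketch below is a hypothetical helper (`quick_repro` is not part of tradeSeq) that refits on the first few rows only, passes a BPPARAM object explicitly instead of relying on register(), round-trips the result through an .rds file, and then calls associationTest(). Whether any of these steps is related to the error is not established in this thread; the defaults for `n_genes` and `workers` are arbitrary.

```r
# Hypothetical helper for a small-scale reproduction; requires tradeSeq and
# BiocParallel to be installed when it is called.
quick_repro <- function(sce, n_genes = 100, workers = 2) {
  # Explicit parallel backend instead of a globally registered one.
  bp <- BiocParallel::MulticoreParam(workers = workers)

  # Fit only the first n_genes rows to keep the run short.
  keep <- seq_len(min(n_genes, nrow(sce)))
  small <- tradeSeq::fitGAM(sce[keep, ], parallel = TRUE, BPPARAM = bp)

  # Mimic the save-then-reload step from the report.
  path <- tempfile(fileext = ".rds")
  saveRDS(small, path)
  reloaded <- readRDS(path)

  tradeSeq::associationTest(reloaded)
}
```

Usage would be `quick_repro(erythroid.sce)` on the object produced by the slingshot() step, before the full fitGAM() run.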