GreenleafLab / chromVAR

chromatin Variability Across Regions (of the genome!)
https://greenleaflab.github.io/chromVAR/

Error addGCBias: trying to load regions beyond the boundaries of non-circular sequence "chr13" #110

Open massonix opened 9 months ago

massonix commented 9 months ago

Hi,

Thanks for developing this wonderful package. I am running chromVAR on my bulk ATAC-seq data. After running the following code chunk:

fragment_counts <- addGCBias(
  fragment_counts,
  genome = BSgenome.Hsapiens.UCSC.hg38
)

I get the following error:

Error in loadFUN(x, seqname, ranges) : 
  trying to load regions beyond the boundaries of non-circular sequence
  "chr13"

This is my session info:

> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /gpfs/commons/home/rmassoni/.conda/envs/test/lib/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] here_1.0.1                        BSgenome.Hsapiens.UCSC.hg38_1.4.5
 [3] BSgenome_1.70.2                   rtracklayer_1.62.0               
 [5] BiocIO_1.12.0                     Biostrings_2.70.2                
 [7] XVector_0.42.0                    motifmatchr_1.24.0               
 [9] JASPAR2020_0.99.10                TFBSTools_1.40.0                 
[11] BiocParallel_1.36.0               EnvStats_2.8.1                   
[13] chromVAR_1.24.0                   DESeq2_1.42.0                    
[15] SummarizedExperiment_1.32.0       Biobase_2.62.0                   
[17] MatrixGenerics_1.14.0             matrixStats_1.2.0                
[19] GenomicRanges_1.54.1              GenomeInfoDb_1.38.6              
[21] IRanges_2.36.0                    S4Vectors_0.40.2                 
[23] BiocGenerics_0.48.1               lubridate_1.9.3                  
[25] forcats_1.0.0                     stringr_1.5.1                    
[27] dplyr_1.1.4                       purrr_1.0.2                      
[29] readr_2.1.5                       tidyr_1.3.1                      
[31] tibble_3.2.1                      ggplot2_3.4.4                    
[33] tidyverse_2.0.0                  

loaded via a namespace (and not attached):
 [1] DBI_1.2.1                   bitops_1.0-7               
 [3] rlang_1.1.3                 magrittr_2.0.3             
 [5] compiler_4.3.2              RSQLite_2.3.5              
 [7] png_0.1-8                   vctrs_0.6.5                
 [9] reshape2_1.4.4              pkgconfig_2.0.3            
[11] crayon_1.5.2                fastmap_1.1.1              
[13] ellipsis_0.3.2              caTools_1.18.2             
[15] utf8_1.2.4                  Rsamtools_2.18.0           
[17] promises_1.2.1              pracma_2.4.4               
[19] tzdb_0.4.0                  DirichletMultinomial_1.44.0
[21] bit_4.0.5                   zlibbioc_1.48.0            
[23] cachem_1.0.8                CNEr_1.38.0                
[25] jsonlite_1.8.8              blob_1.2.4                 
[27] later_1.3.2                 DelayedArray_0.28.0        
[29] parallel_4.3.2              R6_2.5.1                   
[31] stringi_1.8.3               Rcpp_1.0.12                
[33] R.utils_2.12.3              httpuv_1.6.14              
[35] Matrix_1.6-5                timechange_0.3.0           
[37] tidyselect_1.2.0            abind_1.4-5                
[39] yaml_2.3.8                  codetools_0.2-19           
[41] miniUI_0.1.1.1              lattice_0.22-5             
[43] plyr_1.8.9                  KEGGREST_1.42.0            
[45] shiny_1.8.0                 withr_3.0.0                
[47] pillar_1.9.0                DT_0.31                    
[49] plotly_4.10.4               generics_0.1.3             
[51] vroom_1.6.5                 rprojroot_2.0.4            
[53] RCurl_1.98-1.14             hms_1.1.3                  
[55] munsell_0.5.0               scales_1.3.0               
[57] gtools_3.9.5                xtable_1.8-4               
[59] glue_1.7.0                  lazyeval_0.2.2             
[61] seqLogo_1.68.0              tools_4.3.2                
[63] TFMPvalue_0.0.9             data.table_1.15.0          
[65] annotate_1.80.0             locfit_1.5-9.8             
[67] GenomicAlignments_1.38.2    XML_3.99-0.16.1            
[69] poweRlaw_0.80.0             grid_4.3.2                 
[71] AnnotationDbi_1.64.1        colorspace_2.1-0           
[73] GenomeInfoDbData_1.2.11     restfulr_0.0.15            
[75] cli_3.6.2                   fansi_1.0.6                
[77] viridisLite_0.4.2           S4Arrays_1.2.0             
[79] gtable_0.3.4                R.methodsS3_1.8.2          
[81] digest_0.6.34               SparseArray_1.2.3          
[83] rjson_0.2.21                htmlwidgets_1.6.4          
[85] R.oo_1.26.0                 memoise_2.0.1              
[87] htmltools_0.5.7             lifecycle_1.0.4            
[89] httr_1.4.7                  GO.db_3.18.0               
[91] mime_0.12                   bit64_4.0.5  

Could you help figure out how to solve it?

Thanks

Ramon

skkanlei commented 9 months ago

I get the same error! I think it most likely happens because the BSgenome I am using names its chromosomes differently from the reference my data were mapped to. I have used seqlevelsStyle and keepStandardChromosomes, but I still get the same error.

My code:

library(BiocParallel)
register(MulticoreParam(8)) # Use 8 cores
library(chromVAR)
library(motifmatchr)
library(SummarizedExperiment)
library(Matrix)
library(ggplot2)
library(BiocParallel)
library(BSgenome.Btaurus.UCSC.bosTau9)

peakfile <- c("/home/QiXin/cattle_raw/13T2/atac_peaks.bed")
peaks <- readNarrowpeaks(peakfile)

bamfiles <- c("/home/QiXin/cattle_raw/13T2/atac_possorted_bam.bam")
fragment_counts <- getCounts(bamfiles, peaks, paired = TRUE, by_rg = FALSE, format = "bam")
seqlevelsStyle(fragment_counts) <- "UCSC"
head(rowData(fragment_counts))
fragment_counts <- keepStandardChromosomes(fragment_counts, pruning.mode = "coarse")

example_counts <- addGCBias(fragment_counts, genome = BSgenome.Btaurus.UCSC.bosTau9)

sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /opt/software/R-4.3/lib64/R/lib/libRblas.so
LAPACK: /opt/software/R-4.3/lib64/R/lib/libRlapack.so; LAPACK version 3.11.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: Asia/Shanghai
tzcode source: system (glibc)

attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets methods
[9] base

other attached packages:
[1] JASPAR2022_0.99.8 SeuratWrappers_0.2.0
[3] cicero_1.3.9 Gviz_1.46.1
[5] monocle3_1.3.5 SingleCellExperiment_1.24.0
[7] TFBSTools_1.40.0 ggsci_3.0.0
[9] paletteer_1.5.0 RColorBrewer_1.1-3
[11] biovizBase_1.50.0 AnnotationHub_3.10.0
[13] BiocFileCache_2.10.1 dbplyr_2.4.0
[15] EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.26.0
[17] AnnotationFilter_1.26.0 GenomicFeatures_1.54.1
[19] AnnotationDbi_1.64.1 Signac_1.11.0
[21] dplyr_1.1.4 patchwork_1.1.3
[23] SeuratObject_4.1.3 Seurat_4.3.0
[25] ggplot2_3.4.4 Matrix_1.6-1
[27] SummarizedExperiment_1.32.0 Biobase_2.62.0
[29] MatrixGenerics_1.14.0 matrixStats_1.2.0
[31] motifmatchr_1.24.0 chromVAR_1.24.0
[33] BiocParallel_1.36.0 BSgenome.Btaurus.UCSC.bosTau9_1.4.2
[35] BSgenome_1.70.1 rtracklayer_1.62.0
[37] BiocIO_1.12.0 Biostrings_2.70.1
[39] XVector_0.42.0 GenomicRanges_1.54.1
[41] GenomeInfoDb_1.38.4 IRanges_2.36.0
[43] S4Vectors_0.40.2 BiocGenerics_0.48.1

loaded via a namespace (and not attached):
[1] R.methodsS3_1.8.2 dichromat_2.0-0.1
[3] progress_1.2.3 urlchecker_1.0.1
[5] nnet_7.3-19 poweRlaw_0.70.6
[7] goftest_1.2-3 DT_0.31
[9] vctrs_0.6.5 spatstat.random_3.2-2
[11] digest_0.6.33 png_0.1-8
[13] ggrepel_0.9.4 deldir_2.0-2
[15] parallelly_1.36.0 MASS_7.3-60
[17] reshape2_1.4.4 httpuv_1.6.13
[19] withr_2.5.2 xfun_0.41
[21] ellipsis_0.3.2 survival_3.5-7
[23] memoise_2.0.1 profvis_0.3.8
[25] zoo_1.8-12 gtools_3.9.5
[27] pbapply_1.7-2 R.oo_1.25.0
[29] Formula_1.2-5 prettyunits_1.2.0
[31] rematch2_2.1.2 KEGGREST_1.42.0
[33] promises_1.2.1 httr_1.4.7
[35] restfulr_0.0.15 globals_0.16.2
[37] fitdistrplus_1.1-11 ps_1.7.5
[39] rstudioapi_0.15.0 miniUI_0.1.1.1
[41] generics_0.1.3 base64enc_0.1-3
[43] processx_3.8.3 curl_5.2.0
[45] fields_15.2 zlibbioc_1.48.0
[47] polyclip_1.10-6 GenomeInfoDbData_1.2.11
[49] SparseArray_1.2.3 interactiveDisplayBase_1.40.0
[51] xtable_1.8-4 stringr_1.5.1
[53] desc_1.4.3 pracma_2.4.4
[55] evaluate_0.23 S4Arrays_1.2.0
[57] hms_1.1.3 irlba_2.3.5.1
[59] colorspace_2.1-0 filelock_1.0.3
[61] ROCR_1.0-11 reticulate_1.34.0
[63] spatstat.data_3.0-3 magrittr_2.0.3
[65] lmtest_0.9-40 readr_2.1.4
[67] later_1.3.2 lattice_0.21-9
[69] spatstat.geom_3.2-7 future.apply_1.11.1
[71] scattermore_1.2 XML_3.99-0.16
[73] cowplot_1.1.2 RcppAnnoy_0.0.21
[75] Hmisc_5.1-1 pillar_1.9.0
[77] nlme_3.1-163 caTools_1.18.2
[79] compiler_4.3.2 stringi_1.8.3
[81] tensor_1.5 minqa_1.2.6
[83] devtools_2.4.5 GenomicAlignments_1.38.0
[85] plyr_1.8.9 crayon_1.5.2
[87] abind_1.4-5 sp_2.1-2
[89] bit_4.0.5 fastmatch_1.1-4
[91] codetools_0.2-19 plotly_4.10.3
[93] mime_0.12 splines_4.3.2
[95] Rcpp_1.0.11 interp_1.1-6
[97] knitr_1.45 blob_1.2.4
[99] utf8_1.2.4 BiocVersion_3.18.1
[101] seqLogo_1.68.0 lme4_1.1-35.1
[103] fs_1.6.3 listenv_0.9.0
[105] checkmate_2.3.1 pkgbuild_1.4.3
[107] tibble_3.2.1 callr_3.7.3
[109] tzdb_0.4.0 pkgconfig_2.0.3
[111] tools_4.3.2 cachem_1.0.8
[113] RSQLite_2.3.4 viridisLite_0.4.2
[115] DBI_1.2.0 fastmap_1.1.1
[117] rmarkdown_2.25 scales_1.3.0
[119] usethis_2.2.2 ica_1.0-3
[121] Rsamtools_2.18.0 BiocManager_1.30.22
[123] dotCall64_1.1-1 VariantAnnotation_1.48.1
[125] RANN_2.6.1 rpart_4.1.21
[127] yaml_2.3.8 VGAM_1.1-9
[129] latticeExtra_0.6-30 foreign_0.8-85
[131] cli_3.6.2 purrr_1.0.2
[133] leiden_0.4.3.1 lifecycle_1.0.4
[135] uwot_0.1.16 sessioninfo_1.2.2
[137] backports_1.4.1 annotate_1.80.0
[139] gtable_0.3.4 rjson_0.2.21
[141] ggridges_0.5.5 progressr_0.14.0
[143] parallel_4.3.2 jsonlite_1.8.8
[145] bitops_1.0-7 bit64_4.0.5
[147] Rtsne_0.17 spatstat.utils_3.0-4
[149] CNEr_1.38.0 R.utils_2.12.3
[151] jjAnno_0.0.3 lazyeval_0.2.2
[153] sceasy_0.0.7 shiny_1.8.0
[155] htmltools_0.5.7 GO.db_3.18.0
[157] sctransform_0.4.1 rappdirs_0.3.3
[159] glue_1.6.2 TFMPvalue_0.0.9
[161] spam_2.10-0 RCurl_1.98-1.13
[163] jpeg_0.1-10 gridExtra_2.3
[165] boot_1.3-28.1 igraph_1.6.0
[167] R6_2.5.1 tidyr_1.3.0
[169] RcppRoll_0.3.0 cluster_2.1.4
[171] pkgload_1.3.3 nloptr_2.0.3
[173] DirichletMultinomial_1.44.0 DelayedArray_0.28.0
[175] tidyselect_1.2.0 ProtGenerics_1.34.0
[177] htmlTable_2.4.2 maps_3.4.2
[179] xml2_1.3.6 future_1.33.1
[181] rsvd_1.0.5 munsell_0.5.0
[183] KernSmooth_2.23-22 data.table_1.14.10
[185] htmlwidgets_1.6.4 biomaRt_2.58.0
[187] rlang_1.1.2 spatstat.sparse_3.0-3
[189] spatstat.explore_3.2-5 remotes_2.4.2.1
[191] fansi_1.0.6
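
A possible way to narrow this down: the message "trying to load regions beyond the boundaries of non-circular sequence" is raised when a requested range runs past the end of a chromosome in the BSgenome object, which can happen even when the chromosome names match (for example if peaks were called against a different assembly, or were widened past a chromosome end). The sketch below flags such peaks. It uses toy data so it runs on its own: chrom_lengths and peaks are hypothetical stand-ins for seqlengths(<your BSgenome object>) and rowRanges(fragment_counts), and find_out_of_bounds is an illustrative helper, not a chromVAR function.

```r
library(GenomicRanges)

# Hypothetical chromosome lengths; in a real session this would be
# seqlengths(BSgenome.Hsapiens.UCSC.hg38) or the genome you pass to addGCBias.
chrom_lengths <- c(chr13 = 1000L, chr14 = 800L)

# Hypothetical peaks; in a real session this would be rowRanges(fragment_counts).
peaks <- GRanges(
  seqnames = c("chr13", "chr13", "chr14"),
  ranges = IRanges(start = c(100, 950, 10), end = c(200, 1050, 60))
)

# Return TRUE for every range that cannot be loaded from the genome:
# its chromosome is missing from `lens`, or it extends outside 1..length.
find_out_of_bounds <- function(gr, lens) {
  chrom <- as.character(seqnames(gr))
  limit <- lens[chrom]               # NA when the chromosome name is unknown
  unname(is.na(limit) | start(gr) < 1 | end(gr) > limit)
}

oob <- find_out_of_bounds(peaks, chrom_lengths)

# Inspect the offending peaks before deciding what to do with them.
print(peaks[oob])
```

If this reports any peaks on real data, one option (an assumption about what fits your analysis, not an official fix) is to drop those rows with fragment_counts[!oob, ] before calling addGCBias, and to double-check that the peaks and BAM files were produced against the same assembly as the BSgenome package.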