stuart-lab / signac

R toolkit for the analysis of single-cell chromatin data
https://stuartlab.org/signac/
Other
318 stars 86 forks source link

Can't upload CoveragePlot() to shinyapps.io because fragment file can not be found #791

Closed matasV99 closed 3 years ago

matasV99 commented 3 years ago

I am trying to make a shiny app that visualizes the tracks using the CoveragePlot() function from Signac. Unfortunately, even though my shiny app works perfectly well locally, there seems to be an issue with the way CoveragePlot() accesses fragments (by using a direct path) that does not allow the shiny app to access the fragment files on the shinyapps.io server.

Here is the error that I can see in my logs in shinyapps.io:

2021-09-12T06:34:28.226862+00:00 shinyapps[4635196]: Did you forget to set the RETICULATE_PYTHON environment variable in your .Rprofile before publishing? 2021-09-12T06:34:28.226955+00:00 shinyapps[4635196]: Using pandoc: /opt/connect/ext/pandoc/2.11 2021-09-12T06:34:28.326451+00:00 shinyapps[4635196]: Using jsonlite for JSON processing 2021-09-12T06:34:28.329975+00:00 shinyapps[4635196]: 2021-09-12T06:34:28.329976+00:00 shinyapps[4635196]: Starting R with process ID: '26' 2021-09-12T06:34:29.984730+00:00 shinyapps[4635196]: Attaching SeuratObject 2021-09-12T06:34:51.800431+00:00 shinyapps[system]: Out of memory! 2021-09-12T06:36:19.425605+00:00 shinyapps[4635196]: LANG: C.UTF-8 2021-09-12T06:36:19.420887+00:00 shinyapps[4635196]: Running on host: 0ed6a28d666e 2021-09-12T06:36:19.425573+00:00 shinyapps[4635196]: Server version: 1.8.6.1 2021-09-12T06:36:19.425686+00:00 shinyapps[4635196]: jsonlite version: 1.7.2 2021-09-12T06:36:19.425607+00:00 shinyapps[4635196]: R version: 4.0.1 2021-09-12T06:36:19.425672+00:00 shinyapps[4635196]: httpuv version: 1.6.3 2021-09-12T06:36:19.425671+00:00 shinyapps[4635196]: shiny version: 1.6.0 2021-09-12T06:36:19.425687+00:00 shinyapps[4635196]: htmltools version: 0.5.2 2021-09-12T06:36:19.425685+00:00 shinyapps[4635196]: rmarkdown version: (none) 2021-09-12T06:36:19.425685+00:00 shinyapps[4635196]: knitr version: (none) 2021-09-12T06:36:19.425686+00:00 shinyapps[4635196]: RJSONIO version: (none) 2021-09-12T06:36:20.490633+00:00 shinyapps[4635196]: warning: using reticulate but python was not specified; will use python at /usr/bin/python3 2021-09-12T06:36:20.490635+00:00 shinyapps[4635196]: Did you forget to set the RETICULATE_PYTHON environment variable in your .Rprofile before publishing? 2021-09-12T06:36:20.490749+00:00 shinyapps[4635196]: Using pandoc: /opt/connect/ext/pandoc/2.11 2021-09-12T06:36:20.592704+00:00 shinyapps[4635196]: Using jsonlite for JSON processing 2021-09-12T06:36:20.596295+00:00 shinyapps[4635196]: 2021-09-12T06:36:20.596298+00:00 shinyapps[4635196]: Starting R with process ID: '24' 2021-09-12T06:36:22.289098+00:00 shinyapps[4635196]: Attaching SeuratObject 2021-09-12T06:36:45.280444+00:00 shinyapps[4635196]: 2021-09-12T06:36:45.280447+00:00 shinyapps[4635196]: Listening on http://127.0.0.1:46350 2021-09-12T06:36:49.094369+00:00 shinyapps[4635196]: Warning: Error in : TabixFile: file(s) do not exist: 2021-09-12T06:36:49.094371+00:00 shinyapps[4635196]: ‘/n/scratch3/users/m/mav508/Aug28_2021/MV_DRG_ATAC/atac_v1_pbmc_10k_fragments.tsv.gz’ 2021-09-12T06:36:49.102990+00:00 shinyapps[4635196]: 187: value[[3L]] 2021-09-12T06:36:49.102990+00:00 shinyapps[4635196]: 185: tryCatchList 2021-09-12T06:36:49.094372+00:00 shinyapps[4635196]: ‘/n/scratch3/users/m/mav508/Aug28_2021/MV_DRG_ATAC/atac_v1_pbmc_10k_fragments.tsv.gz.tbi’ 2021-09-12T06:36:49.103001+00:00 shinyapps[4635196]: 182: CutMatrix 2021-09-12T06:36:49.103002+00:00 shinyapps[4635196]: 181: SingleCoveragePlot 2021-09-12T06:36:49.102989+00:00 shinyapps[4635196]: 188: stop 2021-09-12T06:36:49.102991+00:00 shinyapps[4635196]: 183: TabixFile 2021-09-12T06:36:49.102990+00:00 shinyapps[4635196]: 186: tryCatchOne 2021-09-12T06:36:49.103003+00:00 shinyapps[4635196]: 177: func 2021-09-12T06:36:49.103002+00:00 shinyapps[4635196]: 180: CoveragePlot 2021-09-12T06:36:49.102991+00:00 shinyapps[4635196]: 184: tryCatch 2021-09-12T06:36:49.103003+00:00 shinyapps[4635196]: 137: drawPlot 2021-09-12T06:36:49.103004+00:00 shinyapps[4635196]: 123: 2021-09-12T06:36:49.103003+00:00 shinyapps[4635196]: 179: renderPlot 2021-09-12T06:36:49.103004+00:00 shinyapps[4635196]: 107: drawReactive 2021-09-12T06:36:49.103004+00:00 shinyapps[4635196]: 94: renderFunc 2021-09-12T06:36:49.103004+00:00 shinyapps[4635196]: 93: output$plot 2021-09-12T06:36:49.103005+00:00 shinyapps[4635196]: 13: runApp 2021-09-12T06:36:49.103005+00:00 shinyapps[4635196]: 12: fn 2021-09-12T06:36:49.103005+00:00 shinyapps[4635196]: 7: connect$retry 2021-09-12T06:36:49.103006+00:00 shinyapps[4635196]: 6: eval 2021-09-12T06:36:49.103006+00:00 shinyapps[4635196]: 5: eval

Here is some information about my app folder structure: image

Is there any workaround around this? I have tried to input the fragment file into the /data folder in my shiny app, I have tried to modify the path to fragment files so that CreateChromatinAssay accesses "data/fragments.gz" file in a manner that's self contained in my shiny app. Nothing worked. Maybe it is possible to change up the source code of the CoveragePlot() somehow to be able to specify where to access the fragment file from? That way I could just call on it in my shiny app?!?


I am going to try my best to reproduce my problem using a pbmc Seurat object from this vignette: https://satijalab.org/signac/articles/pbmc_vignette.html

Here is the script to generate the Seurat object I want to use for my shiny app:

#load correct libraries
myPaths <- c("/home/mav508/R/library-lite/")
.libPaths(myPaths)
.libPaths()

library(Signac)
library(Seurat)
library(GenomeInfoDb)
library(EnsDb.Hsapiens.v75)
library(ggplot2)
library(patchwork)
set.seed(1234)

setwd("/n/scratch3/users/m/mav508/Aug28_2021/MV_DRG_ATAC/app-test_Tracks")

counts <- Read10X_h5(filename = "atac_v1_pbmc_10k_filtered_peak_bc_matrix.h5")
metadata <- read.csv(
  file = "atac_v1_pbmc_10k_singlecell.csv",
  header = TRUE,
  row.names = 1
)

chrom_assay <- CreateChromatinAssay(
  counts = counts,
  sep = c(":", "-"),
  genome = 'hg19',
  fragments = 'atac_v1_pbmc_10k_fragments.tsv.gz',
  min.cells = 10,
  min.features = 200
)

pbmc <- CreateSeuratObject(
  counts = chrom_assay,
  assay = "peaks",
  meta.data = metadata
)

# extract gene annotations from EnsDb
annotations <- GetGRangesFromEnsDb(ensdb = EnsDb.Hsapiens.v75)

# change to UCSC style since the data was mapped to hg19
seqlevelsStyle(annotations) <- 'UCSC'
genome(annotations) <- "hg19"

# add the gene information to the object
Annotation(pbmc) <- annotations

# compute nucleosome signal score per cell
pbmc <- NucleosomeSignal(object = pbmc)

# compute TSS enrichment score per cell
pbmc <- TSSEnrichment(object = pbmc, fast = FALSE)

# add blacklist ratio and fraction of reads in peaks
pbmc$pct_reads_in_peaks <- pbmc$peak_region_fragments / pbmc$passed_filters * 100
pbmc$blacklist_ratio <- pbmc$blacklist_region_fragments / pbmc$peak_region_fragments

pbmc$high.tss <- ifelse(pbmc$TSS.enrichment > 2, 'High', 'Low')
pbmc$nucleosome_group <- ifelse(pbmc$nucleosome_signal > 4, 'NS > 4', 'NS < 4')

pbmc <- subset(
  x = pbmc,
  subset = peak_region_fragments > 3000 &
    peak_region_fragments < 20000 &
    pct_reads_in_peaks > 15 &
    blacklist_ratio < 0.05 &
    nucleosome_signal < 4 &
    TSS.enrichment > 2
)

pbmc <- RunTFIDF(pbmc)
pbmc <- FindTopFeatures(pbmc, min.cutoff = 'q0')
pbmc <- RunSVD(pbmc)

pbmc <- RunUMAP(object = pbmc, reduction = 'lsi', dims = 2:30)
pbmc <- FindNeighbors(object = pbmc, reduction = 'lsi', dims = 2:30)
pbmc <- FindClusters(object = pbmc, verbose = FALSE, algorithm = 3)

gene.activities <- GeneActivity(pbmc)
# add the gene activity matrix to the Seurat object as a new assay and normalize it
pbmc[['RNA']] <- CreateAssayObject(counts = gene.activities)
pbmc <- NormalizeData(
  object = pbmc,
  assay = 'RNA',
  normalization.method = 'LogNormalize',
  scale.factor = median(pbmc$nCount_RNA)
)

DefaultAssay(pbmc) <- 'RNA'
# Load the pre-processed scRNA-seq data for PBMCs
pbmc_rna <- readRDS("pbmc_10k_v3.rds")

transfer.anchors <- FindTransferAnchors(
  reference = pbmc_rna,
  query = pbmc,
  reduction = 'cca'
  )

predicted.labels <- TransferData(
  anchorset = transfer.anchors,
  refdata = pbmc_rna$celltype,
  weight.reduction = pbmc[['lsi']],
  dims = 2:30
)

pbmc <- AddMetaData(object = pbmc, metadata = predicted.labels)

pbmc <- subset(pbmc, idents = 14, invert = TRUE)
pbmc <- RenameIdents(
  object = pbmc,
  '0' = 'CD14 Mono',
  '1' = 'CD4 Memory',
  '2' = 'CD8 Effector',
  '3' = 'CD4 Naive',
  '4' = 'CD14 Mono',
  '5' = 'DN T',
  '6' = 'CD8 Naive',
  '7' = 'NK CD56Dim',
  '8' = 'pre-B',
  '9' = 'CD16 Mono',
  '10' = 'pro-B',
  '11' = 'DC',
  '12' = 'NK CD56bright',
  '13' = 'pDC'
)

# change back to working with peaks instead of gene activities
DefaultAssay(pbmc) <- 'peaks'

levels(pbmc) <- c("CD4 Naive","CD4 Memory","CD8 Naive","CD8 Effector","DN T","NK CD56bright","NK CD56Dim","pre-B",'pro-B',"pDC","DC","CD14 Mono",'CD16 Mono')

# save pbmc files
saveRDS(pbmc, file = "pbmc_atac.Rds")
saveRDS(pbmc_rna, file = "pbmc_rna.Rds")

Here is my app.R .

myPaths <- c("/home/mav508/R/library-lite/")
.libPaths(myPaths)
.libPaths()
library(shiny)
library(Seurat)
library(Signac)
library(ggplot2)
setwd("/n/scratch3/users/m/mav508/Aug28_2021/MV_DRG_ATAC/app-test_Tracks")

pbmc_atac <- readRDS(file = "pbmc_atac.Rds")
DefaultAssay(pbmc_atac) = "peaks"

ui <- fluidPage(
  titlePanel("PBMC open-chromatin regions"),
  sidebarLayout(
    sidebarPanel(
      helpText("PBMC fragments per cell-type visualized as peaks
               alongside the RNA expression data for a selected gene. It takes about 30-45 seconds for the plot to generate."),
      # textInput(inputID = "name", label = "Select the name of the gene of interest (e.g.PPCDC, RXRA)", value = "PPCDC"),
      # textInput(inputID = "region", label = "input the genomic region of the gene of interest (e.g. chr2-113581628-113594911; chr15-75334903-75336779)", value = " chr2-113581628-113594911"),
      # numericInput(inputID = "rangeup", label = "number of kilobases upstream of your region of interest", min = 0, max = 100000000, value = 20000),
      # numericInput(inputID = "rangedown", label = "number of kilobases downstream of your region of interest", min = 0, max = 100000000, value = 20000),
      submitButton(text = "Submit")
    ),

    mainPanel(plotOutput("tracks"))
  )
)

server <- function(input, output, session) {
  output$tracks <- renderPlot({
    CoveragePlot(
      object = pbmc_atac,
      region = "chr2-113581628-113594911", #input$region 
      features = "PPCDC", #input$name
      peaks = TRUE,
      extend.upstream = 20000, #input$rangeup
      extend.downstream = 20000 #input$rangedown
    )
  })

}
shinyApp(ui = ui, server = server)

My sessioninfo()

Here is sessionInfo() R version 4.0.1 (2020-06-06) Platform: x86_64-pc-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core) Matrix products: default BLAS/LAPACK: /n/app/openblas/0.2.19/lib/libopenblas_core2p-r0.2.19.so locale: [1] C attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base other attached packages: [1] EnsDb.Hsapiens.v75_2.99.0 ensembldb_2.14.1 AnnotationFilter_1.14.0 GenomicFeatures_1.42.3 [5] AnnotationDbi_1.52.0 Biobase_2.50.0 GenomicRanges_1.42.0 patchwork_1.1.1 [9] GenomeInfoDb_1.26.7 IRanges_2.24.1 S4Vectors_0.28.1 BiocGenerics_0.36.1 [13] ggplot2_3.3.5 Signac_1.3.0 SeuratObject_4.0.2 Seurat_4.0.3 [17] shiny_1.6.0 loaded via a namespace (and not attached): [1] utf8_1.2.1 reticulate_1.20 tidyselect_1.1.1 RSQLite_2.2.7 [5] htmlwidgets_1.5.3 grid_4.0.1 docopt_0.7.1 BiocParallel_1.24.1 [9] Rtsne_0.15 munsell_0.5.0 codetools_0.2-16 ica_1.0-2 [13] future_1.21.0 miniUI_0.1.1.1 withr_2.4.2 colorspace_2.0-2 [17] knitr_1.33 rstudioapi_0.13 ROCR_1.0-11 tensor_1.5 [21] listenv_0.8.0 labeling_0.4.2 MatrixGenerics_1.2.1 slam_0.1-48 [25] GenomeInfoDbData_1.2.4 polyclip_1.10-0 bit64_4.0.5 farver_2.1.0 [29] parallelly_1.27.0 vctrs_0.3.8 generics_0.1.0 xfun_0.24 [33] biovizBase_1.38.0 BiocFileCache_1.14.0 lsa_0.73.2 ggseqlogo_0.1 [37] R6_2.5.0 hdf5r_1.3.3 bitops_1.0-7 spatstat.utils_2.2-0 [41] cachem_1.0.5 DelayedArray_0.16.3 assertthat_0.2.1 promises_1.2.0.1 [45] scales_1.1.1 nnet_7.3-14 gtable_0.3.0 globals_0.14.0 [49] goftest_1.2-2 rlang_0.4.11 RcppRoll_0.3.0 splines_4.0.1 [53] rtracklayer_1.50.0 lazyeval_0.2.2 dichromat_2.0-0 checkmate_2.0.0 [57] spatstat.geom_2.2-0 BiocManager_1.30.16 reshape2_1.4.4 abind_1.4-5 [61] backports_1.2.1 httpuv_1.6.1 Hmisc_4.5-0 tools_4.0.1 [65] ellipsis_0.3.2 spatstat.core_2.2-0 RColorBrewer_1.1-2 ggridges_0.5.3 [69] Rcpp_1.0.7 plyr_1.8.6 base64enc_0.1-3 progress_1.2.2 [73] zlibbioc_1.36.0 purrr_0.3.4 RCurl_1.98-1.3 prettyunits_1.1.1 [77] rpart_4.1-15 openssl_1.4.4 deldir_0.2-10 pbapply_1.4-3 [81] cowplot_1.1.1 zoo_1.8-9 SummarizedExperiment_1.20.0 ggrepel_0.9.1 [85] cluster_2.1.0 magrittr_2.0.1 RSpectra_0.16-0 data.table_1.14.0 [89] scattermore_0.7 lmtest_0.9-38 RANN_2.6.1 SnowballC_0.7.0 [93] ProtGenerics_1.22.0 fitdistrplus_1.1-5 matrixStats_0.59.0 hms_1.1.0 [97] mime_0.11 xtable_1.8-4 XML_3.99-0.6 jpeg_0.1-8.1 [101] sparsesvd_0.2 gridExtra_2.3 compiler_4.0.1 biomaRt_2.46.3 [105] tibble_3.1.2 KernSmooth_2.23-17 crayon_1.4.1 htmltools_0.5.1.1 [109] mgcv_1.8-31 later_1.2.0 Formula_1.2-4 tidyr_1.1.3 [113] DBI_1.1.1 tweenr_1.0.2 dbplyr_2.1.1 MASS_7.3-51.6 [117] rappdirs_0.3.3 Matrix_1.3-4 igraph_1.2.6 pkgconfig_2.0.3 [121] GenomicAlignments_1.26.0 foreign_0.8-80 plotly_4.9.4.1 spatstat.sparse_2.0-0 [125] xml2_1.3.2 XVector_0.30.0 VariantAnnotation_1.36.0 stringr_1.4.0 [129] digest_0.6.27 sctransform_0.3.2 RcppAnnoy_0.0.18 spatstat.data_2.1-0 [133] Biostrings_2.58.0 leiden_0.3.8 fastmatch_1.1-0 htmlTable_2.2.1 [137] uwot_0.1.10 curl_4.3.2 Rsamtools_2.6.0 lifecycle_1.0.0 [141] nlme_3.1-148 jsonlite_1.7.2 BSgenome_1.58.0 viridisLite_0.4.0 [145] askpass_1.1 fansi_0.5.0 pillar_1.6.1 lattice_0.20-41 [149] fastmap_1.1.0 httr_1.4.2 survival_3.1-12 glue_1.4.2 [153] qlcMatrix_0.9.7 png_0.1-7 bit_4.0.4 ggforce_0.3.3 [157] stringi_1.7.3 blob_1.2.1 latticeExtra_0.6-29 memoise_2.0.0 [161] dplyr_1.0.7 irlba_2.3.3 future.apply_1.7.0

Thank you for your help.

timoast commented 3 years ago

If you move the Seurat object to a different server, you will need to update the path to the fragment files in the ChromatinAssay. See https://satijalab.org/signac/articles/data_structures.html#changing-the-fragment-file-path-in-an-existing-fragment-object-1

Alternatively you can store the fragment files on a remote server accessible by http or ftp, see https://satijalab.org/signac/articles/data_structures.html#using-remote-fragment-files-1