stuart-lab / signac

R toolkit for the analysis of single-cell chromatin data
https://stuartlab.org/signac/
Other
328 stars 88 forks source link

Need help building shiny app with CoveragePlot #769

Closed matasV99 closed 3 years ago

matasV99 commented 3 years ago

Tldr: How would I put coveragebrowser or CoveragePlot() on a shiny app to share online?

More detailed explanation: I am trying to build a shiny app that visualizes ATAC tracks for different sub-types of my data as well as visualize RNAseq gene expression data of target gene with CoveragePlot(). I would like to have TextInputs define the feature, region, extend.upstream and extend.downstream parameters of CoveragePlot().

I am going to try my best to reproduce my problem using a pbmc Seurat object from this vignette: https://satijalab.org/signac/articles/pbmc_vignette.html

Here is the script to generate the Seurat object I want to use for my shiny app:

#load correct libraries
myPaths <- c("/home/mav508/R/library-lite/")
.libPaths(myPaths)
.libPaths()

library(Signac)
library(Seurat)
library(GenomeInfoDb)
library(EnsDb.Hsapiens.v75)
library(ggplot2)
library(patchwork)
set.seed(1234)

setwd("/n/scratch3/users/m/mav508/Aug28_2021/MV_DRG_ATAC/app-test_Tracks")

counts <- Read10X_h5(filename = "atac_v1_pbmc_10k_filtered_peak_bc_matrix.h5")
metadata <- read.csv(
  file = "atac_v1_pbmc_10k_singlecell.csv",
  header = TRUE,
  row.names = 1
)

chrom_assay <- CreateChromatinAssay(
  counts = counts,
  sep = c(":", "-"),
  genome = 'hg19',
  fragments = 'atac_v1_pbmc_10k_fragments.tsv.gz',
  min.cells = 10,
  min.features = 200
)

pbmc <- CreateSeuratObject(
  counts = chrom_assay,
  assay = "peaks",
  meta.data = metadata
)

# extract gene annotations from EnsDb
annotations <- GetGRangesFromEnsDb(ensdb = EnsDb.Hsapiens.v75)

# change to UCSC style since the data was mapped to hg19
seqlevelsStyle(annotations) <- 'UCSC'
genome(annotations) <- "hg19"

# add the gene information to the object
Annotation(pbmc) <- annotations

# compute nucleosome signal score per cell
pbmc <- NucleosomeSignal(object = pbmc)

# compute TSS enrichment score per cell
pbmc <- TSSEnrichment(object = pbmc, fast = FALSE)

# add blacklist ratio and fraction of reads in peaks
pbmc$pct_reads_in_peaks <- pbmc$peak_region_fragments / pbmc$passed_filters * 100
pbmc$blacklist_ratio <- pbmc$blacklist_region_fragments / pbmc$peak_region_fragments

pbmc$high.tss <- ifelse(pbmc$TSS.enrichment > 2, 'High', 'Low')
pbmc$nucleosome_group <- ifelse(pbmc$nucleosome_signal > 4, 'NS > 4', 'NS < 4')

pbmc <- subset(
  x = pbmc,
  subset = peak_region_fragments > 3000 &
    peak_region_fragments < 20000 &
    pct_reads_in_peaks > 15 &
    blacklist_ratio < 0.05 &
    nucleosome_signal < 4 &
    TSS.enrichment > 2
)

pbmc <- RunTFIDF(pbmc)
pbmc <- FindTopFeatures(pbmc, min.cutoff = 'q0')
pbmc <- RunSVD(pbmc)

pbmc <- RunUMAP(object = pbmc, reduction = 'lsi', dims = 2:30)
pbmc <- FindNeighbors(object = pbmc, reduction = 'lsi', dims = 2:30)
pbmc <- FindClusters(object = pbmc, verbose = FALSE, algorithm = 3)

gene.activities <- GeneActivity(pbmc)
# add the gene activity matrix to the Seurat object as a new assay and normalize it
pbmc[['RNA']] <- CreateAssayObject(counts = gene.activities)
pbmc <- NormalizeData(
  object = pbmc,
  assay = 'RNA',
  normalization.method = 'LogNormalize',
  scale.factor = median(pbmc$nCount_RNA)
)

DefaultAssay(pbmc) <- 'RNA'
# Load the pre-processed scRNA-seq data for PBMCs
pbmc_rna <- readRDS("pbmc_10k_v3.rds")

transfer.anchors <- FindTransferAnchors(
  reference = pbmc_rna,
  query = pbmc,
  reduction = 'cca'
  )

predicted.labels <- TransferData(
  anchorset = transfer.anchors,
  refdata = pbmc_rna$celltype,
  weight.reduction = pbmc[['lsi']],
  dims = 2:30
)

pbmc <- AddMetaData(object = pbmc, metadata = predicted.labels)

pbmc <- subset(pbmc, idents = 14, invert = TRUE)
pbmc <- RenameIdents(
  object = pbmc,
  '0' = 'CD14 Mono',
  '1' = 'CD4 Memory',
  '2' = 'CD8 Effector',
  '3' = 'CD4 Naive',
  '4' = 'CD14 Mono',
  '5' = 'DN T',
  '6' = 'CD8 Naive',
  '7' = 'NK CD56Dim',
  '8' = 'pre-B',
  '9' = 'CD16 Mono',
  '10' = 'pro-B',
  '11' = 'DC',
  '12' = 'NK CD56bright',
  '13' = 'pDC'
)

# change back to working with peaks instead of gene activities
DefaultAssay(pbmc) <- 'peaks'

levels(pbmc) <- c("CD4 Naive","CD4 Memory","CD8 Naive","CD8 Effector","DN T","NK CD56bright","NK CD56Dim","pre-B",'pro-B',"pDC","DC","CD14 Mono",'CD16 Mono')

# save pbmc files
saveRDS(pbmc, file = "pbmc_atac.Rds")
saveRDS(pbmc_rna, file = "pbmc_rna.Rds")

Here is my app.R . I would like to have TextInputs define the feature, region, extend.upstream and extend.downstream parameters of CoveragePlot() instead of the manually typed out parameters here.

myPaths <- c("/home/mav508/R/library-lite/")
.libPaths(myPaths)
.libPaths()
library(shiny)
library(Seurat)
library(Signac)
library(ggplot2)
setwd("/n/scratch3/users/m/mav508/Aug28_2021/MV_DRG_ATAC/app-test_Tracks")

pbmc_atac <- readRDS(file = "pbmc_atac.Rds")
DefaultAssay(pbmc_atac) = "peaks"

ui <- fluidPage(
  titlePanel("PBMC open-chromatin regions"),
  sidebarLayout(
    sidebarPanel(
      helpText("PBMC fragments per cell-type visualized as peaks
               alongside the RNA expression data for a selected gene. It takes about 30-45 seconds for the plot to generate."),
      # textInput(inputID = "name", label = "Select the name of the gene of interest (e.g.PPCDC, RXRA)", value = "PPCDC"),
      # textInput(inputID = "region", label = "input the genomic region of the gene of interest (e.g. chr2-113581628-113594911; chr15-75334903-75336779)", value = " chr2-113581628-113594911"),
      # numericInput(inputID = "rangeup", label = "number of kilobases upstream of your region of interest", min = 0, max = 100000000, value = 20000),
      # numericInput(inputID = "rangedown", label = "number of kilobases downstream of your region of interest", min = 0, max = 100000000, value = 20000),
      submitButton(text = "Submit")
    ),

    mainPanel(plotOutput("tracks"))
  )
)

server <- function(input, output, session) {
  output$tracks <- renderPlot({
    CoveragePlot(
      object = pbmc_atac,
      region = "chr2-113581628-113594911", #input$region 
      features = "PPCDC", #input$name
      peaks = TRUE,
      extend.upstream = 20000, #input$rangeup
      extend.downstream = 20000 #input$rangedown
    )
  })

}
shinyApp(ui = ui, server = server)

Here is what the output looks like: image I would like it to have customizable text inputs that would be fed to the function: image

Here is sessionInfo() R version 4.0.1 (2020-06-06) Platform: x86_64-pc-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core) Matrix products: default BLAS/LAPACK: /n/app/openblas/0.2.19/lib/libopenblas_core2p-r0.2.19.so locale: [1] C attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages: [1] EnsDb.Hsapiens.v75_2.99.0 ensembldb_2.14.1 AnnotationFilter_1.14.0 GenomicFeatures_1.42.3
[5] AnnotationDbi_1.52.0 Biobase_2.50.0 GenomicRanges_1.42.0 patchwork_1.1.1
[9] GenomeInfoDb_1.26.7 IRanges_2.24.1 S4Vectors_0.28.1 BiocGenerics_0.36.1
[13] ggplot2_3.3.5 Signac_1.3.0 SeuratObject_4.0.2 Seurat_4.0.3
[17] shiny_1.6.0
loaded via a namespace (and not attached): [1] utf8_1.2.1 reticulate_1.20 tidyselect_1.1.1 RSQLite_2.2.7
[5] htmlwidgets_1.5.3 grid_4.0.1 docopt_0.7.1 BiocParallel_1.24.1
[9] Rtsne_0.15 munsell_0.5.0 codetools_0.2-16 ica_1.0-2
[13] future_1.21.0 miniUI_0.1.1.1 withr_2.4.2 colorspace_2.0-2
[17] knitr_1.33 rstudioapi_0.13 ROCR_1.0-11 tensor_1.5
[21] listenv_0.8.0 labeling_0.4.2 MatrixGenerics_1.2.1 slam_0.1-48
[25] GenomeInfoDbData_1.2.4 polyclip_1.10-0 bit64_4.0.5 farver_2.1.0
[29] parallelly_1.27.0 vctrs_0.3.8 generics_0.1.0 xfun_0.24
[33] biovizBase_1.38.0 BiocFileCache_1.14.0 lsa_0.73.2 ggseqlogo_0.1
[37] R6_2.5.0 hdf5r_1.3.3 bitops_1.0-7 spatstat.utils_2.2-0
[41] cachem_1.0.5 DelayedArray_0.16.3 assertthat_0.2.1 promises_1.2.0.1
[45] scales_1.1.1 nnet_7.3-14 gtable_0.3.0 globals_0.14.0
[49] goftest_1.2-2 rlang_0.4.11 RcppRoll_0.3.0 splines_4.0.1
[53] rtracklayer_1.50.0 lazyeval_0.2.2 dichromat_2.0-0 checkmate_2.0.0
[57] spatstat.geom_2.2-0 BiocManager_1.30.16 reshape2_1.4.4 abind_1.4-5
[61] backports_1.2.1 httpuv_1.6.1 Hmisc_4.5-0 tools_4.0.1
[65] ellipsis_0.3.2 spatstat.core_2.2-0 RColorBrewer_1.1-2 ggridges_0.5.3
[69] Rcpp_1.0.7 plyr_1.8.6 base64enc_0.1-3 progress_1.2.2
[73] zlibbioc_1.36.0 purrr_0.3.4 RCurl_1.98-1.3 prettyunits_1.1.1
[77] rpart_4.1-15 openssl_1.4.4 deldir_0.2-10 pbapply_1.4-3
[81] cowplot_1.1.1 zoo_1.8-9 SummarizedExperiment_1.20.0 ggrepel_0.9.1
[85] cluster_2.1.0 magrittr_2.0.1 RSpectra_0.16-0 data.table_1.14.0
[89] scattermore_0.7 lmtest_0.9-38 RANN_2.6.1 SnowballC_0.7.0
[93] ProtGenerics_1.22.0 fitdistrplus_1.1-5 matrixStats_0.59.0 hms_1.1.0
[97] mime_0.11 xtable_1.8-4 XML_3.99-0.6 jpeg_0.1-8.1
[101] sparsesvd_0.2 gridExtra_2.3 compiler_4.0.1 biomaRt_2.46.3
[105] tibble_3.1.2 KernSmooth_2.23-17 crayon_1.4.1 htmltools_0.5.1.1
[109] mgcv_1.8-31 later_1.2.0 Formula_1.2-4 tidyr_1.1.3
[113] DBI_1.1.1 tweenr_1.0.2 dbplyr_2.1.1 MASS_7.3-51.6
[117] rappdirs_0.3.3 Matrix_1.3-4 igraph_1.2.6 pkgconfig_2.0.3
[121] GenomicAlignments_1.26.0 foreign_0.8-80 plotly_4.9.4.1 spatstat.sparse_2.0-0
[125] xml2_1.3.2 XVector_0.30.0 VariantAnnotation_1.36.0 stringr_1.4.0
[129] digest_0.6.27 sctransform_0.3.2 RcppAnnoy_0.0.18 spatstat.data_2.1-0
[133] Biostrings_2.58.0 leiden_0.3.8 fastmatch_1.1-0 htmlTable_2.2.1
[137] uwot_0.1.10 curl_4.3.2 Rsamtools_2.6.0 lifecycle_1.0.0
[141] nlme_3.1-148 jsonlite_1.7.2 BSgenome_1.58.0 viridisLite_0.4.0
[145] askpass_1.1 fansi_0.5.0 pillar_1.6.1 lattice_0.20-41
[149] fastmap_1.1.0 httr_1.4.2 survival_3.1-12 glue_1.4.2
[153] qlcMatrix_0.9.7 png_0.1-7 bit_4.0.4 ggforce_0.3.3
[157] stringi_1.7.3 blob_1.2.1 latticeExtra_0.6-29 memoise_2.0.0
[161] dplyr_1.0.7 irlba_2.3.3 future.apply_1.7.0

N.B. an alternative to using CoveragePlot() would be to use a shiny app made by the Satija lab called CoverageBrowser() . My question then would be - how do I make CoverageBrowser() into a web app?

timoast commented 3 years ago

Hi @matasV99

It looks like this is more of a general question about how to make a shiny app in R. Unfortunately we don't have the bandwidth to help with general R queries like this, I'd suggest asking for help on a site like stackoverflow. You can also look at the code for CoverageBrowser to see how I did it in Signac.

If you want to create a web app, a better approach might be to create separate bigwig files for each cell type and use IGV.js or Jbrowse2, which will be much more responsive and easier to deploy. You can do this using the FilterCells() function to create a separate fragment file for each group of cells, and then convert to bedgraph and bigwig format. The advantage of CoveragePlot is that it allows cells to be dynamically grouped into different groups in an interactive session. For a public web app, you probably have a fixed set of cell groupings that you're interested in (cell type annotations or cell clusters for example).