satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Integrating scRNA-seq and scATAC-seq data Tutorial Question #4402

Closed Chloe-Thangavelu closed 3 years ago

Chloe-Thangavelu commented 3 years ago

Hello, I am working through the 2021 "Integrating scRNA-seq and scATAC-seq data" tutorial.

Everything worked until the "quantify gene activity" section, where I get an error running the following command:

> gene.activities <- GeneActivity(pbmc.atac, features = VariableFeatures(pbmc.rna))
Extracting gene coordinates
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'y' in selecting a method for function 'intersect': failed to open file: https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz

When I run traceback(), I get:

> traceback()
15: h(simpleError(msg, call))
14: .handleSimpleError(function (cond) 
    .Internal(C_tryCatchHelper(addr, 1L, cond)), "failed to open file: https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz", 
        base::quote(open.TabixFile(file)))
13: open.TabixFile(file)
12: open(file)
11: .headerTabix(file, ...)
10: seqnamesTabix(file = tbx)
9: seqnamesTabix(file = tbx)
8: intersect(x = seqnames(x = features), y = seqnamesTabix(file = tbx))
7: keepSeqlevels(x = features, value = intersect(x = seqnames(x = features), 
       y = seqnamesTabix(file = tbx)), pruning.mode = "coarse")
6: SingleFeatureMatrix(fragment = fragments[[x]], features = features, 
       cells = cells, sep = sep, verbose = verbose, process_n = process_n)
5: FUN(X[[i]], ...)
4: lapply(X = X, FUN = FUN, ...)
3: sapply(X = obj.use, FUN = function(x) {
       SingleFeatureMatrix(fragment = fragments[[x]], features = features, 
           cells = cells, sep = sep, verbose = verbose, process_n = process_n)
   })
2: FeatureMatrix(fragments = Fragments(object = object[[assay]]), 
       features = transcripts, cells = colnames(x = object[[assay]]), 
       verbose = verbose, ...)
1: GeneActivity(pbmc.atac, features = VariableFeatures(pbmc.rna))

Any input to resolve this issue would be helpful. Thank you!

timoast commented 3 years ago

Can you confirm that your machine has access to the internet? You can run curl::has_internet() in R to check this.

Can you also post the output of sessionInfo()?

Chloe-Thangavelu commented 3 years ago

Sure, see below:

> curl::has_internet()
[1] TRUE
> sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] EnsDb.Hsapiens.v86_2.99.0     ensembldb_2.14.0              AnnotationFilter_1.14.0       GenomicFeatures_1.42.3       
 [5] AnnotationDbi_1.52.0          Biobase_2.50.0                GenomicRanges_1.42.0          GenomeInfoDb_1.26.7          
 [9] IRanges_2.24.1                S4Vectors_0.28.1              BiocGenerics_0.36.0           cowplot_1.1.1                
[13] ggplot2_3.3.3                 Signac_1.1.1                  pbmcMultiome.SeuratData_0.1.0 SeuratData_0.2.1             
[17] githubinstall_0.2.2           devtools_2.4.0                usethis_2.0.1                 SeuratObject_4.0.0           
[21] Seurat_4.0.1                 

loaded via a namespace (and not attached):
  [1] rappdirs_0.3.3              SnowballC_0.7.0             rtracklayer_1.49.5          scattermore_0.7            
  [5] GGally_2.1.1                tidyr_1.1.3                 bit64_4.0.5                 knitr_1.31                 
  [9] irlba_2.3.3                 DelayedArray_0.16.3         data.table_1.14.0           rpart_4.1-15               
 [13] RCurl_1.98-1.3              generics_0.1.0              callr_3.7.0                 RSQLite_2.2.5              
 [17] RANN_2.6.1                  future_1.21.0               bit_4.0.4                   spatstat.data_2.1-0        
 [21] xml2_1.3.2                  httpuv_1.5.5                SummarizedExperiment_1.20.0 assertthat_0.2.1           
 [25] xfun_0.22                   hms_1.0.0                   promises_1.2.0.1            fansi_0.4.2                
 [29] progress_1.2.2              dbplyr_2.1.1                igraph_1.2.6                DBI_1.1.1                  
 [33] htmlwidgets_1.5.3           reshape_0.8.8               spatstat.geom_2.0-1         purrr_0.3.4                
 [37] ellipsis_0.3.1              RSpectra_0.16-0             dplyr_1.0.5                 backports_1.2.1            
 [41] biomaRt_2.46.3              deldir_0.2-10               MatrixGenerics_1.2.1        vctrs_0.3.7                
 [45] remotes_2.3.0               ROCR_1.0-11                 abind_1.4-5                 cachem_1.0.4               
 [49] withr_2.4.1                 ggforce_0.3.3               BSgenome_1.58.0             checkmate_2.0.0            
 [53] sctransform_0.3.2           GenomicAlignments_1.26.0    prettyunits_1.1.1           goftest_1.2-2              
 [57] cluster_2.1.1               lazyeval_0.2.2              crayon_1.4.1                labeling_0.4.2             
 [61] pkgconfig_2.0.3             tweenr_1.0.2                nlme_3.1-152                pkgload_1.2.1              
 [65] ProtGenerics_1.22.0         nnet_7.3-15                 rlang_0.4.10                globals_0.14.0             
 [69] lifecycle_1.0.0             miniUI_0.1.1.1              BiocFileCache_1.14.0        dichromat_2.0-0            
 [73] rprojroot_2.0.2             polyclip_1.10-0             matrixStats_0.58.0          lmtest_0.9-38              
 [77] graph_1.68.0                Matrix_1.3-2                ggseqlogo_0.1               zoo_1.8-9                  
 [81] base64enc_0.1-3             ggridges_0.5.3              processx_3.5.1              png_0.1-7                  
 [85] viridisLite_0.3.0           bitops_1.0-6                KernSmooth_2.23-18          Biostrings_2.58.0          
 [89] blob_1.2.1                  stringr_1.4.0               parallelly_1.24.0           jpeg_0.1-8.1               
 [93] scales_1.1.1                memoise_2.0.0               magrittr_2.0.1              plyr_1.8.6                 
 [97] ica_1.0-2                   zlibbioc_1.36.0             compiler_4.0.5              RColorBrewer_1.1-2         
[101] fitdistrplus_1.1-3          Rsamtools_2.6.0             cli_2.4.0                   XVector_0.30.0             
[105] listenv_0.8.0               patchwork_1.1.1             pbapply_1.4-3               ps_1.6.0                   
[109] htmlTable_2.1.0             Formula_1.2-4               MASS_7.3-53.1               mgcv_1.8-34                
[113] tidyselect_1.1.0            stringi_1.5.3               askpass_1.1                 latticeExtra_0.6-29        
[117] ggrepel_0.9.1               grid_4.0.5                  VariantAnnotation_1.36.0    fastmatch_1.1-0            
[121] tools_4.0.5                 future.apply_1.7.0          rstudioapi_0.13             foreign_0.8-81             
[125] lsa_0.73.2                  gridExtra_2.3               farver_2.1.0                Rtsne_0.15                 
[129] digest_0.6.27               BiocManager_1.30.12         shiny_1.6.0                 Rcpp_1.0.6                 
[133] later_1.1.0.1               RcppAnnoy_0.0.18            OrganismDbi_1.32.0          httr_1.4.2                 
[137] ggbio_1.38.0                biovizBase_1.38.0           colorspace_2.0-0            XML_3.99-0.6               
[141] fs_1.5.0                    tensor_1.5                  reticulate_1.18             splines_4.0.5              
[145] uwot_0.1.10                 RBGL_1.66.0                 RcppRoll_0.3.0              spatstat.utils_2.1-0       
[149] plotly_4.9.3                sessioninfo_1.1.1           xtable_1.8-4                jsonlite_1.7.2             
[153] testthat_3.0.2              R6_2.5.0                    Hmisc_4.5-0                 pillar_1.5.1               
[157] htmltools_0.5.1.1           mime_0.10                   glue_1.4.2                  fastmap_1.1.0              
[161] BiocParallel_1.24.1         codetools_0.2-18            pkgbuild_1.2.0              utf8_1.2.1                 
[165] lattice_0.20-41             spatstat.sparse_2.0-0       tibble_3.1.0                curl_4.3                   
[169] leiden_0.3.7                openssl_1.4.3               survival_3.2-10             desc_1.3.0                 
[173] munsell_0.5.0               GenomeInfoDbData_1.2.4      reshape2_1.4.4              gtable_0.3.0               
[177] spatstat.core_2.0-0        

timoast commented 3 years ago

You need to run curl::has_internet(), with the parentheses, so that the function is actually called rather than its source code being printed.

Chloe-Thangavelu commented 3 years ago

Yes, sorry, I fixed that. It returned TRUE (see the edited output above).

timoast commented 3 years ago

Can you try running this code and see what is returned:

library(Rsamtools)

tbx <- TabixFile(file = "https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz", yieldSize = 10)
scanTabix(file = tbx)

Chloe-Thangavelu commented 3 years ago

That worked. This was returned and the tbx value was created.

> library(Rsamtools)
Loading required package: Biostrings
Loading required package: XVector

Attaching package: 'Biostrings'

The following object is masked from 'package:base':

    strsplit

> tbx <- TabixFile(file = "https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz", yieldSize = 10)

timoast commented 3 years ago

And the final line? scanTabix(file = tbx)

Chloe-Thangavelu commented 3 years ago

This is returned:

> scanTabix(file = tbx)
Error: scanTabix: failed to open file: https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz
 path: https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz
 index: https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz.tbi

timoast commented 3 years ago

OK, this seems like an issue with Rsamtools reading a remote file on Windows. I couldn't find anything in the documentation saying that this is unsupported on Windows, so it could be a bug.

You should open an issue on the Bioconductor support forum (https://support.bioconductor.org) and give this reproducible example:

library(Rsamtools)

tbx <- TabixFile(file = "https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz", yieldSize = 10)
scanTabix(file = tbx)

I'll close this issue for now since it's not an issue with Signac itself.
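
Since the failure is in opening the remote file, one possible workaround is to download the fragment file and its tabix index, then point the object at the local copy. The following is only a sketch under stated assumptions: local_fragment_paths, download_fragments, and repoint_fragments are hypothetical helper names, the "fragments" directory is arbitrary, and the remote file must still be downloadable. Fragments() and UpdatePath() are Signac functions; whether this resolves the error on this particular Windows setup has not been verified in this thread.

```r
# Remote fragment file named in the error message above
frag.url <- "https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz"

# Hypothetical helper: pair the fragment file and its .tbi index with local paths
local_fragment_paths <- function(frag.url, dest.dir) {
  urls <- c(frag.url, paste0(frag.url, ".tbi"))
  list(urls = urls, files = file.path(dest.dir, basename(urls)))
}

# Hypothetical helper: download both files and return the local fragment path
download_fragments <- function(frag.url, dest.dir = "fragments") {
  p <- local_fragment_paths(frag.url, dest.dir)
  dir.create(dest.dir, showWarnings = FALSE, recursive = TRUE)
  # The fragment file is large, so raise the default download timeout
  old <- options(timeout = 3600)
  on.exit(options(old), add = TRUE)
  for (i in seq_along(p$urls)) {
    # mode = "wb" keeps the gzip and index files binary-safe on Windows
    download.file(p$urls[i], destfile = p$files[i], mode = "wb")
  }
  p$files[1]
}

# Hypothetical helper: replace the stored fragment path with the local copy
# (assumes the object holds a single fragment file in its default assay)
repoint_fragments <- function(object, local.path) {
  frags <- Signac::Fragments(object)       # list of Fragment objects
  Signac::Fragments(object) <- NULL        # detach the old fragment information
  frags[[1]] <- Signac::UpdatePath(
    frags[[1]],
    new.path = normalizePath(local.path, mustWork = TRUE)
  )
  Signac::Fragments(object) <- frags       # attach the updated list
  object
}

# Usage, with the objects from the tutorial:
# local.frags <- download_fragments(frag.url)
# pbmc.atac <- repoint_fragments(pbmc.atac, local.frags)
# gene.activities <- GeneActivity(pbmc.atac, features = VariableFeatures(pbmc.rna))
```

The index file is downloaded next to the fragment file because tabix expects the .tbi at the same path with ".tbi" appended.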

kumarage commented 1 year ago

@CthangavUCI Did you get this solved? It's been more than a year. When I tried to follow the Seurat tutorial (https://satijalab.org/seurat/articles/atacseq_integration_vignette), I got the same issue. I checked for the specified file and it doesn't exist.
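
One way to check from R whether the fragment file and its index can still be opened, independently of Rsamtools, is to open them as plain URL connections. This is a minimal sketch: url_reachable and check_fragment_urls are hypothetical helpers, not part of Seurat, Signac, or Rsamtools. A TRUE result only shows that the server answered the request, not that scanTabix() will succeed.

```r
# Hypothetical helper: TRUE if the URL can be opened, FALSE on any error
url_reachable <- function(u, timeout = 10) {
  old <- options(timeout = timeout)
  on.exit(options(old), add = TRUE)
  con <- url(u)                                  # created, not yet opened
  on.exit(try(close(con), silent = TRUE), add = TRUE)
  tryCatch({
    suppressWarnings(open(con, open = "rb"))     # errors on DNS failure or HTTP 4xx/5xx
    TRUE
  }, error = function(e) FALSE)
}

# Hypothetical helper: check a fragment file and the .tbi index beside it
check_fragment_urls <- function(frag.url, timeout = 10) {
  targets <- c(fragments = frag.url, index = paste0(frag.url, ".tbi"))
  vapply(targets, url_reachable, logical(1), timeout = timeout)
}

# Usage with the URL from the error message:
# check_fragment_urls("https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz")
```

If both entries come back FALSE while curl::has_internet() is TRUE, the problem is more likely with the remote file than with the local machine.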