GreenleafLab / ArchR

ArchR : Analysis of Regulatory Chromatin in R (www.ArchRProject.com)
MIT License

getValidBarcodes does not work with cellranger-atac version 2.0.0 #1224

Closed: artgolden closed this issue 2 years ago

artgolden commented 2 years ago

Describe the bug
Error in FUN(X[[i]], ...) : cell_id not in colnames of 10X singlecell.csv file! Are you sure inut is correct?
Cellranger-atac version 2.0.0 changed the format of its singlecell.csv files, and there is no cell_id column anymore. Instead, there is a boolean column, is__cell_barcode. Example file: singlecell_example.csv

To Reproduce
Run the getValidBarcodes() function on a 10X singlecell.csv file generated with Cellranger-atac version 2.0.0 or later.

Expected behavior
getValidBarcodes() correctly loads the cell barcodes that have passed the cellranger-atac filters.
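Until a fixed release is available, the expected behavior can be approximated outside ArchR. The following is a minimal sketch in base R, not ArchR's own implementation: `read_valid_barcodes` is a hypothetical helper name, and it assumes that `is__cell_barcode` is coded as 0/1 (or as a logical) and that the older format marks non-cells with `cell_id` equal to "None".

```r
# Hypothetical helper (not part of ArchR): return the cell barcodes from one
# 10x singlecell.csv file, supporting both the old and the new column layout.
read_valid_barcodes <- function(csv_file) {
  df <- read.csv(csv_file, stringsAsFactors = FALSE, check.names = FALSE)
  if ("is__cell_barcode" %in% colnames(df)) {
    # Newer layout (assumption: 1 marks a cell, 0 marks a non-cell).
    keep <- df[["is__cell_barcode"]] == 1
  } else if ("cell_id" %in% colnames(df)) {
    # Older layout (assumption: non-cells carry the string "None").
    keep <- as.character(df[["cell_id"]]) != "None"
  } else {
    stop("Neither is__cell_barcode nor cell_id found in ", csv_file)
  }
  # which() drops any NA comparisons instead of propagating them.
  as.character(df[["barcode"]][which(keep)])
}

# Small synthetic file in the newer layout (made-up values, reduced columns).
demo_new <- tempfile(fileext = ".csv")
writeLines(c(
  "barcode,total,is__cell_barcode",
  "NO_BARCODE,100,0",
  "AAACGAAAGAAATGGG-1,5000,1",
  "AAACGAAAGAAATTCG-1,12,0",
  "AAACGAACAAGCACTT-1,7000,1"
), demo_new)

# Same barcodes in the older layout, for comparison.
demo_old <- tempfile(fileext = ".csv")
writeLines(c(
  "barcode,total,cell_id",
  "NO_BARCODE,100,None",
  "AAACGAAAGAAATGGG-1,5000,_cell_0",
  "AAACGAAAGAAATTCG-1,12,None",
  "AAACGAACAAGCACTT-1,7000,_cell_1"
), demo_old)

valid_new <- read_valid_barcodes(demo_new)
valid_old <- read_valid_barcodes(demo_old)
print(valid_new)
```

The resulting character vector could then be placed in a list named by sample and supplied through the validBarcodes parameter of createArrowFiles(); the exact list shape ArchR expects is an assumption here and should be checked against the function documentation.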

Session Info
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages: [1] BSgenome.Mmusculus.UCSC.mm10_1.4.3 BSgenome_1.62.0
[3] rtracklayer_1.54.0 Biostrings_2.62.0
[5] XVector_0.34.0 forcats_0.5.1
[7] stringr_1.4.0 dplyr_1.0.7
[9] purrr_0.3.4 readr_2.1.1
[11] tidyr_1.1.4 tibble_3.1.6
[13] tidyverse_1.3.1 patchwork_1.1.1
[15] ArchR_1.0.2 magrittr_2.0.1
[17] rhdf5_2.38.0 Matrix_1.3-4
[19] data.table_1.14.2 SummarizedExperiment_1.24.0
[21] Biobase_2.54.0 GenomicRanges_1.46.1
[23] GenomeInfoDb_1.30.0 IRanges_2.28.0
[25] S4Vectors_0.32.3 BiocGenerics_0.40.0
[27] MatrixGenerics_1.6.0 matrixStats_0.61.0
[29] ggplot2_3.3.5

loaded via a namespace (and not attached): [1] bitops_1.0-7 fs_1.5.1 bit64_4.0.5
[4] lubridate_1.7.10 httr_1.4.2 rprojroot_2.0.2
[7] tools_4.1.1 backports_1.2.1 bslib_0.3.1
[10] utf8_1.2.2 R6_2.5.1 DBI_1.1.1
[13] colorspace_2.0-2 rhdf5filters_1.6.0 withr_2.4.3
[16] tidyselect_1.1.1 bit_4.0.4 compiler_4.1.1
[19] cli_3.1.0 rvest_1.0.1 Cairo_1.5-12.2
[22] xml2_1.3.2 DelayedArray_0.20.0 sass_0.4.0
[25] scales_1.1.1 Rsamtools_2.10.0 digest_0.6.29
[28] rmarkdown_2.10 pkgconfig_2.0.3 htmltools_0.5.2
[31] dbplyr_2.1.1 fastmap_1.1.0 rlang_0.4.12
[34] readxl_1.3.1 rstudioapi_0.13 BiocIO_1.4.0
[37] jquerylib_0.1.4 generics_0.1.1 jsonlite_1.7.2
[40] vroom_1.5.7 BiocParallel_1.28.2 RCurl_1.98-1.5
[43] GenomeInfoDbData_1.2.7 Rcpp_1.0.7 munsell_0.5.0
[46] Rhdf5lib_1.16.0 fansi_0.5.0 lifecycle_1.0.1
[49] stringi_1.7.6 yaml_2.2.1 zlibbioc_1.40.0
[52] grid_4.1.1 crayon_1.4.2 lattice_0.20-44
[55] haven_2.4.3 hms_1.1.1 knitr_1.33
[58] pillar_1.6.4 rjson_0.2.20 XML_3.99-0.8
[61] reprex_2.0.1 glue_1.5.1 evaluate_0.14
[64] modelr_0.1.8 vctrs_0.3.8 tzdb_0.2.0
[67] cellranger_1.1.0 gtable_0.3.0 assertthat_0.2.1
[70] xfun_0.25 broom_0.7.9 restfulr_0.0.13
[73] GenomicAlignments_1.30.0 ellipsis_0.3.2

rcorces commented 2 years ago

Hi @artgolden! Thanks for using ArchR! I am currently on paternity leave and will not be responding to any issues or discussion threads. I plan to be back in late January and will do my best to address your issue then.
In the meantime, it is worth noting that there are very few actual bugs in ArchR. If you are getting an error, it is probably something specific to your dataset, usage, or computational environment. Search the previous Issues, Discussions, function definitions, or the ArchR manual and you will likely find the answers you are looking for.
If you are able to solve your issue, please post the solution and close this issue post.
Otherwise, if you would like my help when I return, you must respond to the following questions unless your original post already contained this information:
1. If you've encountered an error, have you already searched previous Issues to make sure that this hasn't already been solved?
2. Can you recapitulate your error using the tutorial code and dataset? If so, provide a reproducible example.
3. Did you post your log file? If not, add it now.

yaxiliu-1996 commented 2 years ago

I also have an issue with createArrowFiles. My input files do not originate from custom scATAC fragments files. I generated fragments files in a similar format from my single-cell data, with the same columns as described on the Cell Ranger website. I have 7886 single cells (barcodes), but the ArrowFiles and ArchRProject recovered only 272 single cells. When I supply the validBarcodes parameter, I encounter the error: "createArrowFiles has encountered an error, checking if any ArrowFiles completed". The log file contains no specific description of the error.

rcorces commented 2 years ago

@yaxiliu-1996 - Please do not post unrelated issues on an existing issue thread. Additionally, since you have not provided any information that would help me help you, I cannot provide a response.

rcorces commented 2 years ago

This issue has now been addressed on dev via https://github.com/GreenleafLab/ArchR/commit/c5b111dfa68c4b74f1b5b4a0dfaafc5981162a51 and is slated for incorporation into release_1.0.3 shortly.

cnluzon commented 10 months ago

Hi, just a comment: this has not been released yet, right? I have ArchR 1.0.2 and I am currently experiencing the exact same issue as the OP.