Open fe4960 opened 1 year ago
Hi @fe4960! Thanks for using ArchR! Please make sure that your post belongs in the Issues section. Only bugs and error reports belong in the Issues section. Usage questions and feature requests should be posted in the Discussions section, not in Issues.
Before we help you, you must respond to the following questions unless your original post already contained this information:
1. If you've encountered an error, have you already searched previous Issues to make sure that this hasn't already been solved?
2. Can you recapitulate your error using the tutorial code and dataset? If so, provide a reproducible example.
3. Did you post your log file? If not, add it now.
4. Remove any screenshots that contain text and instead copy and paste the text using markdown's codeblock syntax (three consecutive backticks). You can do this by editing your original post.
Not all requested information has been supplied. Closing due to inactivity.
Sorry, my above comment was meant for a different post. Please ignore it. I will reply shortly.
It's hard for me to diagnose exactly why this is happening without the ability to reproduce the error. Is it possible for you to provide a reproducible example?
It's possible that there is a bug in the code where, if one of your Arrow files doesn't have any cells represented in the cells param, this error might happen. Could you show me the breakdown of your Arrow files and the number of cells represented in each one?
Thanks a lot for the reply. Could you let me know the command to show the number of cells in the arrow files? Thanks!
Sorry for the delay. I would use table(). For example:
table(ArchRProj@cellColData$Sample)
You'll also need to show the number of cells per sample within the subset of cells you are looking at. For example:
table(ArchRProj@cellColData$Sample[which(getCellNames(ArchRProj) %in% final_cell)])
Thanks for the reply!
The cell number of each sample in the original ArchR object:
> table(proj1@cellColData$Sample)
19_D003_lobe            19_D003_macular         19_D003_macular_NeuN_1
103                     152                     40
19_D003_macular_NeuN_2  19_D003_macular_NeuN_3  19_D005_lobe
46                      56                      97
19_D005_macular         19_D006_lobe            19_D006_macular
333                     90                      396
19_D006_macular_NeuN_1  19_D006_macular_NeuN_2  19_D006_macular_NeuN_3
5                       6                       12
19_D007_lobe            19_D007_macular         19_D008_lobe
139                     176                     144
19_D008_macular         19_D009_lobe            19_D009_macular
148                     161                     213
19_D010_lobe            19_D010_macular         19_D010_macular_NeuN_1
38                      263                     23
19_D010_macular_NeuN_2  19_D010_macular_NeuN_3  19_D011_lobe
13                      14                      82
19_D011_macular         19_D019_lobe            19_D019_macular
479                     129                     227
19_D019_macular_NeuN_1  19_D019_macular_NeuN_2  19_D019_macular_NeuN_3
3                       2                       1
19D013_fovea            19D013_macular          19D014_fovea
188                     547                     388
19D016_macular          D005_13_lobe            D005_13_macular
691                     72                      123
D009_13_lobe            D009_13_macular         D013_13_lobe
39                      232                     136
D013_13_macular         D017_13_lobe            D017_13_macular
357                     57                      180
D018_13_lobe            D018_13_macular         D019_13_lobe
100                     248                     125
D019_13_macular         D021_13_lobe            D021_13_macular
511                     59                      258
D026_13_lobe            D026_13_macular         D027_13_lobe
90                      154                     83
D027_13_macular         D028_13_lobe            D028_13_macular
262                     112                     175
D028_13_macular_NeuN_1  D028_13_macular_NeuN_2  D030_13_lobe
1                       1                       89
D030_13_macular         GSM5567523_Hu5          GSM5567524_Hu7
41                      128                     125
GSM5567533_Hu8
180
The number of cells per sample within the subset of cells:
> table(proj1@cellColData$Sample[which(getCellNames(proj1) %in% final_cell)])
19_D003_lobe            19_D003_macular         19_D003_macular_NeuN_1
103                     152                     40
19_D003_macular_NeuN_2  19_D003_macular_NeuN_3  19_D005_lobe
46                      56                      97
19_D005_macular         19_D006_lobe            19_D006_macular
332                     90                      396
19_D006_macular_NeuN_1  19_D006_macular_NeuN_2  19_D006_macular_NeuN_3
5                       6                       12
19_D007_lobe            19_D007_macular         19_D008_lobe
139                     176                     144
19_D008_macular         19_D009_lobe            19_D009_macular
148                     161                     213
19_D010_lobe            19_D010_macular         19_D010_macular_NeuN_1
38                      262                     23
19_D010_macular_NeuN_2  19_D010_macular_NeuN_3  19_D011_lobe
13                      14                      82
19_D011_macular         19_D019_lobe            19_D019_macular
479                     129                     227
19_D019_macular_NeuN_1  19_D019_macular_NeuN_2  19D013_fovea
3                       2                       188
19D013_macular          19D014_fovea            19D016_macular
547                     388                     691
D005_13_lobe            D005_13_macular         D009_13_lobe
72                      123                     39
D009_13_macular         D013_13_lobe            D013_13_macular
231                     136                     357
D017_13_lobe            D017_13_macular         D018_13_lobe
57                      180                     100
D018_13_macular         D019_13_lobe            D019_13_macular
223                     125                     504
D021_13_lobe            D021_13_macular         D026_13_lobe
58                      258                     90
D026_13_macular         D027_13_lobe            D027_13_macular
153                     83                      262
D028_13_lobe            D028_13_macular         D030_13_lobe
112                     174                     89
D030_13_macular         GSM5567523_Hu5          GSM5567524_Hu7
40                      128                     125
GSM5567533_Hu8
180
The file size of 19_D010_macular_NeuN_1.arrow is 37015914, much smaller than the other Arrow files. I don't know whether this file is causing the error.
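The two tables above can also be compared programmatically rather than by eye. The following is a minimal sketch in base R on toy data; `sample_of_cell` and its values are hypothetical stand-ins for the project's per-cell sample labels, and nothing here calls ArchR itself.

```r
# Toy stand-ins (hypothetical values) for the per-cell sample labels
# and for the vector of cell names being kept (final_cell).
sample_of_cell <- c(c1 = "A", c2 = "A", c3 = "B", c4 = "C", c5 = "C")
final_cell <- c("c1", "c2", "c4")

# Converting to a factor first keeps every sample as a level, so samples
# with no retained cells are reported as 0 instead of disappearing.
all_samples <- factor(sample_of_cell)
kept <- names(sample_of_cell) %in% final_cell

counts <- data.frame(
  sample = levels(all_samples),
  total  = as.vector(table(all_samples)),
  kept   = as.vector(table(all_samples[kept]))
)
counts$dropped <- counts$total - counts$kept

# Samples that would contribute no cells to the subset.
empty_samples <- counts$sample[counts$kept == 0]

print(counts)
print(empty_samples)
```

On the real project, the labels could presumably be built with something like `setNames(as.character(proj1@cellColData$Sample), getCellNames(proj1))`, assuming the `Sample` column converts cleanly to character. In the tables as posted, three samples appear in the full project but not in the subset: 19_D019_macular_NeuN_3, D028_13_macular_NeuN_1 and D028_13_macular_NeuN_2, each with a single cell.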
This does not have to do with what I suggested previously. subsetArchRProject() behaves as expected when you subset and have Arrow files that lack cells etc. Thus far, I'm unable to recapitulate this error on the tutorial data.
I've taken my best guess at a solution, based solely on your error message. That change has been implemented on the dev_idxKeep branch. Please test this out by installing that branch as indicated below. And please report back on the outcome.
devtools::install_github("GreenleafLab/ArchR", ref="dev_idxKeep", repos = BiocManager::repositories(), upgrade = "never")
#to unload a package and reload
detach("package:ArchR", unload=TRUE)
library(ArchR)
Thanks for the help. I installed the "dev_idxKeep" branch. It still shows an error.
proj2 = subsetArchRProject(
  ArchRProj = proj1,
  cells = final_cell,
  outputDirectory = dir,
  dropCells = TRUE)
Copying ArchRProject to new outputDirectory : /storage/chenlab/Users/junwang/human_meta/data/proj4_clean1_HC_final
Copying Arrow Files...
Error in .safelapply(seq_along(inArrows), function(x) { :
Error Found Iteration 1 :
[1] "Error in .h5read(inArrow, h5name)[idxKeep, ] : \n incorrect number of dimensions\n"
<simpleError in .h5read(inArrow, h5name)[idxKeep, ]: incorrect number of dimensions>
Error Found Iteration 2 :
[1] "Error in .h5read(inArrow, h5name)[idxKeep, ] : \n incorrect number of dimensions\n"
<simpleError in .h5read(inArrow, h5name)[idxKeep, ]: incorrect number of dimensions>
Error Found Iteration 3 :
[1] "Error in .h5read(inArrow, h5name)[idxKeep, ] : \n incorrect number of dimensions\n"
<simpleError in .h5read(inArrow, h5name)[idxKeep, ]: incorrect number of dimensions>
Error Found Iteration 4 :
[1] "Error in .h5read(inArrow, h5name)[idxKeep, ] : \n incorrect number of dimensions\n"
<simpleError in .h5read(inArrow, h5name)[idxKeep, ]: incorrect number of dimensions>
Error Found Iteration 5 :
[1] "Error in .h5read(inArrow, h5name)[idxKeep, ] : \n incorrect number of dimensions\n"
<simpleError in .h5read(inArrow, h5name)[idxKeep, ]: in
In addition: Warning message:
In mclapply(..., mc.cores = threads, mc.preschedule = preschedule) :
58 function calls resulted in an error
Sorry, maybe I got the column/row order incorrect. I made another change. Could you re-install dev_idxKeep and try one more time?
devtools::install_github("GreenleafLab/ArchR", ref="dev_idxKeep", repos = BiocManager::repositories(), upgrade = "never")
#to unload a package and reload
detach("package:ArchR", unload=TRUE)
library(ArchR)
I have re-installed the dev_idxKeep branch. It shows the error below. Could you help fix it? Thanks a lot!
proj2 = subsetArchRProject(
  ArchRProj = proj1,
  cells = final_cell,
  outputDirectory = dir,
  dropCells = TRUE)
Copying ArchRProject to new outputDirectory : /storage/chenlab/Users/junwang/human_meta/data/proj4_clean1_HC_final
Copying Arrow Files...
Error in .safelapply(seq_along(inArrows), function(x) { :
Error Found Iteration 1 :
[1] "Error in .h5read(inArrow, h5name)[, idxKeep] : \n incorrect number of dimensions\n"
<simpleError in .h5read(inArrow, h5name)[, idxKeep]: incorrect number of dimensions>
Error Found Iteration 2 :
[1] "Error in .h5read(inArrow, h5name)[, idxKeep] : \n incorrect number of dimensions\n"
<simpleError in .h5read(inArrow, h5name)[, idxKeep]: incorrect number of dimensions>
Error Found Iteration 3 :
[1] "Error in .h5read(inArrow, h5name)[, idxKeep] : \n incorrect number of dimensions\n"
<simpleError in .h5read(inArrow, h5name)[, idxKeep]: incorrect number of dimensions>
Error Found Iteration 4 :
[1] "Error in .h5read(inArrow, h5name)[, idxKeep] : \n incorrect number of dimensions\n"
<simpleError in .h5read(inArrow, h5name)[, idxKeep]: incorrect number of dimensions>
Error Found Iteration 5 :
[1] "Error in .h5read(inArrow, h5name)[, idxKeep] : \n incorrect number of dimensions\n"
<simpleError in .h5read(inArrow, h5name)[, idxKeep]: in
In addition: Warning message:
In mclapply(..., mc.cores = threads, mc.preschedule = preschedule) :
58 function calls resulted in an error
As my blind attempts to fix this have failed, I don't have much more to offer at this time. I think you will need to manually step through the code and figure out what the value of idxKeep is during the iteration where it is failing.
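When stepping through, it may help to know what shape of object produces each of the error messages reported in this thread. The sketch below uses only base R on toy objects; it does not call `.h5read` or any ArchR internals, and the `err_msg` helper and the values of `idxKeep` are hypothetical.

```r
# Hypothetical helper: return the error message of an expression, or NA if none.
err_msg <- function(expr) {
  tryCatch({ expr; NA_character_ }, error = function(e) conditionMessage(e))
}

idxKeep <- c(1, 3)            # hypothetical positions to keep

vec <- c(10, 20, 30)          # plain vector: no dim attribute
mat <- matrix(1:6, nrow = 3)  # 3 x 2 matrix
df  <- data.frame(a = 1:3)    # data frame with a single column

# Two-index subsetting on an object without dimensions fails either way round.
msg_vec_rows <- err_msg(vec[idxKeep, ])
msg_vec_cols <- err_msg(vec[, idxKeep])

# Single-index subsetting on a data frame selects columns, so asking for
# column 3 of a one-column data frame fails.
msg_df <- err_msg(df[idxKeep])

# The same row index works on a real matrix.
ok_mat <- mat[idxKeep, ]

print(c(msg_vec_rows, msg_vec_cols, msg_df))
```

If the object read from the Arrow file behaves like `vec` or `df` here rather than like `mat`, checking `class()` and `dim()` on it alongside `idxKeep` at the failing iteration should make that visible.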
Hello,
Attach your log file
This step doesn't generate a log file.
Describe the bug
I tried to subset an ArchR project with the code below:
proj2 = subsetArchRProject(ArchRProj = proj1, cells = final_cell, outputDirectory = dir, dropCells = TRUE)
It generated the error below and did not complete:
Copying ArchRProject to new outputDirectory : human_meta/data/proj4_clean1_final
Copying Arrow Files...
Error in .safelapply(seq_along(inArrows), function(x) { :
Error Found Iteration 51 :
[1] "Error in `[.data.frame`(.h5read(inArrow, h5name), idxKeep) : \n undefined columns selected\n"
<simpleError in `[.data.frame`(.h5read(inArrow, h5name), idxKeep): undefined columns selected>
Calls: subsetArchRProject ... saveArchRProject -> .copyArrows -> unlist -> .safelapply
In addition: Warning message:
In mclapply(..., mc.cores = threads, mc.preschedule = preschedule) :
1 function calls resulted in an error
Execution halted
I used the latest version of ArchR, v1.0.3, because v1.0.2 generated a different error. The same code with v1.0.3 ran on other datasets without problems. I wonder what causes the error in this dataset and whether you can help fix it. Thanks a lot!
I searched the previous issues and found this error has not been solved before.
Session Info
R version 4.1.0 (2021-05-18)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 grid stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] rhdf5_2.38.1 SummarizedExperiment_1.24.0
[3] Biobase_2.54.0 RcppArmadillo_0.11.0.0.0
[5] Rcpp_1.0.9 Matrix_1.5-3
[7] GenomicRanges_1.46.1 GenomeInfoDb_1.30.1
[9] IRanges_2.28.0 S4Vectors_0.32.4
[11] BiocGenerics_0.40.0 sparseMatrixStats_1.6.0
[13] MatrixGenerics_1.6.0 matrixStats_0.63.0
[15] data.table_1.14.6 stringr_1.5.0
[17] plyr_1.8.8 magrittr_2.0.3
[19] ggplot2_3.4.0 gtable_0.3.1
[21] gtools_3.9.4 gridExtra_2.3
[23] devtools_2.4.4 usethis_2.1.6
[25] ArchR_1.0.3
loaded via a namespace (and not attached):
[1] pkgload_1.3.0 shiny_1.7.1 assertthat_0.2.1
[4] GenomeInfoDbData_1.2.7 remotes_2.4.2 sessioninfo_1.2.2
[7] pillar_1.7.0 lattice_0.20-45 glue_1.6.2
[10] digest_0.6.30 promises_1.2.0.1 XVector_0.34.0
[13] colorspace_2.0-3 htmltools_0.5.2 httpuv_1.6.5
[16] pkgconfig_2.0.3 zlibbioc_1.40.0 purrr_0.3.4
[19] xtable_1.8-4 scales_1.2.1 processx_3.5.3
[22] later_1.3.0 tibble_3.1.6 generics_0.1.2
[25] ellipsis_0.3.2 cachem_1.0.6 withr_2.5.0
[28] cli_3.5.0 crayon_1.5.2 mime_0.12
[31] memoise_2.0.1 ps_1.6.0 fs_1.5.2
[34] fansi_1.0.3 pkgbuild_1.3.1 profvis_0.3.7
[37] tools_4.1.0 prettyunits_1.1.1 lifecycle_1.0.3
[40] Rhdf5lib_1.16.0 munsell_0.5.0 DelayedArray_0.20.0
[43] callr_3.7.0 compiler_4.1.0 rlang_1.0.6
[46] RCurl_1.98-1.9 rhdf5filters_1.6.0 htmlwidgets_1.5.4
[49] miniUI_0.1.1.1 bitops_1.0-7 DBI_1.1.3
[52] R6_2.5.1 dplyr_1.0.8 fastmap_1.1.0
[55] utf8_1.2.2 stringi_1.7.8 parallel_4.1.0
[58] vctrs_0.5.1 tidyselect_1.1.2 urlchecker_1.0.1