Open hchoiHiLung opened 5 months ago
Dear all,
Thanks in advance for the support.
I got almost the same error when attempting to load the lymph node dataset (10x Xenium; https://www.10xgenomics.com/datasets/preview-data-xenium-prime-gene-expression):
A structured Xenium directory will be used
Checking directory contents...
analysis info found
└──analysis.tar.gz
└──analysis.zarr.zip
└──analysis_summary.html
boundary info found
└──cell_boundaries.csv.gz
└──cell_boundaries.parquet
└──nucleus_boundaries.csv.gz
└──nucleus_boundaries.parquet
cell feature matrix found
└──cell_feature_matrix
└──cell_feature_matrix.h5
└──cell_feature_matrix.zarr.zip
cell metadata found
└──cells.csv.gz
└──cells.parquet
└──cells.zarr.zip
image info found
└──morphology.ome.tif
panel metadata found
└──gene_panel.json
raw transcript info found
└──transcripts.parquet
└──transcripts.zarr.zip
experiment info (.xenium) found
└──experiment.xenium
Directory check done
Loading feature metadata...
Loading transcript level info...
Error in path_list$tx_path[[1]] : subscript out of bounds
Do you have any suggestions to overcome this error?
Best,
Domenico
sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22631)
Matrix products: default
locale:
[1] LC_COLLATE=Italian_Italy.utf8 LC_CTYPE=Italian_Italy.utf8 LC_MONETARY=Italian_Italy.utf8 LC_NUMERIC=C LC_TIME=Italian_Italy.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] future_1.33.2 reticulate_1.35.0 Giotto_4.0.5 GiottoClass_0.2.3
loaded via a namespace (and not attached):
[1] Rcpp_1.0.11 locfit_1.5-9.8 lattice_0.20-45 listenv_0.9.1 png_0.1-8 gtools_3.9.5 digest_0.6.35 SingleCellExperiment_1.20.1
[9] utf8_1.2.4 parallelly_1.37.1 R6_2.5.1 GenomeInfoDb_1.34.9 backports_1.4.1 stats4_4.2.2 ggplot2_3.5.0 pillar_1.9.0
[17] sparseMatrixStats_1.10.0 GiottoVisuals_0.1.6 zlibbioc_1.44.0 rlang_1.1.2 rstudioapi_0.16.0 data.table_1.15.4 magick_2.8.3 S4Vectors_0.36.2
[25] R.utils_2.12.3 R.oo_1.26.0 Matrix_1.6-5 checkmate_2.3.1 BiocParallel_1.32.6 RCurl_1.98-1.14 munsell_0.5.1 beachmat_2.14.2
[33] DelayedArray_0.24.0 HDF5Array_1.26.0 compiler_4.2.2 DropletUtils_1.18.1 pkgconfig_2.0.3 BiocGenerics_0.44.0 globals_0.16.3 tidyselect_1.2.1
[41] SummarizedExperiment_1.28.0 tibble_3.2.1 GenomeInfoDbData_1.2.9 edgeR_3.40.2 IRanges_2.32.0 codetools_0.2-18 matrixStats_1.1.0 fansi_1.0.6
[49] withr_3.0.0 dplyr_1.1.4 bitops_1.0-7 rhdf5filters_1.10.1 R.methodsS3_1.8.2 grid_4.2.2 jsonlite_1.8.8 gtable_0.3.5
[57] lifecycle_1.0.4 magrittr_2.0.3 scales_1.3.0 dqrng_0.3.2 cli_3.6.2 scuttle_1.8.4 XVector_0.38.0 SpatialExperiment_1.8.1
[65] limma_3.54.2 generics_0.1.3 DelayedMatrixStats_1.20.0 vctrs_0.6.5 colorRamp2_0.1.0 Rhdf5lib_1.20.0 rjson_0.2.21 tools_4.2.2
[73] Biobase_2.58.0 glue_1.6.2 purrr_1.0.2 GiottoUtils_0.1.6 MatrixGenerics_1.10.0 parallel_4.2.2 colorspace_2.1-0 rhdf5_2.42.1
For anyone else who encounters this error: I have found the problem. It was my mistake and is not related to Giotto Suite. The file 'transcripts.csv.gz' was missing from the 10x folder; only 'transcripts.parquet' was present, which caused the error.
However, you can create this file using the following code:
###################################################
library(arrow)

# Path to the transcripts.parquet file in the Xenium output folder
PATH <- 'add your path'
# Same file name with a .csv extension (note the doubled backslash in the regex)
OUTPUT <- gsub('\\.parquet$', '.csv', PATH)
CHUNK_SIZE <- 1e6

# Read as an Arrow Table instead of a data.frame
parquet_file <- arrow::read_parquet(PATH, as_data_frame = FALSE)
start <- 0
# Write the table to CSV in chunks of CHUNK_SIZE rows
while (start < parquet_file$num_rows) {
  end <- min(start + CHUNK_SIZE, parquet_file$num_rows)
  chunk <- as.data.frame(parquet_file$Slice(start, end - start))
  data.table::fwrite(chunk, OUTPUT, append = start != 0)
  start <- end
}

# Compress the result to .csv.gz if R.utils is installed
if (require('R.utils', quietly = TRUE)) {
  R.utils::gzip(OUTPUT)
}
###################################################
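To confirm that a missing 'transcripts.csv.gz' is really the cause before converting anything, a small check like the one below may help. It is only a sketch in base R: `check_xenium_transcripts` is a hypothetical helper and not part of Giotto, and the two file names are taken from the directory listing and explanation above.

```r
# Hypothetical helper (not a Giotto function): report which transcript
# files are present in a Xenium output directory.
check_xenium_transcripts <- function(xenium_dir) {
  candidates <- c(
    csv     = "transcripts.csv.gz",
    parquet = "transcripts.parquet"
  )
  present <- file.exists(file.path(xenium_dir, candidates))
  names(present) <- names(candidates)
  if (!present[["csv"]] && present[["parquet"]]) {
    message("Only transcripts.parquet found; transcripts.csv.gz is missing.")
  }
  present
}

# Demo on a throwaway directory that contains only the parquet file
demo_dir <- file.path(tempdir(), "xenium_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "transcripts.parquet"))
status <- check_xenium_transcripts(demo_dir)
```

On a real dataset, pass the path to the Xenium output folder instead of `demo_dir`; if the `csv` entry is FALSE and `parquet` is TRUE, the conversion above applies.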
Best,
Domenico
Describe the Error
Whenever data_to_use is set to "aggregate", createGiottoXeniumObject returns the following error. The breast cancer dataset in the vignette doesn't work either. "subcellular" works without problems. (A side note: data_to_use does not accept "all" as specified in the documentation.)
Error Message
System Information