edward130603 / BayesSpace

Bayesian model for clustering and enhancing the resolution of spatial gene expression experiments.
http://edward130603.github.io/BayesSpace

Backward-incompatible changes introduced by spaceranger v2 #122

Closed AmelZulji closed 2 months ago

AmelZulji commented 2 months ago

Problem

  1. Loading Visium data with readVisium() fails
  2. Coordinate mismatches (image row and column swapped) occasionally occur

Reprex

suppressPackageStartupMessages({
  library(stringr)
  library(purrr)
  library(BayesSpace)
})
#> Warning: package 'GenomeInfoDb' was built under R version 4.3.3

Download data

dir.create("data")
file_path <- c(
  "https://cf.10xgenomics.com/samples/spatial-exp/2.0.0/CytAssist_11mm_FFPE_Human_Ovarian_Carcinoma/CytAssist_11mm_FFPE_Human_Ovarian_Carcinoma_spatial.tar.gz",
  "https://cf.10xgenomics.com/samples/spatial-exp/2.0.0/CytAssist_11mm_FFPE_Human_Ovarian_Carcinoma/CytAssist_11mm_FFPE_Human_Ovarian_Carcinoma_filtered_feature_bc_matrix.h5",
  "https://cf.10xgenomics.com/samples/spatial-exp/2.0.0/CytAssist_11mm_FFPE_Human_Ovarian_Carcinoma/CytAssist_11mm_FFPE_Human_Ovarian_Carcinoma_filtered_feature_bc_matrix.tar.gz"
)

file_name <- str_extract(file_path, pattern = "(?<=Carcinoma_).*")
walk2(file_path, file_name, \(x,y) download.file(url = x, destfile = paste0("data/", y)))

untar("data/filtered_feature_bc_matrix.tar.gz", exdir = "data/")
untar("data/spatial.tar.gz", exdir = "data/")

Try to read the Visium data

readVisium("data/")
#> Warning in file(file, "rt"): cannot open file
#> 'data//spatial/tissue_positions_list.csv': No such file or directory
#> Error in file(file, "rt"): cannot open the connection

It seems the problem is that tissue_positions_list.csv has been renamed to tissue_positions.csv:

fs::dir_tree(path = "data/spatial/")
#> data/spatial/
#> ├── aligned_fiducials.jpg
#> ├── aligned_tissue_image.jpg
#> ├── cytassist_image.tiff
#> ├── detected_tissue_image.jpg
#> ├── scalefactors_json.json
#> ├── spatial_enrichment.csv
#> ├── tissue_hires_image.png
#> ├── tissue_lowres_image.png
#> └── tissue_positions.csv

However, the problem persists even after renaming the file, now with a different error (perhaps the column names were changed as well; I don't know what the file looked like in previous versions):

file.rename("data/spatial/tissue_positions.csv", to = "data/spatial/tissue_positions_list.csv")
#> [1] TRUE
fs::dir_tree(path = "data/spatial/")
#> data/spatial/
#> ├── aligned_fiducials.jpg
#> ├── aligned_tissue_image.jpg
#> ├── cytassist_image.tiff
#> ├── detected_tissue_image.jpg
#> ├── scalefactors_json.json
#> ├── spatial_enrichment.csv
#> ├── tissue_hires_image.png
#> ├── tissue_lowres_image.png
#> └── tissue_positions_list.csv
readVisium("data/")
#> Error in .subscript.2ary(x, , j, drop = TRUE): subscript out of bounds
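One possible explanation for the second error is that the v2 file gained a header row that the older file did not have. The sketch below is a hypothetical helper (read_tissue_positions() is not part of BayesSpace) that checks for either file name and detects a header by looking at the first line. The column names are my assumption of the v2 layout and should be checked against the actual file.

```r
# Hypothetical helper, not part of BayesSpace: read spot positions from
# either the old or the new Space Ranger file name, with or without a header.
read_tissue_positions <- function(spatial_dir) {
  # Assumed column layout; verify against your own tissue_positions.csv.
  cols <- c("barcode", "in_tissue", "array_row", "array_col",
            "pxl_row_in_fullres", "pxl_col_in_fullres")

  candidates <- file.path(
    spatial_dir, c("tissue_positions.csv", "tissue_positions_list.csv")
  )
  path <- candidates[file.exists(candidates)][1]
  if (is.na(path)) {
    stop("No tissue positions file found in ", spatial_dir)
  }

  # Treat the first line as a header only if it starts with "barcode".
  has_header <- grepl("^barcode", readLines(path, n = 1))
  pos <- read.csv(path, header = has_header, stringsAsFactors = FALSE)

  stopifnot(ncol(pos) == length(cols))
  names(pos) <- cols
  pos
}
```

Because the header is detected from the file content rather than the file name, this also covers the case above where the new file was simply renamed to the old name.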

One way around the problem is to load the files manually or, as a workaround, to use Seurat (which takes care of loading the data) and construct the SCE object from there:

suppressPackageStartupMessages({
  library(Seurat)
  library(hdf5r)
})

seu <- Load10X_Spatial("data/")

# use the loaded data to construct the SCE object
sce <- SingleCellExperiment(
  assays = list(counts = as(GetAssayData(seu, assay = "Spatial", slot = "counts"),"dgCMatrix")),
  colData = seu@images$slice1@coordinates,
  rowData = data.frame(row.names = rownames(seu), gene_name = rownames(seu))
  )
#> Warning: The `slot` argument of `GetAssayData()` is deprecated as of SeuratObject 5.0.0.
#> ℹ Please use the `layer` argument instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.

However, the problem described in https://github.com/edward130603/BayesSpace/issues/77 sometimes occurs (though not in this sample):

plot(sce$row, sce$imagerow)

plot(sce$col, sce$imagecol)

Is there a programmatic way to check whether the mismatch is occurring? Currently I use the following check, but I was wondering whether it could be caught somewhere upstream.

if (!cor(sce$row, sce$imagerow) > 0.98) {
  img_row <- sce$imagerow
  img_col <- sce$imagecol
  sce$imagerow <- img_col
  sce$imagecol <- img_row
}
#> Error in cor(sce$row, sce$imagerow): 'x' must be numeric
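The error above suggests that the coordinate columns arrived as non-numeric values (for example character). A minimal sketch of a more defensive version of the same check follows. fix_swapped_pixel_coords() is a hypothetical helper, not a BayesSpace or Seurat function, and comparing the two absolute correlations is a heuristic I am assuming, not a documented method. It works on a plain data.frame with the columns row, col, imagerow and imagecol.

```r
# Hypothetical helper, not part of BayesSpace or Seurat.
# `coords` is a data.frame with columns row, col, imagerow, imagecol.
fix_swapped_pixel_coords <- function(coords) {
  # Coerce defensively: the columns may arrive as character or factor.
  to_num <- function(x) as.numeric(as.character(x))
  num <- lapply(coords[c("row", "col", "imagerow", "imagecol")], to_num)

  # Use absolute correlations so a mirrored image is not mistaken for a swap.
  straight <- abs(cor(num$row, num$imagerow))
  crossed  <- abs(cor(num$row, num$imagecol))
  swapped  <- isTRUE(crossed > straight)

  coords$row <- num$row
  coords$col <- num$col
  coords$imagerow <- if (swapped) num$imagecol else num$imagerow
  coords$imagecol <- if (swapped) num$imagerow else num$imagecol

  list(coords = coords, swapped = swapped)
}

# Toy example: a 10 x 10 grid whose pixel columns are deliberately swapped
# and whose array coordinates are stored as character.
grid <- expand.grid(col = 0:9, row = 0:9)
bad <- data.frame(
  row = as.character(grid$row),
  col = as.character(grid$col),
  imagerow = 80 + 30 * grid$col,
  imagecol = 100 + 50 * grid$row
)
res <- fix_swapped_pixel_coords(bad)
res$swapped
```

For an SCE object, the same helper could be applied to as.data.frame(colData(sce)) and the corrected columns written back, assuming the colData uses these four column names.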

Created on 2024-04-16 with reprex v2.1.0

edward130603 commented 2 months ago

Can you try the version on GitHub? I believe these issues are fixed there.

devtools::install_github("edward130603/BayesSpace")

Will be pushing these changes to Bioconductor in the near future.

AmelZulji commented 2 months ago

Thank you, Edward!

I can confirm that readVisium() works as expected.

Regards, Amel

AmelZulji commented 2 months ago

Just to add: by default, spaceranger v2 runs with the --reorient-images flag. If the image is not in the orientation spaceranger expects (hourglass fiducial in the top left + possible mirror orientation), spaceranger will reorient the image, which may cause the coordinate mismatch mentioned above.