prabhakarlab / Banksy

BANKSY: spatial clustering
https://prabhakarlab.github.io/Banksy

Strange staggering behavior #36

Closed jwangbio closed 2 months ago

jwangbio commented 2 months ago

Hello Banksy authors,

Congratulations on this work; I was really impressed by the results I have seen so far. I would like to keep working with Banksy, but I have run into a strange staggering behavior. I am calling the RunBanksy function with the X/Y coordinates explicitly defined in a Seurat object, and because I am examining multiple slides together, I set the grouping variable to 'Slide_ID', which should separate the slides. However, the staggering is incorrect: not all of the cells in the same slide are shifted together. I can work around that by creating the staggered coordinates manually, but I also see diagonal stripes, which I worry would affect downstream analyses. I do not see these stripes with other domain-finding packages I have used. Has your team seen this before? I would appreciate any advice you might have on mitigating it.

I have attached a PNG of what I am seeing, color-coded by 'Slide_ID'. Thank you! I am on the newest Banksy version (v0.1.6) and haven't encountered any other errors.

Best,
Jerry

[Image: BANKSY_problem]
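Manual staggering of the kind mentioned above can be sketched in base R. This is only an illustration under stated assumptions: `stagger_coords` and its `gap` argument are hypothetical names, not part of Banksy, and the toy data frame stands in for the real cell metadata. The point it shows is that each slide's cells are selected by label, so the result does not depend on the order of the cells.

```r
# Hypothetical helper (not part of Banksy): shift each group's x coordinates
# so that slides sit side by side without overlapping.
stagger_coords <- function(x, y, group, gap = 1) {
  stopifnot(length(x) == length(y), length(x) == length(group))
  out_x <- x
  offset <- 0
  for (g in unique(group)) {
    idx <- which(group == g)                 # cells of this slide, wherever they sit
    out_x[idx] <- x[idx] - min(x[idx]) + offset
    offset <- max(out_x[idx]) + gap          # the next slide starts after this one
  }
  data.frame(x = out_x, y = y, group = group)
}

# Toy data with interleaved (unsorted) slide labels
toy <- data.frame(
  x = c(0, 5, 1, 6, 2, 7),
  y = c(0, 0, 1, 1, 2, 2),
  Slide_ID = c("A", "B", "A", "B", "A", "B")
)
staggered <- stagger_coords(toy$x, toy$y, toy$Slide_ID, gap = 10)

# Every slide should occupy its own, non-overlapping x range
ranges <- tapply(staggered$x, staggered$group, range)
print(ranges)
```

Plotting `staggered$x` against `staggered$y`, colored by `group`, is a quick way to confirm that no slide is split across offsets.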

vipulsinghal02 commented 2 months ago

Hi Jerry,

Would you be able to share a minimal reproducible example (code, plus any synthetic data needed to reproduce the error)? We can then determine what is going wrong. We have never seen a problem like this in multi-sample analysis, and from the image it looks like both the sample IDs and the per-sample coordinate offsets are getting mixed up.

Thanks! Vipul

jwangbio commented 2 months ago

Thanks for the quick response Vipul!

Please see the attached for a reproducible example. The BANKSY assay is already stored in the shared Seurat object. I have a sanitized version of the Seurat object accessible here: https://ucsf.box.com/s/on7kofxsrf8di1logtswxqy1sx1zlvbv

RunBanksy(synthetic, lambda = 0.2, assay = 'RNA', slot = 'counts',
                   dimx = 'x', dimy = 'y', features = 'all',
                   group = 'Section_ID', split.scale = TRUE, k_geom = 10)

sessionInfo()

R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.3 LTS

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] Matrix_1.7-0 Banksy_0.1.6 tidyr_1.3.1 RANN_2.6.1
[5] dplyr_1.1.4 tibble_3.2.1 pals_1.8 gridExtra_2.3
[9] ggplot2_3.5.1 SeuratWrappers_0.3.5 ssHippo.SeuratData_3.1.4 SeuratData_0.2.2.9001
[13] Seurat_5.1.0 SeuratObject_5.0.2 sp_2.1-4

loaded via a namespace (and not attached):
[1] RcppHungarian_0.3 RcppAnnoy_0.0.22 splines_4.4.1
[4] later_1.3.2 bitops_1.0-7 R.oo_1.26.0
[7] polyclip_1.10-6 fastDummies_1.7.3 lifecycle_1.0.4
[10] doParallel_1.0.17 globals_0.16.3 lattice_0.22-6
[13] MASS_7.3-60.2 magrittr_2.0.3 plotly_4.10.4
[16] remotes_2.5.0 httpuv_1.6.15 sctransform_0.4.1
[19] spam_2.10-0 spatstat.sparse_3.1-0 reticulate_1.38.0
[22] cowplot_1.1.3 mapproj_1.2.11 pbapply_1.7-2
[25] RColorBrewer_1.1-3 maps_3.4.2 abind_1.4-5
[28] zlibbioc_1.48.2 Rtsne_0.17 GenomicRanges_1.54.1
[31] purrr_1.0.2 R.utils_2.12.3 BiocGenerics_0.48.1
[34] RCurl_1.98-1.14 rappdirs_0.3.3 circlize_0.4.16
[37] GenomeInfoDbData_1.2.11 IRanges_2.36.0 S4Vectors_0.40.2
[40] ggrepel_0.9.5 irlba_2.3.5.1 listenv_0.9.1
[43] spatstat.utils_3.0-5 goftest_1.2-3 RSpectra_0.16-1
[46] spatstat.random_3.2-3 fitdistrplus_1.1-11 parallelly_1.37.1
[49] leiden_0.4.3.1 codetools_0.2-20 DelayedArray_0.28.0
[52] tidyselect_1.2.1 shape_1.4.6.1 farver_2.1.2
[55] matrixStats_1.3.0 stats4_4.4.1 spatstat.explore_3.2-7
[58] jsonlite_1.8.8 GetoptLong_1.0.5 progressr_0.14.0
[61] ggridges_0.5.6 ggalluvial_0.12.5 survival_3.6-4
[64] iterators_1.0.14 foreach_1.5.2 dbscan_1.2-0
[67] progress_1.2.3 tools_4.4.1 ica_1.0-3
[70] Rcpp_1.0.12 glue_1.7.0 SparseArray_1.2.4
[73] MatrixGenerics_1.16.0 GenomeInfoDb_1.38.8 withr_3.0.0
[76] BiocManager_1.30.23 fastmap_1.2.0 fansi_1.0.6
[79] digest_0.6.36 rsvd_1.0.5 R6_2.5.1
[82] mime_0.12 colorspace_2.1-0 scattermore_1.2
[85] sccore_1.0.5 tensor_1.5 dichromat_2.0-0.1
[88] spatstat.data_3.1-2 R.methodsS3_1.8.2 utf8_1.2.4
[91] generics_0.1.3 data.table_1.15.4 prettyunits_1.2.0
[94] httr_1.4.7 htmlwidgets_1.6.4 S4Arrays_1.2.1
[97] uwot_0.2.2 pkgconfig_2.0.3 gtable_0.3.5
[100] ComplexHeatmap_2.18.0 lmtest_0.9-40 XVector_0.42.0
[103] htmltools_0.5.8.1 dotCall64_1.1-1 clue_0.3-65
[106] scales_1.3.0 Biobase_2.62.0 png_0.1-8
[109] rstudioapi_0.16.0 reshape2_1.4.4 rjson_0.2.21
[112] nlme_3.1-164 zoo_1.8-12 GlobalOptions_0.1.2
[115] stringr_1.5.1 KernSmooth_2.23-24 parallel_4.4.1
[118] miniUI_0.1.1.1 pillar_1.9.0 grid_4.4.1
[121] vctrs_0.6.5 promises_1.3.0 xtable_1.8-4
[124] cluster_2.1.6 cli_3.6.3 compiler_4.4.1
[127] rlang_1.1.4 crayon_1.5.3 future.apply_1.11.2
[130] labeling_0.4.3 mclust_6.1.1 plyr_1.8.9
[133] stringi_1.8.4 viridisLite_0.4.2 deldir_2.0-4
[136] munsell_0.5.1 lazyeval_0.2.2 spatstat.geom_3.2-9
[139] RcppHNSW_0.6.0 hms_1.1.3 patchwork_1.2.0
[142] future_1.33.2 shiny_1.8.1.1 SummarizedExperiment_1.32.0
[145] ROCR_1.0-11 leidenAlg_1.1.3 igraph_2.0.3

Thank you for taking a look into it for me!

jleechung commented 2 months ago

Hi @jwangbio, thanks for flagging this issue. The bug occurs when the samples in the SeuratObject are not ordered by the grouping variable. I've pushed a fix for this: re-install our fork of SeuratWrappers with remotes::install_github('jleechung/seurat-wrappers@feat-aft'). Once re-installed, you should be able to run your code as-is.

Alternatively, if you want to use satijalab/seurat-wrappers, you can also just order your count matrix and metadata by the grouping variable before creating your SeuratObject. Something like this should work:

# assume metadata is a d.f. with your grouping variable as a column
ord = order(metadata$Section_ID)
counts = counts[, ord]
metadata = metadata[ord,]
seu = CreateSeuratObject(counts = counts, meta.data = metadata)
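
To check whether a dataset is affected before reordering, something along these lines may help. It is a base R sketch on toy data: `is_ordered` is a hypothetical helper, and the small `counts` matrix and `metadata` data frame are stand-ins for the real inputs, assuming the columns of `counts` correspond row-for-row to `metadata`.

```r
# Toy stand-ins for the real inputs (assumption: one metadata row per count column)
counts <- matrix(1:12, nrow = 2,
                 dimnames = list(c("g1", "g2"), paste0("cell", 1:6)))
metadata <- data.frame(
  Section_ID = c("s2", "s1", "s2", "s1", "s3", "s1"),
  row.names = colnames(counts)
)

# Hypothetical helper: TRUE if the grouping variable is already in sorted order
is_ordered <- function(group) !is.unsorted(as.character(group))

needs_reorder <- !is_ordered(metadata$Section_ID)

# Reorder counts and metadata together so they stay in sync
ord <- order(metadata$Section_ID)
counts <- counts[, ord]
metadata <- metadata[ord, , drop = FALSE]  # drop = FALSE keeps a one-column data frame intact

# Sanity checks before building the Seurat object
stopifnot(identical(colnames(counts), rownames(metadata)))
stopifnot(is_ordered(metadata$Section_ID))
```

If `needs_reorder` is `FALSE` for the real data, the ordering issue described here is probably not the cause.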

jwangbio commented 2 months ago

Hello @jleechung,

Thank you for your help in diagnosing the issue. I re-ran it using your newest fork and ran into this error:

Error in get_locs(object, dimx, dimy, dimz, ndim, data_own, group, verbose) : 
  object 'seu' not found

I reviewed your latest push, and I think you might need to replace seu with object inside the function. As a workaround, I renamed my own object to seu, which got it to run, and the staggering issue is now corrected. Thank you!
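
For anyone hitting the same message, the pattern behind it can be reproduced in plain R. This is a hypothetical illustration, not the actual SeuratWrappers code: `get_locs_buggy` and `get_locs_fixed` are made-up stand-ins that only mimic a function body referring to `seu` while its argument is named `object`.

```r
# Hypothetical stand-in for the bug: the argument is `object`,
# but the body refers to `seu`, which R then looks up outside the function.
get_locs_buggy <- function(object) {
  seu$coords
}

# Corrected version: uses the argument that was actually passed in
get_locs_fixed <- function(object) {
  object$coords
}

my_data <- list(coords = c(1, 2, 3))

# Fails with an "object not found" error unless a global `seu` happens to exist
res <- tryCatch(get_locs_buggy(my_data),
                error = function(e) conditionMessage(e))

# Works regardless of what the caller named the object
fixed <- get_locs_fixed(my_data)

# Naming the global object `seu` makes the buggy version run, but only by accident
seu <- my_data
worked <- get_locs_buggy(my_data)
```

This is why renaming the object to `seu` works as a stopgap while the function itself still needs the fix.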

jleechung commented 2 months ago

Thanks again @jwangbio, this should be fixed now!