Strange staggering behavior #36

Closed: jwangbio closed this issue 2 months ago

jwangbio commented 2 months ago

Hello Banksy authors,

Congratulations on this work; I have been really impressed by the results I have seen so far. I would like to work further with Banksy, but in my investigations I have found a strange staggering behavior. I am using the RunBanksy function with the X/Y coordinates explicitly defined in a Seurat object, and because I am examining multiple slides together, I set the grouping variable to 'Slide_ID', which should allow my slides to be separated out. However, the staggering is incorrect: not all of the cells in the same slide are being lifted over. I manually created my own staggered coordinates, which is not a problem, but I also noticed diagonal stripes, which I worry would affect downstream analyses. I do not see these stripes with other domain-finding packages I have used. Has this been noted before by your team? I would appreciate any advice you might have on mitigating this.

I have attached a PNG of what I am seeing, color-coded by 'Slide_ID'. Thank you! I am also on the newest Banksy version (v0.1.6) and haven't encountered any other errors.

Best,
Jerry

[Attached image: BANKSY_problem]
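
For readers who also want to stagger coordinates by hand, as described above, here is a minimal base-R sketch. It assumes a metadata table with 'Slide_ID', 'x' and 'y' columns; the toy values, the helper name stagger_x, and the gap size are all hypothetical and are not part of Banksy. It shifts each slide along x so that slides sit side by side, and it does not depend on the row order of the table.

```r
# Hypothetical toy metadata: two slides whose local coordinates overlap,
# with rows deliberately interleaved across slides.
meta <- data.frame(
  Slide_ID = c("A", "B", "A", "B", "A"),
  x = c(0, 1, 5, 4, 10),
  y = c(0, 2, 3, 1, 7)
)

# Illustrative helper (not a Banksy function): shift each slide along x so
# that slides are placed side by side, separated by a fixed gap.
stagger_x <- function(x, group, gap = 100) {
  out <- numeric(length(x))
  offset <- 0
  for (g in unique(group)) {
    idx <- group == g                       # logical index, order-independent
    out[idx] <- x[idx] - min(x[idx]) + offset
    offset <- max(out[idx]) + gap           # next slide starts after a gap
  }
  out
}

meta$x_staggered <- stagger_x(meta$x, meta$Slide_ID)
```

Because the offsets are assigned through a logical index per slide, cells from the same slide always receive the same shift, even when the rows are interleaved.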

vipulsinghal02 commented 2 months ago

Hi Jerry,

Would you be able to share a minimal reproducible example (code, plus any synthetic data needed to reproduce the error)? We can then determine what is going wrong. We have never seen a problem like this in multi-sample analysis, and from the image it looks like both the sample IDs and the per-sample coordinate offsets are getting mixed up.

Thanks! Vipul

jwangbio commented 2 months ago

Thanks for the quick response, Vipul!

Please see the attached for a reproducible example. The BANKSY assay is already stored in the shared Seurat object. I have a sanitized version of the Seurat object accessible here:

RunBanksy(synthetic, lambda = 0.2, assay = 'RNA', slot = 'counts',
          dimx = 'x', dimy = 'y', features = 'all',
          group = 'Section_ID', split.scale = TRUE, k_geom = 10)


Thank you for taking a look into it for me!

jleechung commented 2 months ago

Hi @jwangbio, thanks for flagging this issue. The bug is caused by samples in the SeuratObject not being ordered according to the grouping variable. I've pushed a fix for this: re-install our fork of SeuratWrappers with remotes::install_github('jleechung/seurat-wrappers@feat-aft'). Once re-installed, you should be able to run your code as-is.

Alternatively, if you want to use satijalab/seurat-wrappers, you can also just order your count matrix and metadata by the grouping variable before creating your SeuratObject. Something like this should work:

# assume metadata is a data.frame with your grouping variable as a column
ord = order(metadata$Section_ID)
counts = counts[, ord]
metadata = metadata[ord,]
seu = CreateSeuratObject(counts = counts, meta.data = metadata)
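
To make the reordering step concrete, the following base-R sketch runs it on toy data and checks the result before any Seurat object would be created. The toy counts, the section labels, and the helper is_grouped are hypothetical; the check assumes that what matters is that each section's cells form one contiguous block and that the count columns still match the metadata rows.

```r
# Toy data: 3 genes x 6 cells, with sections interleaved (hypothetical values).
set.seed(1)
counts <- matrix(rpois(18, lambda = 5), nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6)))
metadata <- data.frame(
  Section_ID = c("S2", "S1", "S2", "S1", "S3", "S1"),
  row.names = colnames(counts)
)

# Illustrative helper: TRUE if each group's cells form one contiguous block.
is_grouped <- function(g) !any(duplicated(rle(as.character(g))$values))

is_grouped(metadata$Section_ID)  # FALSE: sections are interleaved

# Reorder cells by the grouping variable, keeping counts and metadata in sync.
ord <- order(metadata$Section_ID)
counts <- counts[, ord]
metadata <- metadata[ord, , drop = FALSE]

# Sanity checks before building the object.
stopifnot(
  is_grouped(metadata$Section_ID),
  identical(colnames(counts), rownames(metadata))
)
```

The drop = FALSE argument keeps metadata as a data.frame even when it has a single column, so its row names (the cell names) survive the reordering.
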

jwangbio commented 2 months ago

Hello @jleechung,

Thank you for your help in diagnosing the issue. I re-ran my code using your newest fork and ran into this error:

Error in get_locs(object, dimx, dimy, dimz, ndim, data_own, group, verbose) : 
  object 'seu' not found

I reviewed your latest push and I think you might need to replace seu with object. As a workaround, I renamed my object to seu, which got it to work, and the staggering issue is now corrected. Thank you!
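
For anyone hitting the same message, here is a hypothetical base-R illustration of this class of bug and of why the rename workaround helps. The two functions below are stand-ins written for this example and are not the actual SeuratWrappers code: the buggy one refers to a variable named seu instead of its own object argument, so it only works when a variable called seu happens to exist in the calling session.

```r
# Hypothetical stand-ins; NOT the actual SeuratWrappers implementation.
get_locs_buggy <- function(object, dimx, dimy) {
  # Bug: uses `seu`, which is not an argument of this function.
  seu[, c(dimx, dimy)]
}

get_locs_fixed <- function(object, dimx, dimy) {
  # Fix: use the `object` argument that was actually passed in.
  object[, c(dimx, dimy)]
}

cells <- data.frame(x = c(1, 2), y = c(3, 4), n = c(10, 20))

# With no `seu` defined anywhere, the buggy version fails.
res <- tryCatch(get_locs_buggy(cells, "x", "y"),
                error = function(e) conditionMessage(e))

# Workaround: once a global `seu` exists, R's scoping rules find it.
seu <- cells
workaround <- get_locs_buggy(cells, "x", "y")
fixed <- get_locs_fixed(cells, "x", "y")
```

The workaround is fragile because the function silently reads whatever is stored in seu, which may not be the object that was passed in; the fixed version does not depend on any global variable.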

jleechung commented 2 months ago

Thanks again @jwangbio, this should be fixed now!