Closed lcolladotor closed 1 year ago
Note that it's not as simple as changing
xyz <- file.path(rep(dir, each = length(sids)),
sprintf("tissue_positions%s.csv",
rep(c("", "_list"), length(sids))))
to
xyz <- file.path(dir,
sprintf("tissue_positions%s.csv",
rep(c("", "_list"), length(sids))))
(creates a vector of length 60 prior to file.exists()
; aka number of samples * 2)
which, although it has the right length, it's still on a different order:
> xyz <- xyz[file.exists(xyz)]
> options(width = 300)
> data.frame(basename(gsub("/outs/spatial/tissue_positions_list.csv", "", xyz)), names(samples))
basename.gsub...outs.spatial.tissue_positions_list.csv....... names.samples.
1 DLPFC_Br2743_mid_manual_alignment_extra_reads DLPFC_Br2743_ant_manual_alignment
2 DLPFC_Br3942_ant_manual_alignment DLPFC_Br2743_mid_manual_alignment_extra_reads
3 DLPFC_Br3942_post_manual_alignment DLPFC_Br2743_post_manual_alignment
4 DLPFC_Br6423_mid_manual_alignment DLPFC_Br3942_ant_manual_alignment
5 DLPFC_Br8492_ant_manual_alignment DLPFC_Br3942_mid_manual_alignment
6 DLPFC_Br8492_post_manual_alignment DLPFC_Br3942_post_manual_alignment
7 DLPFC_Br2720_mid_manual_alignment DLPFC_Br6423_ant_manual_alignment_extra_reads
8 DLPFC_Br6432_ant_2 DLPFC_Br6423_mid_manual_alignment
9 DLPFC_Br6432_post_manual_alignment DLPFC_Br6423_post_extra_reads
10 DLPFC_Br6471_mid_manual_alignment_all DLPFC_Br8492_ant_manual_alignment
11 DLPFC_Br6522_ant_manual_alignment_all DLPFC_Br8492_mid_manual_alignment_extra_reads
12 DLPFC_Br6522_post_manual_alignment_all DLPFC_Br8492_post_manual_alignment
13 DLPFC_Br8325_mid_2 DLPFC_Br2720_ant_2
14 DLPFC_Br8667_ant_extra_reads DLPFC_Br2720_mid_manual_alignment
15 DLPFC_Br8667_post_manual_alignment_all DLPFC_Br2720_post_extra_reads
16 DLPFC_Br2743_mid_manual_alignment_extra_reads DLPFC_Br6432_ant_2
17 DLPFC_Br3942_ant_manual_alignment DLPFC_Br6432_mid_manual_alignment
18 DLPFC_Br3942_post_manual_alignment DLPFC_Br6432_post_manual_alignment
19 DLPFC_Br6423_mid_manual_alignment DLPFC_Br6471_ant_manual_alignment_all
20 DLPFC_Br8492_ant_manual_alignment DLPFC_Br6471_mid_manual_alignment_all
21 DLPFC_Br8492_post_manual_alignment DLPFC_Br6471_post_manual_alignment_all
22 DLPFC_Br2720_mid_manual_alignment DLPFC_Br6522_ant_manual_alignment_all
23 DLPFC_Br6432_ant_2 DLPFC_Br6522_mid_manual_alignment_all
24 DLPFC_Br6432_post_manual_alignment DLPFC_Br6522_post_manual_alignment_all
25 DLPFC_Br6471_mid_manual_alignment_all DLPFC_Br8325_ant_manual_alignment_all
26 DLPFC_Br6522_ant_manual_alignment_all DLPFC_Br8325_mid_2
27 DLPFC_Br6522_post_manual_alignment_all DLPFC_Br8325_post_manual_alignment_all
28 DLPFC_Br8325_mid_2 DLPFC_Br8667_ant_extra_reads
29 DLPFC_Br8667_ant_extra_reads DLPFC_Br8667_mid_manual_alignment_all
30 DLPFC_Br8667_post_manual_alignment_all DLPFC_Br8667_post_manual_alignment_all
> identical(basename(gsub("/outs/spatial/tissue_positions_list.csv", "", xyz)), names(samples))
[1] FALSE
Got it!
> xyz <- file.path(rep(dir, each = 2),
+ sprintf("tissue_positions%s.csv",
+ rep(c("", "_list"), length(sids))))
> xyz <- xyz[file.exists(xyz)]
> length(xyz)
[1] 30
> identical(basename(gsub("/outs/spatial/tissue_positions_list.csv", "", xyz)), names(samples))
[1] TRUE
> data.frame(basename(gsub("/outs/spatial/tissue_positions_list.csv", "", xyz)), names(samples))
basename.gsub...outs.spatial.tissue_positions_list.csv....... names.samples.
1 DLPFC_Br2743_ant_manual_alignment DLPFC_Br2743_ant_manual_alignment
2 DLPFC_Br2743_mid_manual_alignment_extra_reads DLPFC_Br2743_mid_manual_alignment_extra_reads
3 DLPFC_Br2743_post_manual_alignment DLPFC_Br2743_post_manual_alignment
4 DLPFC_Br3942_ant_manual_alignment DLPFC_Br3942_ant_manual_alignment
5 DLPFC_Br3942_mid_manual_alignment DLPFC_Br3942_mid_manual_alignment
6 DLPFC_Br3942_post_manual_alignment DLPFC_Br3942_post_manual_alignment
7 DLPFC_Br6423_ant_manual_alignment_extra_reads DLPFC_Br6423_ant_manual_alignment_extra_reads
8 DLPFC_Br6423_mid_manual_alignment DLPFC_Br6423_mid_manual_alignment
9 DLPFC_Br6423_post_extra_reads DLPFC_Br6423_post_extra_reads
10 DLPFC_Br8492_ant_manual_alignment DLPFC_Br8492_ant_manual_alignment
11 DLPFC_Br8492_mid_manual_alignment_extra_reads DLPFC_Br8492_mid_manual_alignment_extra_reads
12 DLPFC_Br8492_post_manual_alignment DLPFC_Br8492_post_manual_alignment
13 DLPFC_Br2720_ant_2 DLPFC_Br2720_ant_2
14 DLPFC_Br2720_mid_manual_alignment DLPFC_Br2720_mid_manual_alignment
15 DLPFC_Br2720_post_extra_reads DLPFC_Br2720_post_extra_reads
16 DLPFC_Br6432_ant_2 DLPFC_Br6432_ant_2
17 DLPFC_Br6432_mid_manual_alignment DLPFC_Br6432_mid_manual_alignment
18 DLPFC_Br6432_post_manual_alignment DLPFC_Br6432_post_manual_alignment
19 DLPFC_Br6471_ant_manual_alignment_all DLPFC_Br6471_ant_manual_alignment_all
20 DLPFC_Br6471_mid_manual_alignment_all DLPFC_Br6471_mid_manual_alignment_all
21 DLPFC_Br6471_post_manual_alignment_all DLPFC_Br6471_post_manual_alignment_all
22 DLPFC_Br6522_ant_manual_alignment_all DLPFC_Br6522_ant_manual_alignment_all
23 DLPFC_Br6522_mid_manual_alignment_all DLPFC_Br6522_mid_manual_alignment_all
24 DLPFC_Br6522_post_manual_alignment_all DLPFC_Br6522_post_manual_alignment_all
25 DLPFC_Br8325_ant_manual_alignment_all DLPFC_Br8325_ant_manual_alignment_all
26 DLPFC_Br8325_mid_2 DLPFC_Br8325_mid_2
27 DLPFC_Br8325_post_manual_alignment_all DLPFC_Br8325_post_manual_alignment_all
28 DLPFC_Br8667_ant_extra_reads DLPFC_Br8667_ant_extra_reads
29 DLPFC_Br8667_mid_manual_alignment_all DLPFC_Br8667_mid_manual_alignment_all
30 DLPFC_Br8667_post_manual_alignment_all DLPFC_Br8667_post_manual_alignment_all
Thanks a lot for finding this @lcolladotor -- working through it now
Yes, this makes sense, and working through the same example in our directories I get the same outcome. Thanks for the detailed example and the PR. I'll try to add a unit test for the PR -- cc @HelenaLC @drighelli
Awesome, thanks Lukas! It'd be great if you could push the fix to both bioc-release and devel. Just because it's a bug fix, not a new feature. Thanks!
Yes agree this should go into both release and devel as an urgent bug fix. I just finished writing a unit test now and will push it to the PR and check some more.
Wow, thanks guys for the issue and the fast resolution of it!
That's awesome! :)
But, I have a question:
does this solution solves the different versions of tissue_position[s|list].csv
(header/not header)?
This issue and the related PR are about a separate issue. So they don't address the position file.
Do you mean the new / second issue that @lcolladotor raised in Bioc Slack for the new / upcoming version of Space Ranger? No, these are two separate issues, so this PR only solves the first one for now.
Merged and pushed to both release (version 1.8.1) and devel (version 1.9.5). Thanks all!
Hi
SpatialExperiment
authors,A few minutes ago I gave you a heads up on Slack related to a potential new
spaceranger
release (the last publicly available one is 2.0.0). See https://community-bioc.slack.com/archives/CR1BLV821/p1677783580326559 for more.This new issue is related to the same lines of code, but in a different context. It's also a dangerous one because it creates a silent error that maybe others won't notice.
@CATALYST-project took the initiative on July 29th 2022 at https://github.com/drighelli/SpatialExperiment/commit/f741025528c81abba2f2db18dc67d80c883ce0be to add support for
SpaceRanger
2.0.0 files where they changed thetissue_positions_list.csv
file totissue_positions.csv
. So that commit and a related one, introduced this flexibility. Those commits were part of version 1.7.1 which eventually was added to bioc-release as version 1.8.0 around October 2022.The problem is that this new code actually creates a silent error when you supply multiple samples to
read10xVisium()
. It's a silent error because it creates a mismatch between the sample being read and the tissue positions file read for that sample. I think that there were no unit tests for multiple samples that checked the positions info, so no one noticed this.I first noticed something was wrong a few months ago (December 2022) when I wrote https://github.com/LieberInstitute/spatialDLPFC/blob/main/code/analysis/01_build_spe/check_pxl_in_spe_objects.R. The data and code for that script is now fully public. In that script, I was very confused as to why the tissue position information didn't match between a fresh re-run of read10xVisium() (
SpatialExperiment
version 1.8.0) versus what we had in our objects created with earlier versions.I've narrowed down the issue to https://github.com/drighelli/SpatialExperiment/blob/7b8289fec1b94c37f91546e724bffc2be554bc10/R/read10xVisium.R#L127-L131
Here's how I noticed it using a mix of both https://github.com/LieberInstitute/spatialDLPFC/blob/main/code/analysis/01_build_spe/check_pxl_in_spe_objects.R and https://github.com/drighelli/SpatialExperiment/blob/master/R/read10xVisium.R.
(For @lmweber, I ran this at
/dcs04/lieber/lcolladotor/spatialDLPFC_LIBD4035/spatialDLPFC
which only matters for the xyz <-xyz[file.exists(xyz)]
line at the end)As you can see below, here we have 30 samples. But the
xyz
file path character vector for the tissue positions file does not match. Actually, xyz is 15 times longer than thesample_id
vector. If it was recycled in the same order, I wouldn't have noticed it. But since it changes the order, things break.Here's more output.
Version 1.7.1 became part of bioc-release as version 1.8.0 in October 2022, so any SPE objects that were created after that are likely suspect.
Thanks again for creating
SpatialExperiment
and keeping it updated! It's a really useful tool for all of us =)Best, Leo