drighelli / SpatialExperiment

55 stars 20 forks source link

`read10xVisium` tries to add `outs` to my paths #123

Open LTLA opened 2 years ago

LTLA commented 2 years ago

Discussed in #76, but someone must have introduced a regression somewhere down the line.

These lines of code:

https://github.com/drighelli/SpatialExperiment/blob/fbc21539ad3b8ec73c167b7bf4fff798e72f2d46/R/read10xVisium.R#L114-L115

append outs to the paths if it doesn't already end with outs.

This causes problems for the following workflow:

  1. Run Spaceranger.
  2. Move the outs directory to some other location, because that's all we care about.
  3. Delete the rest of the directory.
  4. Now the previously-named outs directory does not end with outs, because we moved it.
  5. read10xVisium breaks because it unnecessarily adds outs to the end of the path.

As I asked in #76, you should not be adding outs to the end of my paths.

lmweber commented 2 years ago

Hmm, I had thought our solution above balanced the two sets of use cases, but maybe not.

Just so I am understanding correctly -- are you moving your outs/ directories somewhere else and then renaming them directly as the sample IDs? Would it not be simpler (and safer to avoid errors) to keep / copy the whole tree starting from the sample ID (and then deleting all the other stuff alongside outs/ in the directory)?

I'm trying to think if there is an additional simple check we could do for this file setup. I would like to avoid requiring all users to add outs/ manually in their paths since this makes things complicated for beginners who may not know anything about this outs/ directory or the general directory structure.

How do you do it in DropletUtils read10xCounts()? (I think require the entire directory tree including outs/filtered_feature_bc_matrix/?) Maybe in the end this is the only completely unambiguous way to do it, although it does require assuming more user knowledge (although maybe this is ok).

drighelli commented 2 years ago

Maybe we can think about switching the read10xVisium into the TENxIO package, it could be safer for the import of the data and lightens our package from dependencies.

LTLA commented 2 years ago

Just so I am understanding correctly -- are you moving your outs/ directories somewhere else and then renaming them directly as the sample IDs?

Yes. The final name is arbitrary but will not end with outs.

Would it not be simpler (and safer to avoid errors) to keep / copy the whole tree starting from the sample ID (and then deleting all the other stuff alongside outs/ in the directory)?

I doubt it. My approach is this:

mv sample/outs/ somewhere_else/
rm -r sample/

Doesn't get much simpler or safer than that.

How do you do it in DropletUtils read10xCounts()? (I think require the entire directory tree including outs/filtered_feature_bc_matrix/?)

Yes.

Maybe in the end this is the only completely unambiguous way to do it, although it does require assuming more user knowledge (although maybe this is ok).

Unfortunately, you've painted yourself into a corner with your initial decision to "make things easier for users".

The only real way out is what I proposed in #76, which preserves back-compatibility for existing users via samples= while allowing exact specification of the path via dirs=.