Closed hspitzer closed 2 years ago
Good point. The problem is that there are many parameters that should be exposed: it's true that it's a simple function, but it wraps quite complex processing steps, and it's key that the user can change their parameters.
I would see as a better option a function that takes an adata_parent and a key in obsm, and returns an adata_child with the same obs, var as adata_parent.
This is btw very related to the biggest problem of having multi modal data in anndata 😅 and we would not be the only ones facing this...
Yes, I agree that this wraps quite complicated processing steps. Maybe they should be explicitly visible to the user. It's just that moving the obsm back and forth is a bit ugly.
Ok, sure so you are proposing a function moving obsm to X, right?
So this would translate to:
adata_features = move_obsm(adata, key="features")
sc.pp.scale(adata_features)
sc.pp.pca(adata_features)
sc.pp.neighbors(adata_features)
sc.tl.leiden(adata_features)
and then you can use adata_features directly for sc.pl.spatial because it already contains the gene clusters. Yeah, that could work.
To deal with features efficiently though, I need some sort of mechanism to select which rows of obsm to move (I do that with the like parameter in the function above).
Ok, sure so you are proposing a function moving obsm to X, right? So this would translate to:
yes, something like that.
and then you can use adata_features directly for sc.pl.spatial because it already contains the gene clusters. Yeah, that could work.
yes indeed, in that case you'd also have to copy over adata.uns for images and related metadata
To deal with features efficiently though, I need some sort of mechanism to select which rows of obsm to move (I do that with the like parameter in the function above).
these features are what is moved to adata.X, right? Wouldn't it work to just move everything?
I usually extract all features at once because this is more efficient. In some of the tutorials though, I am showing the clustering for only a subset of the features (e.g. only segmentation features or only texture features). For this we need a way to filter the pandas table. I can also do that manually, but then there is no need for me to use such an extraction function at all.
My point is that I'd like to keep the example notebooks as short as possible, and was wondering if we could make some utility functions that do these steps for us.
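For illustration, filtering the feature table by name could look like the sketch below. It relies on pandas' DataFrame.filter with its like argument; the select_features helper and the column names are made up for this example and are not taken from the tutorials.

```python
import numpy as np
import pandas as pd

# Toy feature table: one row per spot, one column per image feature.
# Column names are hypothetical.
features = pd.DataFrame(
    np.arange(12.0).reshape(3, 4),
    index=["spot_0", "spot_1", "spot_2"],
    columns=[
        "segmentation_area",
        "segmentation_count",
        "texture_contrast",
        "texture_energy",
    ],
)


def select_features(features, like=None):
    """Keep only columns whose name contains `like`; keep everything if like is None."""
    if like is None:
        return features
    return features.filter(like=like, axis=1)


texture_only = select_features(features, like="texture")
```

Keeping like optional means the same helper covers both the "move everything" case and the subset case used in some notebooks.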
ok yes, then making an extractor similar to what we already have might make sense. Maybe the extractor we have can be modified? I also understand now about selecting specific features.
Yeah, it would be nice to use the extractor for this, but currently sq.pl.extract does obsm -> obs. We are talking about obsm -> X. I'm not sure if it's best practice to put these two different functionalities in one function? We could have a "destination" argument that can be either obs or X?
I'm not sure if it's best practice to put these two different functionalities in one function? We could have a "destination" argument that can be either obs or X?
I like this idea!
I think this is now done with extract and several tutorials, will close this.
Is it? Does extract now also extract obsm -> X? Would still be great to have. Not super urgent though.
it would be cool to have a multiplex partition based on layers/obsm see this https://github.com/theislab/scanpy/issues/1818
When writing tutorials, I find myself defining the same clustering function in several notebooks.
This essentially does scaling + PCA + neighbors + leiden on a set of features. I was wondering if we should include this in squidpy as a convenience function (maybe made a bit more general)? Or should we rather leave these sorts of functions outside of squidpy? Is there a way to avoid defining the same function in several notebooks? @giovp