Open aditisk opened 5 years ago
I know the question is quite old, but maybe someone else will stumble upon it and in that case, I'd like to give a solution I used with my data.
To check for expression you need access to the raw count matrix in your data. It can be log-transformed and normalised, but it shouldn't be scaled or regressed. If you followed one of the many tutorials, that matrix should be in your .raw slot.
gene1 = 'XXX'
gene2 = 'YYY'
# True for cells where both genes have a non-zero value in .raw
adata.obs['CoEx'] = (
    (adata.raw[:, gene1].X.todense() > 0)
    & (adata.raw[:, gene2].X.todense() > 0)
)
That will add one more metric to your anndata object, which you can then use to colour your UMAP plot (i.e. sc.pl.umap(adata, color='CoEx')).
One quite annoying thing about this solution is that you end up with a meaningless colorbar on your UMAP plots. I welcome suggestions on how to improve it.
Thank you! I hope this helps people stumbling upon this.
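The co-expression check described above can also be sketched without an AnnData object, which makes the logic easier to test. This is only a minimal sketch: coexpression_mask is a hypothetical helper name (not part of scanpy), and the toy matrix and gene list stand in for adata.raw.X and adata.raw.var_names.

```python
import numpy as np
import pandas as pd


def coexpression_mask(X, var_names, gene1, gene2):
    """Return a 1-D boolean array: True where both genes have a value > 0.

    Hypothetical helper, not part of scanpy. X is a cells x genes matrix,
    either a dense array or a sparse matrix that offers .toarray().
    """
    var_names = list(var_names)
    cols = [var_names.index(gene1), var_names.index(gene2)]
    sub = X[:, cols]
    # Sparse matrices expose .toarray(); plain NumPy arrays do not.
    if hasattr(sub, "toarray"):
        sub = sub.toarray()
    sub = np.asarray(sub)
    # A cell co-expresses the genes if both columns are non-zero.
    return (sub > 0).all(axis=1)


# Toy data standing in for adata.raw.X (4 cells x 3 genes).
X = np.array([[1.2, 0.0, 3.0],
              [0.5, 2.0, 0.0],
              [0.0, 0.0, 1.0],
              [2.0, 0.7, 0.0]])
genes = ["XXX", "YYY", "ZZZ"]

# A DataFrame standing in for adata.obs.
obs = pd.DataFrame(index=[f"cell{i}" for i in range(4)])
obs["CoEx"] = coexpression_mask(X, genes, "XXX", "YYY")
```

With real data one would presumably pass adata.raw.X and adata.raw.var_names and assign the result to adata.obs['CoEx']. Returning a flat boolean array avoids having to flatten a dense (n, 1) matrix afterwards.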
By now we have https://scanpy.discourse.group which is the better place for questions like that :+1:
About the colorbar thing, you could just deep-dive into the plot object and remove it:
ax = plt.subplot()
sc.pl.xx(..., ax=ax)
ax.images.im[-1].colorbar.remove()
Hi @GMaciag,
This looks like a simple function that people may like to use. Do you want to write a small helper function for this, maybe? It might be nice to add to sc.tl.
One way you could make it display nicely in sc.pl.umap() is by turning the values into pd.Categorical. In the end you want to show which cells are co-expressing your genes.
Also, this may be a good use for imputation methods. Otherwise you may struggle with the sparsity of the data.
Hi @LuckyMD,
Sure, I'll work on it as time allows. Before that, however, I have a couple of questions.
[...] pd.Categorical(ordered=True), however, that doesn't help. And thanks @flying-sheep for showing how to remove the colourbar. I wanted to do it for some of my other plots, so that really helps.
Regarding Q3 from my previous comment, I tried a few things and I think it is easiest to keep the coexpression data as continuous and remove the colorbar afterwards.
I do, however, have a correction to what was written before. ax.images.im[-1].colorbar.remove() doesn't work (in the case of umap) since it is a scatter plot. ax.collections[-1].colorbar.remove() needs to be used instead.
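The correction above can be checked on a plain matplotlib scatter plot. This is a minimal sketch with random data standing in for UMAP coordinates; it does not call scanpy, so it only shows why the colorbar is reachable through ax.collections rather than ax.images.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Random points standing in for UMAP coordinates and a continuous covariate.
rng = np.random.default_rng(0)
xy = rng.normal(size=(50, 2))
values = rng.random(50)

fig, ax = plt.subplots()
points = ax.scatter(xy[:, 0], xy[:, 1], c=values)
fig.colorbar(points, ax=ax)
n_axes_with_colorbar = len(fig.axes)  # main axes plus the colorbar axes

# A scatter plot is stored in ax.collections; ax.images stays empty.
n_images = len(ax.images)
ax.collections[-1].colorbar.remove()
n_axes_after = len(fig.axes)  # only the main axes remain
```

When plotting through scanpy, the same last line would presumably be applied to the axes that was passed in via ax=ax, as in the earlier comment.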
Hey! Sorry for the late reply:
[...] cell_selection_by_genes() or just cell_selection().
[...] sort_order keyword for plotting which works for continuous covariates. I imagine that should work.
[...] sc.external and DCA is also easily usable in this framework. They are not part of the core package though.
Hey! This
gene1 = 'XXX'
gene2 = 'YYY'
adata.obs['CoEx'] = (
    (adata.raw[:, gene1].X.todense() > 0)
    & (adata.raw[:, gene2].X.todense() > 0)
)
was very helpful, thanks for posting, @GMaciag!
Is there a similar way to split the dataset into gene1/gene2+ and gene1/gene2- cells, so that both of the categories are within 'CoEx' and you could compare them to each other just like leiden clusters for example?
Hi @natalkon
If you mean to plot them as categories instead of on a continuous scale, then the solution is to turn the values into pd.Categorical, like LuckyMD mentioned.
import pandas as pd

coex = (
    (adata.raw[:, gene1].X.todense() > 0)
    & (adata.raw[:, gene2].X.todense() > 0)
)
# Flatten the (n_cells, 1) matrix into a plain list of booleans
coex_list = [item for sublist in coex.tolist() for item in sublist]
adata.obs['CoEx'] = pd.Categorical(coex_list, categories=[True, False])
Like I mentioned before, one problem is that the True (coexpressing) cells are not always plotted on top when plotting both categories with umap. A better way of visualising is to make use of the groups parameter:
sc.pl.umap(adata, color='CoEx', groups=[True])
It will then grey out all the False cells and put them in the background.
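As a variation on the categorical approach above, the boolean mask can be turned into string labels and a draw order that puts the co-expressing cells last. This is only a sketch under assumptions: the labels "CoEx+" and "CoEx-" and the toy mask are made up for illustration, and the DataFrame stands in for adata.obs.

```python
import numpy as np
import pandas as pd

# Hypothetical co-expression mask for five cells.
coex = np.array([False, True, False, True, True])

# String labels instead of booleans, with the positive group listed first.
labels = np.where(coex, "CoEx+", "CoEx-")
obs = pd.DataFrame(index=[f"cell{i}" for i in range(len(coex))])
obs["CoEx"] = pd.Categorical(labels, categories=["CoEx+", "CoEx-"])

# Size of each group, e.g. for a quick sanity check before plotting.
counts = obs["CoEx"].value_counts()

# Draw order with the negative cells first and the positive cells last,
# so that the positive cells would be painted on top in a scatter plot.
order = np.argsort(coex, kind="stable")
```

The same labels can be used like cluster labels for comparing the two groups. Whether reordering the cells with order changes the plotting order in scanpy is not confirmed in this thread, so treat that part as an idea to try.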
Hi
It's been a long time since the initial talk about writing a function to address this issue, and with the whole pandemic situation I totally forgot about it. The recent comments reminded me, so I finished writing it and put it in PR #1657.
I hope it can be useful to other people and maybe even included in the main package :)
Thank you very much, @GMaciag !!
Is there a way to extract the cells that are co-expressing my genes of interest? How can I make a UMAP plot showing these cells, similar to the single-gene UMAP plots?
Thanks.