j-bac closed this issue 2 years ago
Hi!
That's an interesting idea. I'm not so familiar with it, but there are people who work on Gaussian Processes defined over graphs. What you need is a covariance function that takes two graph nodes and returns the covariance between values in those nodes.
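One possible choice for such a covariance function, shown here only as a minimal sketch, is the diffusion (heat) kernel K = exp(-beta * L), where L is the graph Laplacian. The function name `graph_diffusion_kernel`, the `beta` parameter, and the toy path graph are illustrative assumptions, not part of SpatialDE.

```python
import numpy as np

def graph_diffusion_kernel(adjacency, beta=1.0):
    """Diffusion kernel K = exp(-beta * L) for an undirected graph.

    `adjacency` is a symmetric (n, n) matrix of non-negative edge weights.
    Returns an (n, n) matrix whose entry (i, j) is the covariance between
    the values at nodes i and j.
    """
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A           # combinatorial graph Laplacian
    evals, evecs = np.linalg.eigh(L)         # L is symmetric, so eigh applies
    # Matrix exponential via the eigendecomposition of L.
    return (evecs * np.exp(-beta * evals)) @ evecs.T

# Hypothetical toy graph: a path 0 - 1 - 2 - 3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
K = graph_diffusion_kernel(A, beta=0.5)
```

With this kernel, nodes that are close in the graph get a higher covariance than nodes that are far apart, and larger `beta` values spread the covariance further along the edges.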
I'm not completely sure how to implement the graph covariance, but the factorization trick we use here to speed up computation should still work. It works directly on the covariance matrix and doesn't care how that covariance matrix was constructed.
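To illustrate why a factorization of this kind does not depend on where the covariance matrix comes from, here is a sketch using an eigendecomposition of K. It assumes a model y ~ N(0, sigma2 * (K + delta * I)); the names `sigma2`, `delta`, `loglik_direct` and `loglik_factored` are hypothetical, and this may not match the package's exact implementation.

```python
import numpy as np

def loglik_direct(y, K, sigma2, delta):
    """Gaussian log-likelihood computed from the full covariance, O(n^3) per call."""
    n = len(y)
    C = sigma2 * (K + delta * np.eye(n))
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))

def loglik_factored(Uty, s, sigma2, delta):
    """Same log-likelihood, O(n) per call once K = U diag(s) U^T is known."""
    n = len(Uty)
    d = s + delta                            # eigenvalues of K + delta * I
    return -0.5 * (n * np.log(2 * np.pi * sigma2)
                   + np.sum(np.log(d))
                   + np.sum(Uty ** 2 / d) / sigma2)

rng = np.random.default_rng(0)
n = 50
B = rng.normal(size=(n, n))
K = B @ B.T / n                  # any symmetric PSD matrix works, graph-based or not
y = rng.normal(size=n)

s, U = np.linalg.eigh(K)         # done once per covariance matrix
Uty = U.T @ y                    # done once per gene (one rotation of the data)

ll_direct = loglik_direct(y, K, sigma2=1.3, delta=0.5)
ll_factored = loglik_factored(Uty, s, sigma2=1.3, delta=0.5)
```

The expensive eigendecomposition happens once, after which every evaluation for a new `sigma2` or `delta` is a cheap elementwise computation. Nothing in it requires K to come from Euclidean coordinates.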
Thanks, I'll look into it! Also, the link to download the MouseOB data seems broken. Do you still have it by any chance?
Hi,
Let me know if you make progress!
Yes, sorry about that. Git LFS ended up causing a lot of problems. I put a version of the repo with all the data in it here: https://figshare.com/articles/software/SpatialDE/17065217
I'll add the link to the README. Thanks for the reminder!
Thanks! I appreciate how well organized the repo is.
Actually, I realized the method already works on higher-dimensional inputs without modification. I think it's a cool way to detect "interesting" highly variable genes that show a pattern rather than being randomly distributed. It's useful even for a scRNAseq assay without true spatial coordinates.
I think generalizing this to graphs is still a nice project. I found an implementation of a graph GP and I'm playing with it. For example, it would be cool if we could apply this to compare scATAC/scRNA graphs and detect multi-modal gene clusters with similar patterns in both assays.
Oh yeah, the Euclidean distance calculation already works for any dimension.
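As a small illustration of that point, here is a sketch of a squared-exponential covariance built from pairwise Euclidean distances. The function `se_kernel` and its `lengthscale` parameter are assumptions made for this example, not the package's API. The same code handles 2 or 20 input dimensions.

```python
import numpy as np

def se_kernel(X, lengthscale=1.0):
    """Squared-exponential kernel for an (n_samples, n_dims) coordinate array."""
    X = np.asarray(X, dtype=float)
    sq_norms = (X ** 2).sum(axis=1)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, valid for any number of dimensions.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2 * X @ X.T
    sq_dists = np.clip(sq_dists, 0, None)    # guard against tiny negative round-off
    return np.exp(-sq_dists / (2 * lengthscale ** 2))

rng = np.random.default_rng(0)
K_2d = se_kernel(rng.normal(size=(100, 2)))     # e.g. spatial x, y coordinates
K_20d = se_kernel(rng.normal(size=(100, 20)))   # e.g. 20 principal components
```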
One reason people prefer to use graphs in high-dimensional space is that you get this unintuitive issue of space getting "more empty" as dimensionality increases. Then, since the GP pretty much interpolates between observed points, this means for higher dimensions it is harder to reject the null hypothesis that noise is uncorrelated. With a graph representation the geometric property of 'empty' high dimensional space would be less of an issue.
In addition, with a graph you could analyze all sorts of weird data that doesn't have actual numerical coordinates, like protein structures, citation networks, gene ontologies, etc.
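The "more empty" effect mentioned above can be seen in a quick simulation. This is only a sketch with uniformly random points. `relative_contrast` is a hypothetical helper that measures how much further the farthest point is than the nearest one, relative to the nearest distance.

```python
import numpy as np

def relative_contrast(dim, n_points=500, seed=0):
    """(max distance - min distance) / min distance from one random query point."""
    rng = np.random.default_rng(seed)
    points = rng.uniform(size=(n_points, dim))
    query = rng.uniform(size=dim)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

contrasts = {dim: relative_contrast(dim) for dim in (2, 10, 100, 1000)}
for dim, c in contrasts.items():
    print(f"dim={dim:5d}  relative contrast={c:.2f}")
```

As the dimension grows, the contrast shrinks, meaning all points end up at roughly the same distance from each other. A distance-based kernel then has little "near versus far" structure left to work with.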
Thanks, those are all cool ideas. It's true that using angular distance rather than Euclidean is often better for scRNA. I'll close the issue, but I'm happy to keep you posted if I make progress!
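For reference, a minimal sketch of the angular distance mentioned here, scaled to the range [0, 1]. The helper `angular_distance` and the toy `cells` matrix are illustrative assumptions. Unlike Euclidean distance, it ignores the overall magnitude of each row, such as total counts per cell.

```python
import numpy as np

def angular_distance(X):
    """Pairwise angular distance, arccos(cosine similarity) / pi, between rows of X.

    Rows must be non-zero, otherwise the normalization divides by zero.
    """
    X = np.asarray(X, dtype=float)
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    cos = np.clip(unit @ unit.T, -1.0, 1.0)  # clip round-off outside [-1, 1]
    return np.arccos(cos) / np.pi

# Hypothetical toy expression profiles (rows are cells, columns are genes).
cells = np.array([[1.0, 2.0, 0.0],
                  [10.0, 20.0, 0.0],    # same profile as the first cell, 10x the depth
                  [0.0, 0.0, 5.0]])     # orthogonal profile
D = angular_distance(cells)
```

Here the first two cells are essentially at distance 0 despite their different magnitudes, while the third is at distance 0.5 from both. Note that plugging a non-Euclidean distance into a squared-exponential kernel is not guaranteed to give a valid positive semi-definite covariance, so that would need checking.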
thanks for the package!