Teichlab / MultiMAP

MultiMAP for integration of single cell multi-omics
MIT License

MultiMAP using precomputed distance matrices #11

Open ulo opened 4 months ago

ulo commented 4 months ago

Hi! I would like to try MultiMAP for integrating a variety of datasets on a specific protein family. This includes classical omics data like RNA-seq or targeted metabolomics, but also structural information and protein annotations. For each of the data modalities, I derived a distance matrix from which I can generate a nice individual UMAP (using the metric='precomputed' parameter). Since MultiMAP is a generalization of UMAP, would it theoretically work to create a MultiMAP based on these distances? I am aware that the current implementation does not support this, but I would like to know whether it would be possible conceptually. I tried to figure this out by looking at your source code, but I am unfortunately neither a mathematician nor a Python expert...

Many thanks! Ulrich

ktpolanski commented 4 months ago

I had a root around Mika's source code and in principle this could be possible (not with the current code, as you noticed). The algorithm generates a set of distances and KNN indices for each individual dataset, and then for each pairwise combination of datasets. I'm not sure how that would work with your data though, as the pairwise step needs features shared between the two datasets to compute distances across them.
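
To make the two steps a bit more concrete, here is a minimal sketch of what "distances and KNN indices" look like when you start from distance matrices instead of feature matrices. This is not MultiMAP's code or API: `knn_from_distances` is a hypothetical helper, the toy numbers are invented, and plain nested lists are used so the snippet has no dependencies. The within-dataset case only needs the square matrix you already have. The cross-dataset case needs a rectangular matrix of distances between items of two different modalities, and that is exactly the piece that normally comes from shared features.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def knn_from_distances(
    dist: Matrix, k: int, exclude_self: bool = False
) -> Tuple[List[List[int]], List[List[float]]]:
    """Hypothetical helper: for every row of a distance matrix, return the
    column indices of the k smallest entries and the matching distances.

    exclude_self=True skips the diagonal, which is what you want for a
    square within-dataset matrix where row i and column i are the same item.
    """
    indices: List[List[int]] = []
    distances: List[List[float]] = []
    for i, row in enumerate(dist):
        candidates = [j for j in range(len(row)) if not (exclude_self and j == i)]
        if k > len(candidates):
            raise ValueError("k is larger than the number of available neighbours")
        # sorted() is stable, so ties are resolved by the lower column index.
        nearest = sorted(candidates, key=lambda j: row[j])[:k]
        indices.append(nearest)
        distances.append([row[j] for j in nearest])
    return indices, distances


# Toy within-dataset matrix (3 items of one modality, invented values).
rna_dist = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 2.0],
    [4.0, 2.0, 0.0],
]
within_idx, within_dist = knn_from_distances(rna_dist, k=1, exclude_self=True)

# Toy cross-dataset matrix: rows are the 3 items above, columns are 2 items
# of a second modality. ASSUMPTION: such cross-modality distances exist;
# without shared features there is no obvious way to obtain them.
cross_dist_matrix = [
    [0.5, 3.0],
    [2.5, 0.2],
    [1.5, 1.0],
]
cross_idx, cross_dist = knn_from_distances(cross_dist_matrix, k=1)
```

So the per-dataset half of the question looks straightforward with precomputed distances. The open part is where the rectangular cross-dataset matrices would come from for modalities that do not share any features.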