BergmannLab / MONET

MONET : MOdularising NEtwork Toolbox - https://doi.org/10.1093/bioinformatics/btaa236
GNU General Public License v3.0

Single nodes in the cluster #55

Open Jordi-V opened 2 years ago

Jordi-V commented 2 years ago

Hi all,

I have a question about the MONET results using the K1 approach. I obtained different clusters; however, when I plot the modules in Cytoscape, I see that most of the genes in a module are connected to each other, while some are disconnected from the rest of the cluster. These disconnected nodes have edges to other clusters, but none to the cluster where MONET placed the gene.

Why do I have disconnected nodes in a module? And why is such a node not clustered into another module, where it does have edges to other genes?

Thanks for your time,

Jordi

jjc2718 commented 2 years ago

Hi Jordi - thanks for your interest in MONET and our K1 method. We discussed this a bit previously in another GitHub issue: https://github.com/BergmannLab/MONET/issues/41#issuecomment-734498842

Let me know if you have questions that aren't answered in that thread, happy to help however I can.

Jordi-Valls commented 2 years ago

Hi jjc2718, thanks for your link! (I'm Jordi-V)

So we obtain disconnected nodes in our clusters due to DSD? DSD uses a random walk strategy, so as I understand it, the visitation frequency is higher in one cluster than in another cluster where the gene is linked to another gene? And the visitation frequency is used to group genes into clusters? Could you explain a bit more what you mean by "capture similarities"? I understand "capture similarities" as referring to genes with a high visitation frequency. Am I right?

jjc2718 commented 2 years ago

> So we obtain disconnected nodes in our clusters due to DSD?

Exactly - this is a feature of DSD: nodes that are nearby in DSD distance don't necessarily have to be connected in the original network. We've observed this in the past, in our DREAM challenge submissions as well as in other work.

> DSD uses a random walk strategy, so as I understand it, the visitation frequency is higher in one cluster than in another cluster where the gene is linked to another gene? And the visitation frequency is used to group genes into clusters?

DSD is a random walk-based method, but it's not quite as simple as just clustering on the visitation frequency. The basic idea is that the distance between two nodes is a function of the difference between their random walk vectors to all other nodes in the network. Because it's a global measure of distance (i.e. it considers the entire network, not just the two nodes being compared), nodes can end up close together in DSD distance even if they aren't close together in shortest path distance, or in traditional random walk similarity measures.
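
As a rough illustration of the idea above, here is a minimal sketch. It is not MONET's K1 code or the DSD authors' implementation. It assumes DSD (Diffusion State Distance) is taken as the L1 distance between the expected-visit vectors of fixed-length random walks started at each node; the function name `dsd_matrix`, the `n_steps` parameter, the choice of 3 steps, and the toy network are all made up for the example. Nodes 0 and 1 share the same neighbours but have no edge between them, while nodes 2 and 4 are joined by the single edge that bridges two groups.

```python
import numpy as np


def dsd_matrix(adj, n_steps):
    """Pairwise L1 distance between expected-visit vectors of random walks.

    Illustrative sketch only: `adj` is a symmetric 0/1 adjacency matrix with
    no isolated nodes, and `n_steps` is the (assumed) walk length.
    """
    adj = np.asarray(adj, dtype=float)
    # Row-normalise the adjacency matrix into random walk transition probabilities.
    transition = adj / adj.sum(axis=1, keepdims=True)
    n = adj.shape[0]
    # visits[i, j] = expected number of visits to node j for a walk that starts
    # at node i, counting step 0 (the start node) through step n_steps.
    visits = np.eye(n)
    step = np.eye(n)
    for _ in range(n_steps):
        step = step @ transition
        visits += step
    # L1 distance between every pair of rows of the visit matrix.
    return np.abs(visits[:, None, :] - visits[None, :, :]).sum(axis=2)


# Hypothetical toy network: a 4-cycle (0-2-1-3) and a triangle (4-5-6),
# joined by the single bridging edge 2-4.
edges = [(0, 2), (0, 3), (1, 2), (1, 3), (2, 4), (4, 5), (4, 6), (5, 6)]
adj = np.zeros((7, 7))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

dsd = dsd_matrix(adj, n_steps=3)
print("nodes 0 and 1 (no edge between them):", dsd[0, 1])
print("nodes 2 and 4 (directly connected):  ", dsd[2, 4])
```

In this toy case the unconnected pair (0, 1) ends up closer than the directly connected pair (2, 4), because walks started at 0 and 1 see the rest of the network in the same way, whereas walks started at 2 and 4 mostly explore different groups. A clustering step applied to such a distance matrix could therefore place a node in a module where it has no direct edges.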

This idea is discussed in more detail in the introduction to the original DSD paper. If you still have questions about the method after looking at that paper I'd suggest emailing Lenore Cowen (she's one of the corresponding authors on that paper, and knows far more than I do about how the method works and why). Feel free to CC me if you do email her - my email is the same as my GitHub username, at gmail.com.

> Could you explain a bit more what you mean by "capture similarities"? I understand "capture similarities" as referring to genes with a high visitation frequency. Am I right?

By "capture similarities in global network structure" I just meant that DSD considers the structure of the network as a whole (sometimes referred to as "global network properties"), rather than just the "local" distance or similarity between the two nodes.

Hope that helps a bit!

Jordi-Valls commented 2 years ago

Hi Jacke,

Thanks so much for your answer! I will read the paper to better understand the rationale behind the random walk approach :dagger: If I contact Lenore Cowen, I will include you in the email for sure.

Thanks for your time!

Jordi