sjspielman / dragon

Deep time Redox Analysis of the Geobiology Ontology Network
https://sjspielman.shinyapps.io/dragon/

Unable to plot mineral COV by community cluster for Arsenic (As) network #14

Closed mooreek closed 5 years ago

mooreek commented 5 years ago

Hi,

After constructing the Arsenic (As) network with separate nodes for each element redox state, I tried to plot community cluster (independent variable) by COV (dependent variable). The community cluster method is Louvain. I got the following message: "ERROR: There is insufficient data to perform this analysis. Please specify a different network." When I change the cluster method to Leading Eigenvector, the plot works.

Is there a minimum number of nodes that must be present in each cluster for this plot to work? I ask because the Louvain method produces more clusters, so each cluster will have fewer nodes than with the Leading Eigenvector method. I would prefer to use the Louvain method because it gives greater separation between community clusters. One of the Louvain clusters (cluster 6) contains only 2 nodes. This cluster is not important to the analysis, so is there a way to change the criteria to plot COV by community cluster?

Thank you very much for your help. Best, Eli

sjspielman commented 5 years ago

Depending on which clustering algorithm is used, somewhat different clusters will be detected. In this case, at least one Louvain cluster has fewer than 3 nodes, as you noted, so it is not statistically valid to model the data; any results would be unreliable and spurious. The error you see is there to prevent modeling errors that result from incorrect model specification.
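To make the size constraint concrete, here is a minimal sketch of that kind of guard, written in Python purely for illustration. It is not taken from dragon's source: the function names, the `membership` mapping, and the node labels are hypothetical, and the threshold of 3 is assumed from the "fewer than 3 nodes" remark above.

```python
from collections import Counter

# Assumed threshold, based on the "fewer than 3 nodes" remark in this thread.
MIN_NODES_PER_CLUSTER = 3


def undersized_clusters(membership, min_nodes=MIN_NODES_PER_CLUSTER):
    """Return {cluster_id: size} for every cluster smaller than min_nodes."""
    sizes = Counter(membership.values())
    return {cluster: n for cluster, n in sizes.items() if n < min_nodes}


def check_clusters(membership, min_nodes=MIN_NODES_PER_CLUSTER):
    """Raise ValueError if any cluster is too small to model."""
    too_small = undersized_clusters(membership, min_nodes)
    if too_small:
        raise ValueError(
            "There is insufficient data to perform this analysis: "
            f"clusters {sorted(too_small)} have fewer than {min_nodes} nodes."
        )


# Hypothetical membership (node name -> cluster id); the data is made up.
example = {"n1": 1, "n2": 1, "n3": 1, "n4": 6, "n5": 6}
```

With this made-up membership, `check_clusters(example)` raises because cluster 6 has only two nodes, which mirrors the situation described for the Louvain clustering.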

There is no option to exclude certain clusters from analysis because this would be post hoc and statistically unjustified "data dredging" - the clusters are what the clusters are. A more tailored analysis may be merited here, but adding/removing/combining clusters without rigorous statistical justification is never an option.

Whichever criterion is set will be the one used for plotting, so you should be able to select Leading Eigenvector and then plot accordingly. Is this not happening for you? A screenshot might help.

-Stephanie

mooreek commented 5 years ago

Okay, I will just go with leading eigen. The Louvain clusters were more informative, but if they statistically cannot be used then there is nothing that can be done.

sjspielman commented 5 years ago

It may well be possible to do a more tailored analysis with Louvain, but it would be too specific for the general scope of dragon. We can talk about this more in person soon!

sjspielman commented 5 years ago

This should be fixed - after some consideration, I have decided that while it is not ok to merge or separate clusters arbitrarily, it is not terribly unreasonable to model only the clusters of interest. The analysis tab has been redesigned to accommodate a lot of what you're looking for - let's see how it goes!
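As a rough illustration of modeling only the clusters of interest, the sketch below selects whole clusters and groups their COV values, without merging or splitting anything. It is a Python stand-in under stated assumptions, not the redesigned analysis tab itself: the function name, the `(cluster_id, cov)` input shape, the minimum of 3 nodes, and all values are hypothetical.

```python
from collections import defaultdict


def group_by_selected_clusters(nodes, clusters_of_interest, min_nodes=3):
    """Group COV values by cluster, keeping only the selected clusters.

    nodes: iterable of (cluster_id, cov) pairs (hypothetical input shape).
    """
    wanted = set(clusters_of_interest)
    groups = defaultdict(list)
    for cluster, cov in nodes:
        if cluster in wanted:
            groups[cluster].append(cov)
    # Selected clusters are kept whole; none are merged or split.
    too_small = sorted(c for c in wanted if len(groups.get(c, [])) < min_nodes)
    if too_small:
        raise ValueError(
            f"selected clusters {too_small} have fewer than {min_nodes} nodes"
        )
    return dict(groups)


# Made-up (cluster_id, cov) pairs; cluster 6 has only two nodes.
nodes = [(1, 0.8), (1, 1.1), (1, 0.9), (2, 1.4), (2, 1.2), (2, 1.6),
         (6, 0.5), (6, 0.7)]
selected = group_by_selected_clusters(nodes, [1, 2])
```

Here `selected` holds the COV values for clusters 1 and 2 only, ready to pass to whatever model is used; asking for cluster 6 would still raise, since the size requirement applies to every cluster that is actually modeled.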