SAND-Lab / MEA-NAP

MEA-NAP. A streamlined diagnostic and analytic tool for data obtained using microelectrode arrays.
GNU General Public License v3.0

Rename Params.excludeEdgesBelowThreshold #32

Closed hhs2 closed 6 months ago

hhs2 commented 1 year ago

ExtractNetMet line 135 Params.excludeEdgesBelowThreshold = 1 --> remove values < 0 from density calculations

I think the confusion here is whether we want this to affect figure generation (removing zero values from the plots) and/or the calculation of network metrics. If just the latter, then I think setting it to 1 should make it skip the probabilistic thresholding step. If just the former, then I think we want to either use the threshold value set in ExtractNetMet line 72 rather than removing zeros, or rename it to something like Params.excludeBelowZeroValuesInFig.

(1) This may already be done in the probabilistic thresholding?
(2) excludeEdgesBelowThreshold could be renamed to excludeBelowZeroValues.
(3) If we want to just use the unthresholded adjacency matrix, then in MEApipeline.m it would be:

    if Params.excludeEdgesBelowThreshold
        adjMs = correlation + threshold
    else
        adjMs = correlation
    end

Timothysit commented 1 year ago

I think we want this to affect only the calculation of network metrics. (In practice, I don't think it can currently affect the plotting in the pipeline anyway, because the line width will be 0 when the edge weight is 0, and negative edge weights are not supported in the current plotting code.)

"then I think setting it to 1 should make it skip the probabilistic thresholding step": I am not 100% sure. I think this feature was added when Susanna brought up the idea of whether eg. the mean edge weight should only consider above-zero edge weights. So having this parameter basically allows you to toggle between those two definitions. But on the other hand, for some other network metrics the zeros are actually informative / essential for the definition of the metric (see the paragraph below copied from Slack). Now that I think about it, perhaps it makes more sense to have two mean edge weight metrics with two definitions rather than having this parameter, as the parameter likely makes the calculation of other network metrics "invalid", in the sense that other metrics assume there may be edges with 0 weight. What do you think?
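To make the two candidate definitions concrete, here is a minimal Python sketch (not the MATLAB pipeline code). The function names are hypothetical, and it assumes a symmetric, already-thresholded adjacency matrix in which sub-threshold weights have been set to zero.

```python
def upper_triangle_weights(adj):
    """Return the weights above the diagonal of a square, symmetric matrix."""
    n = len(adj)
    return [adj[i][j] for i in range(n) for j in range(i + 1, n)]


def mean_edge_weight_all(adj):
    """Mean over every node pair, zeros included (hypothetical name)."""
    weights = upper_triangle_weights(adj)
    return sum(weights) / len(weights) if weights else float("nan")


def mean_edge_weight_positive(adj):
    """Mean over node pairs with weight > 0 only (hypothetical name)."""
    weights = [w for w in upper_triangle_weights(adj) if w > 0]
    return sum(weights) / len(weights) if weights else float("nan")


# Toy 4-node network: two surviving edges, node 3 is unconnected.
adj = [
    [0.0, 0.5, 0.0, 0.0],
    [0.5, 0.0, 0.3, 0.0],
    [0.0, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

mean_all = mean_edge_weight_all(adj)            # 0.8 / 6 pairs
mean_positive = mean_edge_weight_positive(adj)  # 0.8 / 2 edges
```

Reporting both values side by side leaves the adjacency matrix itself untouched, so metrics that rely on zero entries are not affected.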

I also want to bring up whether there are actually two thresholds being considered here:

  1. The probabilistic threshold: my understanding (please correct if I am wrong, I haven't read the methods in detail) is that it compares whether the existing edge weight value is greater than the edge weight value from some percentile of a distribution generated by shuffling the spike times.
  2. The hard edge weight threshold that we are setting here with Params.excludeEdgesBelowThreshold, which currently is just 0.
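As a rough illustration of threshold (1) as described above, the following Python sketch builds a null distribution by circularly shifting one binary spike train and keeps the observed weight only if it exceeds a chosen percentile of that distribution. Everything here is an assumption made for illustration: the use of Pearson correlation, circular shifting as the shuffling method, the nearest-rank percentile, and all function and parameter names. The method actually used in MEA-NAP may differ.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def circular_shift(train, offset):
    """Rotate a list to the right by `offset` positions."""
    offset %= len(train)
    return train[-offset:] + train[:-offset]


def probabilistic_threshold(train_a, train_b, n_shuffles=200, percentile=95, seed=0):
    """Return the observed weight if it beats the null percentile, else 0.0."""
    rng = random.Random(seed)
    observed = pearson(train_a, train_b)
    # Null distribution: correlation after shifting train_b by a random non-zero offset.
    null = sorted(
        pearson(train_a, circular_shift(train_b, rng.randrange(1, len(train_b))))
        for _ in range(n_shuffles)
    )
    # Nearest-rank percentile of the null distribution.
    rank = min(n_shuffles - 1, max(0, math.ceil(percentile / 100 * n_shuffles) - 1))
    cutoff = null[rank]
    return observed if observed > cutoff else 0.0


# Aperiodic toy spike train (1 = spike in that time bin).
train = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1]
silent = [0] * len(train)

kept = probabilistic_threshold(train, train)      # identical trains: edge survives
dropped = probabilistic_threshold(train, silent)  # silent partner: weight set to 0.0
```

Note that this one-sided test only keeps weights above the null cutoff, so it would not flag significant anti-correlation.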

So, let's say a particular edge has a value of 0.1, and after performing the probabilistic thresholding it passes the percentile threshold in (1), so it is a significant 0.1 weight. However, the user may still consider this connection too weak to be included in subsequent analysis, and so we want to either (a) set this to zero or (b) set it to essentially "nan", eg. when calculating mean edge weight. This would mean replacing the existing Params.excludeEdgesBelowThreshold with a parameter, eg. Params.analysisEdgeWeightThreshold, so that the user can decide on this particular value (and we will set it to zero by default, just to ignore negative correlations). What do you think?
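Options (a) and (b) could be sketched as follows in Python (not the MATLAB pipeline code). The function names and the `mode` argument are hypothetical, and treating weights equal to the threshold as excluded is an assumption.

```python
import math


def apply_edge_weight_threshold(adj, threshold=0.0, mode="zero"):
    """Replace off-diagonal weights <= threshold with 0.0 ("zero") or NaN ("nan")."""
    if mode not in ("zero", "nan"):
        raise ValueError("mode must be 'zero' or 'nan'")
    fill = 0.0 if mode == "zero" else math.nan
    n = len(adj)
    return [
        [adj[i][j] if i == j or adj[i][j] > threshold else fill for j in range(n)]
        for i in range(n)
    ]


def nanmean_edge_weight(adj):
    """Mean of the upper-triangle weights, skipping NaN entries."""
    n = len(adj)
    weights = [adj[i][j] for i in range(n) for j in range(i + 1, n)]
    weights = [w for w in weights if not math.isnan(w)]
    return sum(weights) / len(weights) if weights else math.nan


# Three edges that all passed the probabilistic threshold: 0.1, 0.6 and -0.4.
adj = [
    [0.0, 0.1, 0.6],
    [0.1, 0.0, -0.4],
    [0.6, -0.4, 0.0],
]

zeroed = apply_edge_weight_threshold(adj, threshold=0.2, mode="zero")
masked = apply_edge_weight_threshold(adj, threshold=0.2, mode="nan")

mean_with_zeros = nanmean_edge_weight(zeroed)  # weak edges count as 0
mean_masked = nanmean_edge_weight(masked)      # weak edges are ignored
```

With the default threshold of 0.0, the same function would drop only the negative (anti-correlated) edge and keep the weak 0.1 edge.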

There's also the rare case where an edge weight is negative (eg. -0.4) and it passes threshold (1), either by chance or because, depending on how the probabilistic threshold is implemented, it may be able to test the significance of anti-correlation. You would still want to exclude that edge if you don't want to analyse anti-correlated activity.


Slack message from a long time ago for reference:

I have also added the parameter Params.excludeEdgesBelowThreshold = 1; in biAdvancedSettings.m to toggle this behaviour on/off (and I will do this for the downstream mean edge weight / edge weight calculations as well). I have been thinking about whether / when it’s appropriate to use edge weights excluding zeros (sub-threshold weights are currently set to zero) and when to use edge weights including zeros.

I think looking only at the edge weights that cross a significance threshold makes sense, and taking the mean of only the significant edges also makes sense. But there are other network measures, such as network density and clustering coefficient, where “0” is actually used (and there is no easy way to exclude the zeros without changing the definition of the metric), and these measures are more closely related to the mean edge weight that takes the zeros into account. (The thresholding we are doing is somewhere in between calculating the network metrics from all the edge weights and binarising the network by some threshold.)

So in the long run it may actually be helpful to have both metrics. The “significant” mean edge weight is useful in the sense that we want some connectivity measure that cannot be mainly driven by eg. the number of nodes with no connections at all. But it will also be a good sanity check that the other network connectivity metrics we calculate are related to the edge weights taking account of zero values. And if downstream we want to look at trends in network metrics whilst controlling for edge weights, then the edge weight taking account of zero values also makes more sense, because those are the edge weights that are used in the calculation.
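One way to see the link between network density and the zero-inclusive mean edge weight mentioned in the Slack message is the short Python sketch below (not pipeline code; function names are hypothetical). It assumes a symmetric matrix whose surviving weights are all positive, in which case the mean over all node pairs equals the density multiplied by the mean over surviving edges.

```python
def upper_triangle_weights(adj):
    """Return the weights above the diagonal of a square, symmetric matrix."""
    n = len(adj)
    return [adj[i][j] for i in range(n) for j in range(i + 1, n)]


def network_density(adj):
    """Fraction of node pairs connected by a non-zero edge."""
    weights = upper_triangle_weights(adj)
    return sum(1 for w in weights if w != 0) / len(weights)


# Toy 4-node network with two surviving edges out of six possible pairs.
adj = [
    [0.0, 0.2, 0.4, 0.0],
    [0.2, 0.0, 0.0, 0.0],
    [0.4, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

weights = upper_triangle_weights(adj)
nonzero = [w for w in weights if w != 0]

density = network_density(adj)
mean_including_zeros = sum(weights) / len(weights)
mean_excluding_zeros = sum(nonzero) / len(nonzero)

# Dropping the zeros before computing density would make it trivially 1.0,
# which is why this metric cannot simply "exclude" them.
density_without_zeros = len(nonzero) / len(nonzero)
```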

Timothysit commented 6 months ago

Closed for now due to inactivity. Currently the pipeline works as follows: it applies the probabilistic threshold, then calculates node degree and mean edge weight, excluding values equal to or below zero if Params.excludeEdgesBelowThreshold = 1. Will refer to this issue / open it again if we change how this is calculated.
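A rough Python sketch of that behaviour, for reference (not the MATLAB pipeline code). The function and argument names are hypothetical. It assumes the input matrix has already been through the probabilistic threshold, that the metrics are computed per node, and that the flag applies to both node degree and mean edge weight; the thread does not spell out these details.

```python
def node_metrics(adj, exclude_edges_below_threshold=True):
    """Return (degrees, mean_weights) per node of a thresholded adjacency matrix."""
    n = len(adj)
    degrees, mean_weights = [], []
    for i in range(n):
        weights = [adj[i][j] for j in range(n) if j != i]
        if exclude_edges_below_threshold:
            # Flag on: drop weights equal to or below zero before both metrics.
            weights = [w for w in weights if w > 0]
            degree = len(weights)
        else:
            # Flag off: every non-zero weight counts as an edge; zeros stay in the mean.
            degree = sum(1 for w in weights if w != 0)
        degrees.append(degree)
        mean_weights.append(sum(weights) / len(weights) if weights else float("nan"))
    return degrees, mean_weights


# Toy 3-node network: one positive edge, one negative edge, one absent edge.
adj = [
    [0.0, 0.4, -0.2],
    [0.4, 0.0, 0.0],
    [-0.2, 0.0, 0.0],
]

degrees_on, means_on = node_metrics(adj, exclude_edges_below_threshold=True)
degrees_off, means_off = node_metrics(adj, exclude_edges_below_threshold=False)
```

With the flag on, a node whose only edges are zero or negative ends up with degree 0 and a NaN mean edge weight.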