USGS-R / drb-estuary-salinity-ml

Creative Commons Zero v1.0 Universal

Testing information theory functions #65

Open galengorski opened 2 years ago

galengorski commented 2 years ago

I'd like to come up with a series of tests to benchmark and get a feel for the information theory functions. In general the goals would be:

  1. Compare results to other methods/code libraries to make sure we are producing similar answers and the functions are technically sound
  2. Create examples that help develop intuition for what these functions tell us and how to interpret their results
  3. Conduct sensitivity testing to understand how choices such as the number of bins and the method for calculating critical values affect the final results
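For goal 1, one self-contained benchmark is to check a histogram-based MI estimate against a case with a known analytic answer: for a bivariate Gaussian with correlation rho, the true MI is -0.5*ln(1 - rho^2). This is a minimal sketch (the `hist_mi` helper is illustrative, not the repo's actual function), assuming MI is measured in nats:

```python
import numpy as np

def hist_mi(x, y, bins=20):
    """Estimate mutual information (nats) from a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x (column vector)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y (row vector)
    nz = pxy > 0                          # avoid log(0) on empty bins
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(42)
n, rho = 100_000, 0.8
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

analytic = -0.5 * np.log(1 - rho**2)   # exact MI for bivariate Gaussian
estimate = hist_mi(x, y)
```

With a large sample the binned estimate should land close to the analytic value; the gap between them is itself a useful measure of binning/estimator bias for goal 3.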
galengorski commented 2 years ago

I would also like to test how different pre-processing approaches affect the calculations: normalization vs. standardization, anomalies, and differences from the day-of-year (DOY) average.
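For reference, the three pre-processing variants could be sketched like this on a synthetic daily series (a sketch only; column/variable names here are illustrative, not the repo's):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2015-01-01", "2019-12-31", freq="D")
# synthetic series: seasonal cycle plus noise
raw = 10 + 5 * np.sin(2 * np.pi * dates.dayofyear / 365.25) + rng.standard_normal(len(dates))
s = pd.Series(raw, index=dates)

# min-max normalization to [0, 1]
normalized = (s - s.min()) / (s.max() - s.min())
# standardization (z-score)
standardized = (s - s.mean()) / s.std()
# anomaly: difference from the day-of-year average
doy_mean = s.groupby(s.index.dayofyear).transform("mean")
doy_anomaly = s - doy_mean
```

Since histogram-based MI depends only on bin membership, normalization and standardization should leave MI nearly unchanged (both are monotonic rescalings), whereas the DOY anomaly removes shared seasonality and could change the answer substantially.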

galengorski commented 2 years ago

Are information theory metrics sensitive to autocorrelation? An idea for testing this from @jds485:

> [9:03 AM] Smith, Jared D: I was thinking that we could numerically test what the significance threshold should be by generating synthetic timeseries with known autocorrelations, computing MI and MIcrit, and determining how much we need to change MIcrit for MI to not be significant. It's not a perfect correction, but it can give a sense for how large the correction could get.
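A numerical test along those lines could look like the sketch below: generate two independent AR(1) series with a known lag-1 autocorrelation, estimate MI between them, and compute MIcrit as a quantile of MI values from shuffled surrogates. The helper names (`ar1`, `hist_mi`, `mi_crit`) and the 95% threshold are assumptions for illustration, not the repo's implementation:

```python
import numpy as np

def ar1(n, phi, rng):
    """Generate an AR(1) series with lag-1 autocorrelation phi."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

def hist_mi(x, y, bins=10):
    """Histogram-based mutual information estimate (nats)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def mi_crit(x, y, rng, n_shuffles=200, alpha=0.95):
    """Critical MI: alpha-quantile of MI after shuffling one series."""
    null = [hist_mi(rng.permutation(x), y) for _ in range(n_shuffles)]
    return float(np.quantile(null, alpha))

rng = np.random.default_rng(1)
# independent series, so any MI above MIcrit is a false positive
x, y = ar1(2000, 0.9, rng), ar1(2000, 0.9, rng)
mi = hist_mi(x, y)
crit = mi_crit(x, y, rng)
```

Repeating this over many seeds and phi values would show how often independent-but-autocorrelated series exceed the shuffle-based MIcrit, i.e., how much the threshold would need to be inflated.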

jds485 commented 2 years ago

Block-shuffling the timeseries, instead of shuffling single observations, could be another way to estimate a more appropriate MIcrit when there is autocorrelation. The block size would need to scale with the strength of the autocorrelation.
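A minimal block-shuffle helper might look like this (the function name and fixed block size are illustrative; in practice the block size would be chosen from the series' decorrelation length):

```python
import numpy as np

def block_shuffle(x, block_size, rng):
    """Shuffle a series in contiguous blocks, preserving
    within-block autocorrelation structure."""
    n = len(x)
    n_blocks = int(np.ceil(n / block_size))
    blocks = [x[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]
    order = rng.permutation(n_blocks)      # permute block order, not samples
    return np.concatenate([blocks[i] for i in order])[:n]

rng = np.random.default_rng(7)
x = np.arange(100, dtype=float)
xs = block_shuffle(x, 10, rng)
```

Substituting `block_shuffle` for the single-observation shuffle in the MIcrit surrogate loop would give a null distribution whose surrogates retain short-range autocorrelation, which should raise MIcrit relative to the naive shuffle.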