neurodata / MGC-paper

MGC: multiscale graph correlation (pronounced "magic")
http://neurodata.io
Apache License 2.0

plot power.hat for (k,l) and power.true for (k,l) in side-by-side heatmaps for all 20 simulations #44

Closed jovo closed 8 years ago

jovo commented 8 years ago

we might not include any of these, but we should definitely look at them before submitting.

cshen6 commented 8 years ago

Hi Joshua,

I have tried out the powers, and I keep improving the scale estimation step...

The method I am using right now:

  1. directly re-samples the upper-diagonal entries of the distance matrices (rather than using kernel density estimation);
  2. then uses half the re-sampled distance matrices for optimal scale estimation;
  3. multiplies the optimal scale for the half data by 2, to yield the optimal scale for the full data.

Note: direct re-sampling improves the permutation test power compared to KDE. Using half the data for scale estimation, rather than the full data, reduces p-value bias.
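The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used for the paper: the function names are hypothetical, `estimate_scale` is a placeholder for whatever optimal-scale estimator is actually used, "half" is read as half the sample size, the same upper-diagonal positions are drawn for both distance matrices so that the pairing between them is preserved, and the doubled scale is capped at n.

```python
import random


def resample_paired(dist_x, dist_y, size, rng):
    """Build two size x size symmetric matrices with zero diagonals.

    Each off-diagonal entry is drawn with replacement from the strictly
    upper-diagonal positions of the originals. The same position is used
    for both matrices (an assumption, to keep the X/Y pairing intact).
    """
    n = len(dist_x)
    positions = [(i, j) for i in range(n) for j in range(i + 1, n)]
    new_x = [[0.0] * size for _ in range(size)]
    new_y = [[0.0] * size for _ in range(size)]
    for a in range(size):
        for b in range(a + 1, size):
            i, j = rng.choice(positions)
            new_x[a][b] = new_x[b][a] = dist_x[i][j]
            new_y[a][b] = new_y[b][a] = dist_y[i][j]
    return new_x, new_y


def full_data_scale(dist_x, dist_y, estimate_scale, rng):
    """Estimate (k, l) on half-size resampled matrices, then double it.

    estimate_scale is a hypothetical callback taking two distance
    matrices and returning a (k, l) pair of neighborhood sizes.
    """
    n = len(dist_x)
    half_x, half_y = resample_paired(dist_x, dist_y, n // 2, rng)
    k_half, l_half = estimate_scale(half_x, half_y)
    # Step 3: scale for the full data is twice the half-data scale
    # (capped at n, which is an assumption of this sketch).
    return min(2 * k_half, n), min(2 * l_half, n)
```

Passing the estimator in as a callback keeps the resampling and doubling logic separate from the scale search itself, so either part can be changed while the procedure is still being tuned.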

The power comparisons are in the plots attached below, for dimension 1 at n=50 and for high-d at n=100.

In the figure, I compare the global mcorr power, the true MGC power by model, and the MGC power estimated from one data set. You can see our procedure did reasonably well, considering we only used one data set for estimation!
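For reference, a power map over scales (k,l) can be computed along these lines. This is a minimal sketch assuming the usual Monte Carlo definition of power (the fraction of statistics simulated under the alternative that exceed the 1 - alpha quantile of statistics simulated under the null); the function names and the dictionary layout are hypothetical and not taken from the MGC code.

```python
import math


def critical_value(null_stats, alpha=0.05):
    """Empirical (1 - alpha) quantile of the null test statistics."""
    ordered = sorted(null_stats)
    idx = math.ceil((1 - alpha) * len(ordered)) - 1
    return ordered[max(0, min(idx, len(ordered) - 1))]


def power_map(null_by_scale, alt_by_scale, alpha=0.05):
    """Map each scale (k, l) to its empirical power.

    Both arguments are dicts keyed by (k, l), holding lists of test
    statistics simulated under the null and under the alternative.
    """
    powers = {}
    for scale, null_stats in null_by_scale.items():
        cutoff = critical_value(null_stats, alpha)
        alt_stats = alt_by_scale[scale]
        # Reject when the statistic is strictly above the cutoff.
        powers[scale] = sum(s > cutoff for s in alt_stats) / len(alt_stats)
    return powers


def best_scale(powers):
    """Scale (k, l) with the highest empirical power."""
    return max(powers, key=powers.get)
```

The resulting dictionary is the kind of grid that a power-vs-(k,l) heatmap would display, and `best_scale` picks out the scale a "true" power curve would be read from.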

Let me know if you have any questions/suggestions regarding the procedure/plots... For the next few days, I will keep trying for a better procedure that is close to unbiased and maintains reasonable power, then update our draft on GitHub before or on the weekend.

--Cencheng

On Thu, Mar 17, 2016 at 11:38 AM, joshua vogelstein < notifications@github.com> wrote:

Assigned #44 https://github.com/jovo/RankdCorr/issues/44 to @cshen6 https://github.com/cshen6.


jovo commented 8 years ago

great! i'd still love to see the heatmaps for power vs (k,l) for your procedure, and compared to the "real" power....


the glass is all full: half water, half air. neurodata.io

cshen6 commented 8 years ago

I remember the heatmaps being slightly different depending on the function type.

But I will show them to you this week, once I have made them for all function types.
