ras44 / gcp_jupyterlab_001


understand rescaled Zreps #2

Open ras44 opened 4 years ago

ras44 commented 4 years ago

We introduce rescaled Zreps using the kappas, gammas:

https://github.com/ras44/gcp_jupyterlab_001/blob/acd216caaa6da2d865c05b874943630abc3c1f7d/mac_run.py#L62-L63

We then calculate the FWER based on the rescaled and shifted Zreps, which give the threshold value above which we reject.

We should compare what we expect for the uncorrelated and perfectly correlated cases.

We then compare the original null FWERs with the FWERs derived from the rescaled and shifted Zreps, using the Gini score. In DP we rescale all Zreps; in DPone we rescale only the first dimension: https://github.com/ras44/gcp_jupyterlab_001/blob/acd216caaa6da2d865c05b874943630abc3c1f7d/mac_run.py#L68

Questions:

Is the theory then that we can reduce the FWER by rejecting only when the measured T^i exceeds the rescaled Zrep?

If so, then how are the rescaled Zreps calculated in practice?

If the rescaling of the Zreps is calculated by sampling the correlated distributions, then in the perfectly correlated case, does it make sense to rescale Zrep to reduce the FWER based on random chance? Don't we know the FWER should be exactly what it is in the 1D case?
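To make the perfectly-correlated intuition concrete, here is a minimal Monte Carlo sketch (not code from the repo; it assumes standard normal test statistics) showing that when all test statistics are identical, rejecting at the single-test threshold already gives the 1D FWER, while independent tests inflate it:

```python
import numpy as np

rng = np.random.default_rng(0)

n_reps = 100_000   # Monte Carlo replications
m = 5              # number of tests
z_crit = 1.6449    # one-sided 5% threshold for a single N(0,1) test

# Perfectly correlated case: every one of the m statistics is the same draw,
# so the maximum over tests is just the single draw itself.
z = rng.standard_normal(n_reps)
fwer_corr = np.mean(np.max(np.tile(z[:, None], (1, m)), axis=1) > z_crit)

# Independent case: m independent draws per replication.
z_ind = rng.standard_normal((n_reps, m))
fwer_ind = np.mean(np.max(z_ind, axis=1) > z_crit)

print(fwer_corr)  # ~0.05: same as a single test, no correction needed
print(fwer_ind)   # ~1 - 0.95**5 = 0.226: inflated without correction
```

This matches the question above: in the perfectly correlated case the uncorrected 1D threshold already controls the FWER at alpha, so any further rescaling only makes the test conservative.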

michaelaclayton commented 4 years ago

Well... I'm really just defining a 'joint' p-value that I can use to decide whether the results of all the single tests, taken together, represent an unusual result. Once I have that and have determined the statistic distributions under the null and alternative, the calculation of the TPR/Gini is standard.

In practice the scaling of the Zreps will be determined through simulation from the data-generating model. (Well, OK, I would simulate to determine the exact distribution under the null and alternative, and do the corresponding CDF calculations.)
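The simulate-then-take-quantiles idea can be sketched as a max-statistic calibration (in the spirit of Westfall-Young resampling). This is an illustrative interpretation, not code from `mac_run.py`, and `simulate_null` is a hypothetical callback standing in for the data-generating model:

```python
import numpy as np

def joint_threshold(simulate_null, n_sims=20_000, alpha=0.05, seed=0):
    """Calibrate a single rejection threshold for the maximum of the
    test statistics by simulating from the null data-generating model.

    `simulate_null(rng)` is a hypothetical callback returning one vector
    of test statistics drawn under the joint null.
    """
    rng = np.random.default_rng(seed)
    maxima = np.array([np.max(simulate_null(rng)) for _ in range(n_sims)])
    # The (1 - alpha) quantile of the simulated max statistic controls
    # the FWER at alpha while accounting for the correlation structure.
    return np.quantile(maxima, 1.0 - alpha)

# Example: two perfectly correlated N(0,1) statistics under the null.
thr = joint_threshold(lambda rng: np.repeat(rng.standard_normal(), 2))
print(thr)  # ~1.64, the single-test 5% threshold, not the Bonferroni one
```

Because the calibration sees the correlation directly, perfectly correlated tests recover the 1D threshold automatically, which is the point made below about perfect correlation.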

Perfect correlation: yes! Perfect correlation (under the null and alternative) means that the two tests are giving us exactly the same information, so you should end up with exactly the same results as if you only ran one test, which you do here.
In Bonferroni's world you don't (can't?) know this, and you adjust the thresholds anyway...

The DP/Gini is just a way to arrive at a single number to use when talking about the discriminatory power of a test. Not strictly necessary, but I kind of like it... TPR is probably enough.

I should be writing this up (somewhat) more carefully shortly; possibly it will sound more coherent, possibly not. I will share.

ras44 commented 4 years ago

I keep starting responses and then not finishing them. If you end up writing it up, I'd be interested in checking it out. I think I'm missing how things would work in practice: from the data-generating model, to the estimation of the correlation, to the output of whether a particular measurement is classified as coming from the null or alternative distribution. There has to be an individual-test-level classification for each set of measurements. It would be interesting to simulate data with some simple model and then see how the FWER comes out using no p-value rescaling, Bonferroni rescaling (as a comparison), and then this new method.
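A minimal version of the first two legs of that comparison (no rescaling vs. Bonferroni), assuming equicorrelated Gaussian test statistics rather than the repo's actual data-generating model, could look like:

```python
import numpy as np

def fwer(rho, threshold, m=2, n_reps=50_000, seed=0):
    """Monte Carlo FWER for m equicorrelated N(0,1) statistics under the
    joint null, rejecting any test whose statistic exceeds `threshold`."""
    rng = np.random.default_rng(seed)
    # Equicorrelation matrix: 1 on the diagonal, rho off-diagonal.
    cov = np.full((m, m), rho) + (1.0 - rho) * np.eye(m)
    z = rng.multivariate_normal(np.zeros(m), cov, size=n_reps)
    # A family-wise error occurs if any of the m tests rejects.
    return np.mean(np.max(z, axis=1) > threshold)

alpha, m = 0.05, 2
z_single = 1.6449   # per-test 5% one-sided threshold, no correction
z_bonf = 1.9600     # Bonferroni: per-test alpha/m = 2.5% threshold

for rho in (0.0, 0.5, 1.0):
    print(rho, fwer(rho, z_single), fwer(rho, z_bonf))
```

At rho = 0 the uncorrected FWER is inflated to about 1 - 0.95^2 = 0.0975 and Bonferroni brings it back near 0.05; at rho = 1 the uncorrected threshold already gives 0.05 while Bonferroni over-corrects to about 0.025. The proposed method would be a third column in this table.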

michaelaclayton commented 4 years ago

Yeah, me too. I have most of it written, but when I try to finish it off I find that I've tried to put two things together that don't really fit. And separately they aren't sufficiently interesting... I'll let you know when I've sufficiently mangled it...
