general ash - Githubissues

stephens999 / ashr

An R package for adaptive shrinkage

GNU General Public License v3.0

79 stars 35 forks source link

general ash #26

Open stephens999 opened 8 years ago

stephens999 commented 8 years ago

@mengyin can you take a look at the code I've got in my general branch? I think it incorporates much, but not yet all, of what you had in your branch, but without all the repetition. It seems to run OK. There are still some things to do, but thought it would be useful for you to look and see if, for example, the results for logF errors look similar to yours. (note that for now I think df1 and df2 have to be the same for all observations...; changing this should not be too hard, but not done yet)

To see how to run it, see my test in tests/testthat/test_logF.R set.seed(10) e = rnorm(100)+log(rf(100,df1=10,df2=10)) e.ash = ash(e,1,lik=logF_lik(df1=10,df2=10)) s = sd(log(rf(10000,10,10))) # get approximate standard deviation by simulation e.ash.norm = ash(e,s) expect_equal(e.ash.norm$PosteriorMean,e.ash$PosteriorMean,tol=0.05)

and also the examples under ?ash.workhorse

mengyin commented 8 years ago

Yes the general branch's results look very similar to my old results! Anyway in my current simulations df1 and df2 are always the same for all observations, so it is definitely fine...

stephens999 commented 8 years ago

that's good. A few things to resolve:

you wrote a replacement to my "gradient" function that seems to be a fix to a divide by zero problem. I guess you got some errors that caused this problem, and we could just replace my function with yours?
still need to deal with the autoselect issue, and work out how to let the user specify a way to select a grid that depends on the likelihood. Maybe the autoselect method should be specified when specifying likelihood (or a default supplied)
may want to allow the user to just specify pdf of likelihood, and compute cdf by integrate? This might be a future feature.
I have an implementation that allows each observation to have its own likelihood about to be pushed to the general-multilik branch, but I'm not 100% happy with the way I did it... still thinking about it.

mengyin commented 8 years ago

Yes I got errors sometimes with the old "gradient" function, replacing it by my new gradient function fixed the problem divide by zero and didn't affect results in general.
For the autoselect function, my first thought was choosing the grid based on the shape of error distribution...but computing the distribution's moments itself can be non-trivial (numerical integration is unstable). Anyway I guess it's no big deal as long as the chosen grid's range is wide enough.
Yes it'd be great if the user just need specify pdf (or just cdf and compute pdf by numerical differentiation? might be unstable though)!

stephens999 commented 8 years ago

for autoselect, maybe we could use the cdf instead of the moments? I only used moments in the normal case as a shorthand for making sure the grid spanned "plausible" values.

mengyin commented 8 years ago

I see..say, use the values where the cdf is almost 1 (or 0) to decide the "plausible" range?