pcarbo / varbvs

Large-scale Bayesian variable selection for R and MATLAB.
http://pcarbo.github.io/varbvs
GNU General Public License v3.0
42 stars 14 forks source link

keep same output for confint whether one or several variables are given #23

Closed timflutre closed 6 years ago

timflutre commented 6 years ago

In varbvs::confint(object, parm, level), if parm is a single number, then n = length(parm) = 1 and the following code is executed:

out <- unlist(out, recursive = FALSE)

As a result, the names of the output aren't intelligible anymore (see example below).

To solve this issue, the line above shouldn't be executed, or the names should be improved in the case n=1.

Here is an example:

confint(fit.varbvs, 9058, 0.95)
 chr3:80748_A/C1  chr3:80748_A/C2  chr3:80748_A/C3  chr3:80748_A/C4 
         0.08009406          0.08081136          0.08131937          0.08162632 
 chr3:80748_A/C5  chr3:80748_A/C6  chr3:80748_A/C7  chr3:80748_A/C8 
         0.08180248          0.08189056          0.08190661          0.08186117 
 chr3:80748_A/C9 chr3:80748_A/C10 chr3:80748_A/C11 chr3:80748_A/C12 
         0.08176735          0.08162886          0.08431831          0.08408115 
chr3:80748_A/C13 chr3:80748_A/C14 chr3:80748_A/C15 chr3:80748_A/C16 
         0.08428320          0.09257254          0.09220402          0.09070721 
chr3:80748_A/C17 chr3:80748_A/C18 chr3:80748_A/C19 chr3:80748_A/C20 
         0.09028690          0.08549337          0.09002682          0.08987151 
chr3:80748_A/C21 chr3:80748_A/C22 chr3:80748_A/C23 chr3:80748_A/C24 
         0.08139016          0.17967781          0.17898309          0.17824020 
chr3:80748_A/C25 chr3:80748_A/C26 chr3:80748_A/C27 chr3:80748_A/C28 
         0.17753963          0.17685822          0.17613909          0.17533153 
chr3:80748_A/C29 chr3:80748_A/C30 chr3:80748_A/C31 chr3:80748_A/C32 
         0.17439653          0.17330633          0.17201581          0.16785200 
chr3:80748_A/C33 chr3:80748_A/C34 chr3:80748_A/C35 chr3:80748_A/C36 
         0.16500887          0.16369999          0.16677399          0.16449955 
chr3:80748_A/C37 chr3:80748_A/C38 chr3:80748_A/C39 chr3:80748_A/C40 
         0.15772096          0.15502424          0.14646315          0.14710368 
chr3:80748_A/C41 chr3:80748_A/C42 
         0.14390752          0.17766431 

Compare to the case where two variables are given:

confint(fit.varbvs, c(1,9058), 0.95)
$`chr1:27655_T/C`
               2.5 %     97.5 %
theta_1  -0.07091266 0.01740136
theta_2  -0.06692214 0.02013984
theta_3  -0.06326733 0.02268549
theta_4  -0.06048395 0.02457555
theta_5  -0.05836900 0.02593022
theta_6  -0.05663844 0.02694520
theta_7  -0.05511396 0.02773964
theta_8  -0.05368315 0.02838203
theta_9  -0.05228403 0.02889813
theta_10 -0.05092149 0.02923979
theta_11 -0.05282215 0.02126187
theta_12 -0.04904717 0.02272664
theta_13 -0.04670784 0.02372726
theta_14 -0.04631619 0.01949421
theta_15 -0.04405770 0.02006414
theta_16 -0.03190869 0.02752998
theta_17 -0.03087430 0.02654772
theta_18 -0.03085270 0.02322960
theta_19 -0.02747159 0.02316019
theta_20 -0.02546096 0.02247701
averaged -0.06257837 0.02412627

$`chr3:80748_A/C`
              2.5 %    97.5 %
theta_1  0.08009406 0.1796778
theta_2  0.08081136 0.1789831
theta_3  0.08131937 0.1782402
theta_4  0.08162632 0.1775396
theta_5  0.08180248 0.1768582
theta_6  0.08189056 0.1761391
theta_7  0.08190661 0.1753315
theta_8  0.08186117 0.1743965
theta_9  0.08176735 0.1733063
theta_10 0.08162886 0.1720158
theta_11 0.08431831 0.1678520
theta_12 0.08408115 0.1650089
theta_13 0.08428320 0.1637000
theta_14 0.09257254 0.1667740
theta_15 0.09220402 0.1644996
theta_16 0.09070721 0.1577210
theta_17 0.09028690 0.1550242
theta_18 0.08549337 0.1464631
theta_19 0.09002682 0.1471037
theta_20 0.08987151 0.1439075
averaged 0.08139016 0.1776643

The difference forces user code to adapt to whether n=1 or not, to know how to retrieve the averaged CIs.

pcarbo commented 6 years ago

Thanks @timflutre. Should be fixed with the latest commit (61d5173). Please re-open if you still have issues.

timflutre commented 6 years ago

I would prefer to keep the same exact output structure, i.e. a list, whatever the length of parm (n). Whereas, with your latest commit, it's much better than before, but the output is a list (n > 1) or a matrix (n == 1). In the latter case, why not keep a list, even if it has only a single component? This way, the user code can use the following command to extract the averaged CI whatever the value of n:

CIs <- confint(fit.varbvs, idx, 0.95)
CIs <- do.call(rbind, lapply(CIs, function(x) x[nrow(x), ]))
timflutre commented 6 years ago

Moreover, using a list as output whatever the value of n would solve a current bug in summary.varbvs. Indeed, in the current code, summary.varbvs returns an error when retrieving the averaged CIs when n == 1 (which can happen when pip.cutoff is used...):

> for(i in ...){
   fit.varbvs <- varbvs(...)
   print(summary(fit.varbvs))
}
Error in x[ns + 1, ] : incorrect number of dimensions
> traceback()
7: FUN(X[[i]], ...)
6: lapply(CIs, function(x) x[ns + 1, ])
5: do.call(rbind, lapply(CIs, function(x) x[ns + 1, ]))
4: summary.varbvs(fit.varbvs)
3: summary(fit.varbvs)
2: summary(fit.varbvs)
1: print(summary(fit.varbvs))
pcarbo commented 6 years ago

@timflutre I think the latest commit (4add2bc) addresses these issues. Thank you for pointing them out. I think I still could address these issues more carefully, but this fix is good enough for now.

pcarbo commented 6 years ago

There was another lingering bug… this should now be fixed in version 2.5-6 (commit c24c17e).

timflutre commented 6 years ago

Perfect, many thanks Peter!