statdivlab / corncob

Count Regression for Correlated Observations with the Beta-binomial
102 stars 22 forks source link

extract model data #113

Open Midnighter opened 3 years ago

Midnighter commented 3 years ago

When I started using differentialTest, I was struggling a bit to extract more information beyond the p-values that are exposed on the result object. I then looked at the code for plot.differentialTest, copied that, and modified it for my purposes. Specifically, I was looking for the differential abundance values and standard errors for all taxa.

I think it would be great to expose the code in plot.differentialTest that creates a data frame as its own function or to include more information on the result object directly.

The table I ended up with looks like the following:

tax_id diff_abundance std_err p_value p_adjusted variable
bryandmartin commented 3 years ago

Thanks, this is a great and very reasonable feature request. I'll implement this soon.

bryandmartin commented 3 years ago

Hi @Midnighter ,

Thanks for your patience on this, I'm looking at it now. Could you please let me know what you are looking for in the diff_abundance column here?

Midnighter commented 3 years ago

A log fold change would be most intuitive, I think. Or some other measure of effect size.

kalen-rasmussen commented 2 years ago

I was wondering if a function was ever implemented into corncob to extract a data frame like the one outlined above? The plot.differentialTest function does not seem to be implemented in my version 0.2.0 currently and I couldn't find a function that seemed correct in the vignette.

jorondo1 commented 1 year ago

I am trying to extract the adjusted p values associated with each model's taxa. The Pr(>|t|) value in the models doesn't seem to match either the $p or $p_fdr associated with my taxa so I'm not sure I'm looking at the right thing.

I feel like the p values are not coherent here, but I'm not sure if I'm reading this right? Sorry if that's actually a separate issue, however @Midnighter 's request would definitely be useful and solve this for me.

> CnT1_DA$significant_models[[1]]

Call:
bbdml(formula = formula_i, phi.formula = phi.formula, data = data_i, 
    link = link, phi.link = phi.link, inits = inits)

Coefficients associated with abundance:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)  148.603     49.946   2.975  0.00347 **
CnT1         -21.511      6.974  -3.084  0.00247 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Coefficients associated with dispersion:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)  181.238     58.838   3.080  0.00251 **
CnT1         -25.833      8.217  -3.144  0.00205 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Log-likelihood: -985.39
> (tax <- CnT1_DA$significant_taxa)
 [1] "Akkermansia"       "Bacteroides"       "Blautia"           "Butyrivibrio"     
 [5] "Clostridium"       "Coprobacter"       "Enorma"            "Eubacterium"      
 [9] "Fusobacterium"     "Gardnerella"       "Hungatella"        "Klebsiella"       
[13] "Lachnoclostridium" "Morganella"        "Oscillibacter"     "Proteus"          
[17] "Roseburia"         "Sellimonas"       

> CnT1_DA$p_fdr[tax[1]]
Akkermansia 
 0.03104955 

> CnT1_DA$p[tax[1]]
Akkermansia 
0.006733638