statdivlab / corncob

Count Regression for Correlated Observations with the Beta-binomial

Change the baseline #122

Closed eveistswp closed 2 years ago

eveistswp commented 3 years ago

Hi there,

Firstly thanks for creating this, very timely for me!

I have discrete categories (species) whose gut microbiota I am trying to compare. It seems the default is to pick the first category listed in the variable as the 'baseline' against which abundance and variability are compared. Is there any way to change this without reformatting the order of my variables in the phyloseq object?

Thanks! Evie

bryandmartin commented 3 years ago

Hi Evie @eveistswp ,

Yes, fortunately this is very simple to do!

I recommend using either base R relevel() or forcats fct_relevel(). Check out the examples in the documentation here! https://forcats.tidyverse.org/reference/fct_relevel.html

Bryan
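
A minimal sketch of the base R route, using a made-up data frame in place of the phyloseq sample data. The column name Sp_Biome is borrowed from later in this thread; the level names are hypothetical.

```r
# Hypothetical sample metadata; the level names are invented for illustration.
meta <- data.frame(
  Sp_Biome = factor(c("Arid", "Forest", "Grassland", "Forest", "Arid"))
)

# By default the levels are in sorted (alphabetical) order.
levels(meta$Sp_Biome)

# relevel() moves the chosen level to the front; the others keep their order.
meta$Sp_Biome <- relevel(meta$Sp_Biome, ref = "Forest")

# "Forest" is now the first level, i.e. the reference category.
levels(meta$Sp_Biome)
```

With forcats installed, forcats::fct_relevel(meta$Sp_Biome, "Forest") should give the same ordering.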

eveistswp commented 3 years ago

Hi Bryan,

I tried this (actually, I cheated and split and re-merged the phyloseq objects). However, this has not solved the problem. Something seems to be 'choosing' the baseline category to be the one that comes first alphabetically, regardless of the order the categories are in the OTU table of the phyloseq object. Is there a way to avoid this, given that re-ordering in the phyloseq object has not solved it?

Thanks, Evie
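
One possible explanation, not confirmed anywhere in this thread: if the grouping variable is stored as a character column, or the factor is rebuilt when objects are split and merged, R assigns the levels in sorted order no matter how the rows are arranged. A small sketch with hypothetical species labels:

```r
# Hypothetical species labels, deliberately not in alphabetical order.
species <- c("Zebra", "Ant", "Moth", "Zebra", "Ant")

# Converting a character vector to a factor sorts the levels by default,
# regardless of the order in which the values appear.
f_default <- factor(species)
levels(f_default)

# Supplying `levels` explicitly fixes the order, and with it the first level.
f_explicit <- factor(species, levels = c("Zebra", "Ant", "Moth"))
levels(f_explicit)
```

If this is what is happening, reordering rows or samples would not change the baseline; only the level order of the factor would.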

bryandmartin commented 3 years ago

Hi @eveistswp ,

Thanks for letting me know, re-opening this. Is there any way you would be able to send the line of code that is causing this issue? Either here or privately in an email.

For your sake, a temporary and admittedly hack-y workaround is to just rename the one you want to be baseline to make it first alphabetically. That being said, I still want to fix this, but for that I need to figure out why it's choosing the baseline for you.

Bryan

eveistswp commented 3 years ago

Hi Bryan,

I'm using a phyloseq object as a dataset and modifying the commands from your vignettes etc. (I'm not au fait with modelling, so it's not anything fancy).

An example of the command I am trying to run is:

da_analysis <- differentialTest(formula = ~ Sp_Biome, phi.formula = ~ Sp_Biome, formula_null = ~ 1, phi.formula_null = ~ Sp_Biome, test = "LRT", boot = FALSE, data = aridcore, fdr_cutoff = 0.05)

Thanks!

gcuster1991 commented 2 years ago

Hi @eveistswp, I hope you have figured out your issue by now. If not, I used the following to reorder my factors. By default, they plotted in alphabetical order. This should allow you to reorder them in phyloseq without having to split and remake your object.

sample_data(ps_object)$Factor_of_Interest <- factor(sample_data(ps_object)$Factor_of_Interest, levels = c("Level_2", "Level_3", "Level_5", "Level_1", "Level_4"))
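
To check that a reordering like this changes the baseline, one option is to look at the design matrix R builds from the factor. The sketch below uses a plain data frame as a hypothetical stand-in for sample_data(ps_object) and base R's model.matrix(); that corncob constructs its design matrix the same way is an assumption here, not something stated in this thread.

```r
# Hypothetical metadata standing in for sample_data(ps_object).
meta <- data.frame(
  Factor_of_Interest = c("Level_1", "Level_2", "Level_3", "Level_2", "Level_1")
)

# Same pattern as above: rebuild the factor with an explicit level order.
meta$Factor_of_Interest <- factor(
  meta$Factor_of_Interest,
  levels = c("Level_2", "Level_3", "Level_1")
)

# With R's default treatment contrasts, the first level is absorbed into the
# intercept, so it gets no column of its own in the design matrix.
design <- model.matrix(~ Factor_of_Interest, data = meta)
colnames(design)
```

Here Level_2 has no column of its own, which indicates it is the reference category that the other levels are compared against.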

bryandmartin commented 2 years ago

Thank you @gcuster1991 !