Let's tackle differential abundance, as it will be one of the priorities for this work. I suggest starting with the R code at the line linked below, from one of my past projects, where the fitZig function of the metagenomeSeq package was used. It fits a zero-inflated Gaussian model to find differentially abundant features. We worked with scaled data for this, and we kept taxon data (e.g. what was classified as genus only) separate from clade data (e.g. what was classified as genus and species).
What are the contrasts (i.e., the pairwise comparisons)? For example, unexposed vs. exposed?
What is our random effect variable? For example, could it be time point, or something else? In our AMR studies we used Location as the random effect variable for some analyses comparing sample types, and no random effect when we compared Locations.
Are we going to use only bacterial information for this? I assume so, since this is what we did in our past study (I believe we had filtered for Bacteria).
In that past work we used scaled data, but we had separated taxon data (e.g. what was classified as genus only) and clade data (e.g. what was classified as genus and species).
https://github.com/ropolomx/one_health_continuum/blob/e240e583f90319d8edbd870910860ab5e6083b5e/bcrc_comparative_analysis.R#L2194
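
To make the contrast and random effect questions concrete, here is a minimal sketch of how a design matrix and a pairwise contrast could be set up with limma's makeContrasts. The metadata table, its column names (exposure, location) and its levels are hypothetical placeholders, not taken from the linked script. The assumption is that a design like this would then be passed as the `mod` argument of fitZig, with the blocking variable supplied only for analyses that use a random effect.

```r
library(limma)  # provides makeContrasts()

# Hypothetical sample metadata; column names and levels are placeholders.
meta <- data.frame(
  sample   = paste0("S", 1:6),
  exposure = factor(
    c("unexposed", "exposed", "unexposed", "exposed", "unexposed", "exposed"),
    levels = c("unexposed", "exposed")
  ),
  location = factor(c("A", "A", "B", "B", "C", "C"))
)

# One design column per group (no intercept), so contrasts can be written by name.
design <- model.matrix(~ 0 + exposure, data = meta)
colnames(design) <- levels(meta$exposure)

# Pairwise comparison: exposed minus unexposed.
contrast_matrix <- makeContrasts(
  exposed_vs_unexposed = exposed - unexposed,
  levels = design
)

# Candidate random effect (blocking variable); it would be left out
# when the comparison of interest is between locations themselves.
block <- meta$location

print(design)
print(contrast_matrix)
```

Once we agree on the groups, more comparisons can be added as further named arguments to makeContrasts, one per pairwise contrast.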