transcript / samsa2

SAMSA pipeline, version 2.0. An open-source metatranscriptomics pipeline for analyzing microbiome data, built around DIAMOND and customizable reference databases.
GNU General Public License v3.0

diversity_stats.R #7

Closed seb951 closed 6 years ago

seb951 commented 6 years ago

Currently, lines 109-114 only make sense if you have 12 experimental and 12 control samples. It would make more sense to parse the groups more generally, for example:

divShannon_exp <- mean(diversity(flipped_complete_table[grep("exp", rownames(flipped_complete_table)), ], index = "shannon"))

You just need to make sure your sample names contain "exp" or "control", depending on the group. (Note that I also had to modify the "GET FILE NAME" section, because it currently does not parse this information properly.)
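As a rough illustration of the name-based selection suggested above, here is a minimal sketch in base R. The count table, the sample names, and the helper functions shannon() and group_mean_shannon() are all hypothetical and not taken from diversity_stats.R; shannon() is a base-R stand-in for vegan's diversity(x, index = "shannon"), so that the example runs without extra packages.

```r
# Hypothetical count table: rows are samples, columns are taxa.
counts <- matrix(
  c(10, 10, 10, 10,
    20,  0, 20,  0,
     5,  5,  5,  5,
    40,  0,  0,  0,
     8,  8,  0,  0),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("control_1", "control_2", "exp_1", "exp_2", "exp_3"),
    c("taxonA", "taxonB", "taxonC", "taxonD")
  )
)

# Shannon index with natural log (stand-in for vegan::diversity).
shannon <- function(x) {
  p <- x[x > 0] / sum(x)   # proportions of the non-zero taxa
  -sum(p * log(p))
}

# Mean Shannon diversity over all samples whose name contains `pattern`.
group_mean_shannon <- function(tab, pattern) {
  rows <- grepl(pattern, rownames(tab), fixed = TRUE)
  if (!any(rows)) stop("no sample names contain: ", pattern)
  # drop = FALSE keeps a matrix even if only one sample matches
  mean(apply(tab[rows, , drop = FALSE], 1, shannon))
}

divShannon_control <- group_mean_shannon(counts, "control")
divShannon_exp     <- group_mean_shannon(counts, "exp")
```

This works for any number of samples per group (here 2 controls and 3 experimentals). One caveat: a plain substring match would also catch a control sample whose name happens to contain "exp", so a stricter pattern may be safer depending on the naming scheme.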

Finally, if you want this script to work with the Subsystems_results directory, you should modify lines 66, 70, 84, and 88 to add a "fill = T" argument, e.g.:

control_table <- read.table(file = x, header = F, quote = "", sep = "\t", fill = T)

This would take care of rows that do not have a hierarchy ("NO HIERARCHY") and therefore have fewer fields than the other rows. The same change is probably needed in other scripts too.
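To show what the fill argument changes, here is a small self-contained sketch. The three-line file and its column layout are made up for illustration and may not match the real Subsystems output; only the read.table() arguments come from the suggestion above.

```r
# Hypothetical tab-separated results; the "NO HIERARCHY" row has fewer fields.
lines <- c(
  "12.5\t250\tCarbohydrates\tCentral metabolism\tGlycolysis",
  "7.5\t150\tNO HIERARCHY",
  "5.0\t100\tProtein Metabolism\tBiosynthesis\tRibosome"
)
path <- tempfile(fileext = ".tsv")
writeLines(lines, path)

# Without fill, read.table() stops on the short row; capture that as NULL.
strict <- tryCatch(
  read.table(path, header = FALSE, quote = "", sep = "\t"),
  error = function(e) NULL
)

# With fill = TRUE, the short row is padded with blank fields instead.
filled <- read.table(path, header = FALSE, quote = "", sep = "\t", fill = TRUE)

unlink(path)
```

Here strict is NULL because the read fails, while filled has three rows and five columns. Note that read.table() works out the number of columns from the first five lines of input, so if longer rows only appear later in a file, passing col.names explicitly may also be needed.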

transcript commented 6 years ago

Updated and fixed, thank you!