Closed AbhishakeL closed 6 years ago
I don't have information on how all of these files were built (all of them except the constraint alignment were used in the original PICRUSt paper), but I can explain what they contain:
gg_13_5_img_16S_counts.txt
- counts of 16S copies per reference genome (where the reference genomes are only those that overlap with Greengenes)
gg_13_5_img_fixed.txt
- mapfile of Greengenes ids to IMG genome ids.
gg_13_5_img_subset.fasta
- 99 OTUs from Greengenes - only those overlapping with IMG.
img_400_ko.tab
- KEGG ortholog abundances in IMG genomes (note that not all of these overlap with IMG)
99_otus_IMG_pruned_no_names_constraint.txt
- constraint alignment of Greengenes OTUs overlapping with IMG genomes, which can be used with FastTree to keep a certain core topology. This was made by following the instructions here: http://meta.microbesonline.org/fasttree/constrained.html
Hopefully that helps!
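For anyone reading later, here is a minimal sketch of what a FastTree constraint alignment looks like. It is not how the PICRUSt file was actually produced (that is not documented in this thread). It assumes the format described in the FastTree documentation, where each column is one split and each sequence carries a 1 if it is inside that clade and a 0 otherwise. The OTU ids, the clades, and both helper functions are hypothetical and only for illustration.

```python
def clades_to_constraints(taxa, clades):
    """Build one 0/1 string per taxon, with one column per constrained clade."""
    rows = {}
    for taxon in taxa:
        # '1' if the taxon belongs to the clade for this column, else '0'
        rows[taxon] = "".join("1" if taxon in clade else "0" for clade in clades)
    return rows


def to_fasta(rows):
    """Serialize the constraint rows as FASTA text."""
    return "".join(f">{name}\n{bits}\n" for name, bits in rows.items())


# Hypothetical OTU ids and clades (not taken from the Greengenes files).
taxa = ["otu_a", "otu_b", "otu_c", "otu_d"]
clades = [
    {"otu_a", "otu_b"},           # column 1: a and b must group together
    {"otu_a", "otu_b", "otu_c"},  # column 2: a, b and c must group together
]

constraint_fasta = to_fasta(clades_to_constraints(taxa, clades))
print(constraint_fasta)
```

A file like this would then be passed to FastTree through its `-constraints` option alongside the real sequence alignment. The exact command line used for the original file is not known here.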
Thanks a lot, Gavin. Actually, I have access to the latest KEGG database and so thought of updating the database before running PICRUSt. I found some hints in this thread https://groups.google.com/forum/?hl=en#!starred/picrust-users/0y7RSOMsm1o but I am still trying to figure out the other files. Is there any SOP you know about?
That link isn't working for me, unfortunately. I don't know of any SOP to make those files, sorry! Just so you know, I am working on a beta version of PICRUSt2, which has updated genomes and functions from IMG. PICRUSt2 is still being actively developed and is available here: https://github.com/picrust/picrust2
Thank you for the "PICRUSt Tutorial with de novo Variants" tutorial. Can you please explain a little how the files in the "img_gg_starting_files" folder were built?