audy / richardson-2014

Analysis code for the DIPP microbiome study
http://dipp.utu.fi

Run sequences through QIIME 1.8 #32

audy closed 10 years ago

audy commented 10 years ago

This issue could go in https://github.com/audy/richardson-2014-data but I'm choosing to keep it here.

This recent paper suggests using de novo clustering, but that's impossible given the size of the dataset (2 Illumina HiSeq runs).

Trying to match these methods as closely as possible:

Analysis, Optimization and Verification of Illumina-Generated 16S rRNA Gene Amplicon Surveys. PLoS One, 2014.

Prerequisites

What are the quality score formats (Phred+33 or Phred+64) for the 3 HiSeq runs?
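One rough way to answer this is to look at the range of ASCII codes in the quality lines of each run's FASTQ files. The sketch below is a heuristic, not part of this repo or of QIIME: the function names are made up for illustration, it assumes strict 4-line FASTQ records, and it assumes Phred+33 Illumina data does not go above `J` (Q41). It returns `None` when the sample is ambiguous.

```python
import io
from itertools import islice


def iter_quality_lines(handle, max_records=10000):
    """Yield the quality line of each FASTQ record, up to max_records records.

    Assumes strict 4-line records (header, sequence, '+', quality).
    """
    for i, line in enumerate(islice(handle, max_records * 4)):
        if i % 4 == 3:
            yield line.rstrip("\r\n")


def guess_phred_offset(quality_strings):
    """Guess the quality offset (33 or 64) from observed ASCII codes.

    Returns None if nothing was read or the observed range fits both encodings.
    """
    lowest = None
    highest = None
    for quality in quality_strings:
        for ch in quality:
            code = ord(ch)
            if lowest is None or code < lowest:
                lowest = code
            if highest is None or code > highest:
                highest = code
    if lowest is None:
        return None
    if lowest < 59:
        # Characters this low cannot occur in the +64 encodings.
        return 33
    if highest > 74:
        # Above 'J' would mean Q > 41 under +33, so +64 is the likelier reading.
        return 64
    return None  # only mid-range characters seen; sample more reads


if __name__ == "__main__":
    example = io.StringIO("@read1\nACGT\n+\nII#I\n")
    print(guess_phred_offset(iter_quality_lines(example)))
```

Running it on the first few thousand reads of each run should be enough, since low-quality bases at read ends usually expose the offset quickly.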

Steps

  1. Demultiplex using in-house scripts
  2. Pick OTUs with pick_otus_closed_ref.py, modified to run usearch_local instead of usearch_global, with --min_query_cov 0.95 to compensate for using local alignments.
  3. Re-assign taxonomies using Kraken and a modified NCBI database (with additional Bacteroides genomes added).
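The reasoning behind the coverage threshold in step 2: a local alignment can report high identity over only a short stretch of a read, so requiring most of the read to be aligned brings the result closer to what a global alignment would accept. A minimal sketch of that filter follows, assuming coverage is the fraction of query positions inside the alignment. The function and parameter names (`query_coverage`, `qlo`, `qhi`, `query_length`) are illustrative only and are not the modified script's or usearch's interface.

```python
def query_coverage(qlo, qhi, query_length):
    """Fraction of the query covered by an alignment spanning qlo..qhi.

    Coordinates are 1-based and inclusive (an assumption of this sketch).
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if not 1 <= qlo <= qhi <= query_length:
        raise ValueError("alignment coordinates fall outside the query")
    return (qhi - qlo + 1) / query_length


def passes_coverage(qlo, qhi, query_length, min_query_cov=0.95):
    """True if the local alignment covers at least min_query_cov of the query."""
    return query_coverage(qlo, qhi, query_length) >= min_query_cov


if __name__ == "__main__":
    # A 100 bp read aligned over positions 3..100 is kept;
    # the same read aligned only over 31..100 is rejected.
    print(passes_coverage(3, 100, 100))
    print(passes_coverage(31, 100, 100))
```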