audy / richardson-2014

Analysis code for the DIPP microbiome study

http://dipp.utu.fi

Run sequences through QIIME 1.8 #32

Closed audy closed 10 years ago

audy commented 10 years ago

This issue could go in https://github.com/audy/richardson-2014-data but I'm choosing to keep it here.

This recent paper suggests using de novo clustering, but that's impossible given the size of the dataset (2 Illumina HiSeq runs).

Trying to match these methods as closely as possible:

Analysis, Optimization and Verification of Illumina-Generated 16S rRNA Gene Amplicon Surveys. PLoS One, 2014.

Prerequisites

[x] QIIME mapping file(s)
[x] Properly-formatted FASTQ files (https://github.com/audy/import-sequences)
  [x] HISEQ 1
  [x] HISEQ 2
  [x] HISEQ 3

What are the quality score formats for the 3 HISEQ runs?
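
One rough way to answer this is to look at the ASCII range of the quality characters in each run. The sketch below is a heuristic, not a definitive check: it assumes plain 4-line FASTQ records (no wrapped sequence lines), and the function name `guess_phred_offset` and the cutoffs (characters below 59 imply Phred+33; a minimum of 59 or more with a maximum above 74 implies a +64 encoding) are illustrative assumptions based on the commonly cited encoding ranges.

```python
def guess_phred_offset(fastq_lines, max_records=10000):
    """Guess the quality encoding from the ASCII range of quality characters.

    Assumes 4-line FASTQ records; the 4th line of each record is the quality string.
    Returns "phred33", "phred64", "ambiguous", or "unknown" (no quality data seen).
    """
    lo, hi = None, None
    for i, line in enumerate(fastq_lines):
        if i // 4 >= max_records:
            break  # only sample the first max_records records
        if i % 4 != 3:
            continue  # skip header, sequence, and '+' lines
        codes = [ord(c) for c in line.rstrip("\r\n")]
        if not codes:
            continue
        lo = min(codes) if lo is None else min(lo, min(codes))
        hi = max(codes) if hi is None else max(hi, max(codes))

    if lo is None:
        return "unknown"
    if lo < 59:
        # Characters this low cannot occur in a +64 encoding.
        return "phred33"
    if hi > 74:
        # High characters with no low ones suggest a +64 encoding.
        return "phred64"
    return "ambiguous"
```

Usage would be something like `guess_phred_offset(open("run1.fastq"))` for each run, where `run1.fastq` is a placeholder file name. An "ambiguous" result means the sampled reads fall in the range where both encodings overlap, so more reads (or the sequencer's documentation) would be needed.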

Steps

Demultiplex using in-house scripts
Use pick_otus_closed_ref.py, modified to use usearch_local instead of usearch_global, with --min_query_cov 0.95 to compensate for using local alignments.
Re-assign taxonomies using Kraken and a modified NCBI database (added more Bacteroides genomes).
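
To illustrate why a query-coverage cutoff compensates for local alignment: a local alignment can score well while covering only a fragment of the read, so requiring that at least 95% of the query is aligned brings the hit closer to what a global alignment would accept. The sketch below is not the modified QIIME script; the function names and the use of 1-based inclusive query coordinates are assumptions for illustration only.

```python
def query_coverage(qlo, qhi, query_len):
    """Fraction of the query covered by a local alignment.

    qlo and qhi are assumed to be 1-based, inclusive alignment
    coordinates on the query; query_len is the full read length.
    """
    if query_len <= 0:
        raise ValueError("query_len must be positive")
    return (qhi - qlo + 1) / query_len


def passes_min_query_cov(qlo, qhi, query_len, min_query_cov=0.95):
    """Return True if the hit covers at least min_query_cov of the query."""
    return query_coverage(qlo, qhi, query_len) >= min_query_cov
```

For a 100 bp read, a hit aligned over positions 3 to 97 covers 95% of the query and would be kept, while a hit aligned over positions 11 to 100 covers only 90% and would be dropped.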