Open gregcaporaso opened 11 years ago
Also, this would be a great thing to work on for someone who is new to QIIME development and looking for a challenge that is bigger than something tagged as "quick fix" but still relatively stand-alone.
Anyone interested in working on this for QIIME 1.8.0?
I think this script can be classified as an easy intro to QIIME development, in case someone is interested, perhaps: @cuttlefishh @JWDebelius @amnona
We should also allow selecting a percentage of the sequences from the input file.
To help with debugging weird sequencing results, we should develop a new function that BLASTs randomly selected sequences from a fasta file against nr and creates a graphical summary accessible via an html file. This should work in a couple of different modes: (1) by selecting sequences completely at random from fasta or fastq (so it would work prior to demultiplexing), and (2) by selecting n sequences from all samples in a file after demultiplexing.
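A rough sketch of the two selection modes, just to make the idea concrete. This is not existing QIIME code: `parse_fasta`, `sample_random`, and `sample_per_sample` are hypothetical names, and mode (2) assumes post-demultiplexing labels of the form `SampleID_N`, where the sample ID is everything before the last underscore in the first whitespace-delimited token.

```python
import random
from collections import defaultdict


def parse_fasta(lines):
    """Yield (label, sequence) pairs from an iterable of FASTA lines."""
    label, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if label is not None:
                yield label, "".join(chunks)
            label, chunks = line[1:], []
        else:
            chunks.append(line)
    if label is not None:
        yield label, "".join(chunks)


def sample_random(records, n, rng=random):
    """Mode (1): reservoir-sample up to n records in a single pass."""
    reservoir = []
    for i, rec in enumerate(records):
        if i < n:
            reservoir.append(rec)
        else:
            # Replace an existing pick with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = rec
    return reservoir


def sample_per_sample(records, n, rng=random):
    """Mode (2): pick up to n records from every sample.

    Assumes labels look like 'SampleID_N ...' (an assumption, see above).
    """
    by_sample = defaultdict(list)
    for label, seq in records:
        seq_id = label.split()[0]
        sample_id = seq_id.rsplit("_", 1)[0]
        by_sample[sample_id].append((label, seq))
    return {
        sample_id: rng.sample(recs, min(n, len(recs)))
        for sample_id, recs in by_sample.items()
    }
```

Mode (1) never holds more than n records in memory, so it should cope with large pre-demultiplexing files; mode (2) as written loads everything and would need a per-sample reservoir to scale the same way.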
This script should work by BLASTing against NCBI (use `cogent.db.ncbi.EUtils`). This will require a network connection, so the code should fail gracefully if there is no network connection. This is preferable to requiring that the user always have a recent version of nr installed locally. For subsampling, `qiime.util.subsample_fasta` will be helpful, and it would be great to expand that function to support mode (2).
After creating this function, we'd likely hook mode (2) up to `core_diversity_analyses.py`, skipping it if there is no active internet connection.
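One possible way to do the "fail gracefully / skip if offline" part, sketched with the standard library only. The function names are hypothetical, and probing `eutils.ncbi.nlm.nih.gov` on port 443 is an assumption about how connectivity could be checked, not something the existing code does.

```python
import socket


def network_available(host="eutils.ncbi.nlm.nih.gov", port=443, timeout=5.0,
                      connect=socket.create_connection):
    """Return True if a TCP connection to the host can be opened."""
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
    conn.close()
    return True


def maybe_run_blast_step(blast_step, is_online=network_available, log=print):
    """Run blast_step() only when online; otherwise log and skip."""
    if not is_online():
        log("Skipping BLAST summary: no network connection to NCBI.")
        return None
    return blast_step()
```

A workflow script could wrap the BLAST summary call in `maybe_run_blast_step` so that the rest of the analysis still runs offline. The `connect` and `is_online` parameters are there so the behavior can be unit tested without a real network.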