Closed jwdebelius closed 7 years ago
I agree! Were you able to figure this out? @jwdebelius
Maybe I did not ask the question properly in my other posting, but I'd like to know how to convert or change the observation ids, which contain deblurred sequences (mine are 350 bp), to taxa. Is this possible?
I didn't end up getting it included. @amnona, @wasade: could this be done with a conversion script that hashes the sequences and then uses the sequences as metadata?
Would this have further reaching consequences for tree building?
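For anyone wondering what such a conversion script might look like, here is a minimal sketch in Python using only the standard library. It assumes the observation ids are the raw sequences; the function names and the "sequence" metadata key are made up for illustration and are not part of Deblur or QIIME.

```python
import hashlib


def md5_id(sequence):
    """Return the md5 hex digest of a sequence, usable as a short feature ID."""
    return hashlib.md5(sequence.encode("utf-8")).hexdigest()


def relabel(observation_ids):
    """Map sequence-valued observation ids to hashed ids.

    Returns (id_map, metadata). id_map goes old id -> new id, and metadata
    keeps the original sequence under a hypothetical "sequence" key.
    """
    id_map = {}
    metadata = {}
    for seq in observation_ids:
        new_id = md5_id(seq)
        # Guard against two different sequences hashing to the same id.
        if new_id in metadata and metadata[new_id]["sequence"] != seq:
            raise ValueError("hash collision for %s" % new_id)
        id_map[seq] = new_id
        metadata[new_id] = {"sequence": seq}
    return id_map, metadata


# Toy ids for illustration only; real Deblur ids are full-length reads.
example_ids = ["TACGGAGGATCC", "TACGTAGGGTGC"]
id_map, metadata = relabel(example_ids)
```

With the biom-format Python package, the two dictionaries could presumably be applied to a loaded table with Table.update_ids(id_map, axis='observation') followed by Table.add_metadata(metadata, axis='observation'), but I have not tested that against a Deblur table. The same id mapping would also need to be applied to the FASTA file used for tree building so that tip names match the table.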
I hope they answer because this has left me completely stumped. I have looked around to figure out how to visualize the deblurred sequences and show the taxonomy, but either there aren't many papers that use deblur or their methodology is very vague. I was able to run assign_taxonomy.py to determine what the sequences are using greengenes, but I was unable to figure out how to actually use the taxonomic assignments .txt output. Usually the taxonomy assignments are used along with pick_otus.py to build an OTU table. But here deblur already produces an OTU table... the major problem being that the observation ids are just sequences (I understand why they are, they just don't tell me anything).
Hey @slvrshot
The way to add the taxonomy to the table is as follows. First you run assign_taxonomy.py to generate the .txt with the taxonomy assignments. Then you need to run "biom add-metadata" to add this information to the BIOM table. This will leave you with a BIOM table in the same way that Qiime1 generates the BIOM table in the pick OTUs workflows (these are the steps that Qiime is executing behind the scenes). As an example of the commands that I run:
assign_taxonomy.py -i ${deblur_fna} -o ${taxa_out} -m sortmerna --sortmerna_threads 31
biom add-metadata -i ${deblur_biom} -o ${deblur_out}/final.with.tax.biom --observation-metadata-fp ${taxa_out}/final.seqs_tax_assignments.txt --observation-header OTUID,taxonomy --sc-separated taxonomy
After these 2 commands, the table "final.with.tax.biom" is ready to be used in QIIME1 as any other table generated by the OTU picking workflows.
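If you just want to look at the assignments outside of QIIME, the .txt output can also be read directly. This is a rough sketch in Python, assuming the file is tab-separated with the observation id in the first column and a semicolon-separated lineage in the second; parse_tax_assignments is a made-up helper name, and any further columns (e.g. confidence) are ignored.

```python
def parse_tax_assignments(lines):
    """Build a dict of observation id -> list of taxonomic levels.

    Assumes tab-separated lines: id, semicolon-separated lineage, then
    optional extra columns that are ignored here.
    """
    taxonomy = {}
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        obs_id, lineage = fields[0], fields[1]
        taxonomy[obs_id] = [level.strip() for level in lineage.split(";")]
    return taxonomy


# Toy input line for illustration; in practice pass an open file handle.
example = ["TACGGAGGATCC\tk__Bacteria; p__Firmicutes\t1.0\n"]
tax = parse_tax_assignments(example)
```

For a real run you would pass the opened final.seqs_tax_assignments.txt file instead of the example list.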
Hope this helps!
@slvrshot the manuscript was in press for a while but just came out today. The recommended use of Deblur is through q2-deblur, which automatically hashes the feature labels. It acts as a drop-in replacement for q2-dada2 within a typical QIIME2 workflow (e.g., the MVP tutorial). I'm pushing in notes right now on installing q2-deblur, but in brief, it can be done with conda install -c biocore q2-deblur.
I'm closing this issue as it is resolved upstream of Deblur.
It would be helpful if the sequences were treated as metadata, and there was an ID associated with each sequence. The [dada2 plugin for Qiime 2](https://github.com/qiime2/q2-dada2/blob/master/q2_dada2/_denoise.py) uses an md5 hash for this (line 36).
The 150 character sequences are unwieldy for things like dataframes and difficult to compare and identify.