caporaso-lab / mockrobiota

A public resource for microbiome bioinformatics benchmarking using artificially constructed (i.e., mock) communities.
http://mockrobiota.caporasolab.us
BSD 3-Clause "New" or "Revised" License

Added mock-27 (nifH_even) and mock-28 (nifH_tiered) #84

Closed: roey-angel closed 6 years ago

roey-angel commented 6 years ago

fixes #82

roey-angel commented 6 years ago

OK, I've made the suggested changes and pushed them to my repo. Do you see them on the branch? Comments:

  1. For the MiSeq data I've added the SRA URL. I assume you will then download the data to your Amazon cloud.
  2. I changed sample-metadata.tsv to MiSeq, but the sequences given under 'source' are based on Sanger.
  3. I've added phylogenetic_classification_CART.tsv, which is based on the database from the Zehr lab (probably the most widely used one for the gene). I'd nevertheless like to keep the Silva taxonomy (taxonomy.tsv) because (a) phylogenetic clusters are very broad and not informative enough, and (b) the mock is based on isolates for which we have 16S data.

roey-angel commented 6 years ago

I made all the changes. I changed the name of the CART file from "classification" to "clusters" to clarify the idea. I'm not sure I understand your comment regarding CART. The tool (which is based on the database from Zehr's lab) only returns the phylogenetic cluster of the nifH sequence, not the taxonomic identity of the microbe. In most cases it is impossible to determine the specific taxonomic identity based on the nifH sequence.

thermokarst commented 6 years ago

Hi @roey-angel - did you remember to push those changes to your feature branch? We aren't seeing them reflected in the pull request at the moment. Thanks!

roey-angel commented 6 years ago

Hi @thermokarst, you should see them now.