UPHL-BioNGS / Grandeur

UPHL's Reference Free Pipeline
GNU General Public License v3.0

Add 'predicted organism' column to final summary file #81

Closed by erinyoung 1 year ago

erinyoung commented 1 year ago

This is mostly because of the overlap between E. coli and Shigella, so it may be helpful to have a column that predicts which organism was sequenced instead of just listing the fastani, mash, blobtools, kraken2, and mlst results.

erinyoung commented 1 year ago

Two columns for the blobtools and kraken2 top organism were included, but this is still a work in progress.

There are currently several tools that determine the organism, and each one can use a different database.

Mash and fastani are required steps in the workflow, but getting a match from those tools should not be required.

As I was working on it, I found that there were a lot of factors that needed to be taken into account, so this idea was set aside in https://github.com/UPHL-BioNGS/Grandeur/pull/84

erinyoung commented 1 year ago

I'm closing this. There are too many databases, and the end user is going to need to decide which one to prioritize.
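
For anyone who still wants such a column, a minimal sketch of doing the prioritization on the user side might look like the following. The column names and the default priority order are hypothetical placeholders, not Grandeur's actual summary headers, so they would need to be adjusted to match the real file.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical column names and order; the real summary file may differ.
DEFAULT_PRIORITY = (
    "fastani_top_organism",
    "mash_top_organism",
    "kraken2_top_organism",
    "blobtools_top_organism",
)


def predicted_organism(
    row: Mapping[str, Optional[str]],
    priority: Sequence[str] = DEFAULT_PRIORITY,
) -> Optional[str]:
    """Return the first non-empty organism call, following the user's priority."""
    for column in priority:
        # Treat missing columns, None, and blank strings as "no result".
        value = (row.get(column) or "").strip()
        if value:
            return value
    return None
```

Passing a different `priority` sequence lets each user rank the tools (and their databases) however they see fit, which is the decision the issue leaves to the end user.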