proksee-project / proksee-cmd

Repo for Proksee Cmd Line Tools
Apache License 2.0

v1.0.0a5: Various changes to support integration with Proksee Web #77

Closed emarinier closed 2 years ago

emarinier commented 2 years ago

"Older" Changes

Recent Additions

Description of Evaluation Process

Proksee [command/assemble/evaluate] performs three evaluations: a species-based heuristic evaluation, an NCBI RefSeq exclusion criteria-based heuristic evaluation, and a species-based machine learning evaluation.

The species-based heuristic evaluation works by comparing common assembly quality metrics (number of contigs, length, N50, and L50) against a database of curated assembly quality metrics derived from NCBI RefSeq assemblies. If the species is determined with confidence, the evaluation checks whether each quality metric of the assembly generated by the Proksee pipeline falls within an acceptable percentile range relative to other curated assemblies of the same species.
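As a rough illustration of the per-metric percentile check described above, here is a minimal sketch. The function names, the 5th to 95th percentile bounds, and the reference data are all assumptions made for the example; they are not taken from Proksee's code or database.

```python
from bisect import bisect_right


def percentile_rank(value, reference_values):
    """Return the percentage of reference values that are <= value."""
    ordered = sorted(reference_values)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


def evaluate_against_species(assembly, reference, low=5.0, high=95.0):
    """Check each metric on its own against the species' reference values.

    Returns a dict mapping metric name -> True when the metric's percentile
    rank lies within [low, high]. The bounds are hypothetical defaults.
    """
    return {
        metric: low <= percentile_rank(value, reference[metric]) <= high
        for metric, value in assembly.items()
    }


# Hypothetical reference distributions for one species (100 values each).
reference = {
    "n50": list(range(10_000, 110_000, 1_000)),
    "num_contigs": list(range(1, 101)),
}

# Hypothetical metrics for a newly generated assembly.
assembly = {"n50": 60_000, "num_contigs": 100}

result = evaluate_against_species(assembly, reference)
```

In this made-up case the N50 sits near the middle of the reference distribution and passes, while the contig count sits at the very top of its distribution and is flagged.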

The NCBI RefSeq exclusion criteria-based heuristic evaluation works by comparing common assembly quality metrics (number of contigs, length, N50, and L50) against RefSeq's exclusion criteria. That is, if the assembly's metrics don't meet the thresholds specified by RefSeq, the assembly would not be accepted into RefSeq.
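The threshold comparison described above might look something like the following sketch. The threshold values, dictionary keys, and function name are placeholders chosen for illustration and should be checked against NCBI's published exclusion criteria; the length check is omitted because its threshold is not given here.

```python
# Placeholder thresholds for illustration only; not taken from this PR.
EXCLUSION_THRESHOLDS = {
    "max_num_contigs": 2000,
    "min_n50": 5000,
    "max_l50": 500,
}


def refseq_exclusion_failures(assembly, thresholds=EXCLUSION_THRESHOLDS):
    """Return the names of the metrics that violate a threshold.

    An empty list means no (placeholder) exclusion criterion was triggered.
    """
    failures = []
    if assembly["num_contigs"] > thresholds["max_num_contigs"]:
        failures.append("num_contigs")
    if assembly["n50"] < thresholds["min_n50"]:
        failures.append("n50")
    if assembly["l50"] > thresholds["max_l50"]:
        failures.append("l50")
    return failures


# Hypothetical assemblies.
good = {"num_contigs": 40, "n50": 250_000, "l50": 5}
fragmented = {"num_contigs": 3500, "n50": 1_200, "l50": 400}

good_failures = refseq_exclusion_failures(good)
fragmented_failures = refseq_exclusion_failures(fragmented)
```

Returning the list of failing metrics, rather than a single pass/fail flag, makes it easier to report which criterion caused an exclusion.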

The species-based machine learning evaluation works much like our species-based heuristic evaluation, except that the assembly quality metrics are considered simultaneously in a machine learning context rather than evaluated individually.
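To show what considering the metrics simultaneously can mean in practice, here is a toy sketch in which all metrics feed one score. The logistic scoring function, the weights, and the standardized inputs are invented for the example; they stand in for whatever trained model Proksee actually uses, which is not described here.

```python
import math


def joint_score(z_scores, weights, bias=0.0):
    """Combine all standardized metrics into a single probability-like score.

    Each metric contributes to one linear term, so a weak value in one metric
    can be offset (or compounded) by the others, unlike per-metric checks.
    """
    linear = bias + sum(weights[name] * z_scores[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical weights: higher N50 is better, more contigs is worse.
weights = {"n50": 1.0, "num_contigs": -1.0}

# Hypothetical standardized metrics (z-scores relative to the species).
average_assembly = {"n50": 0.0, "num_contigs": 0.0}
better_assembly = {"n50": 1.0, "num_contigs": -1.0}

average_score = joint_score(average_assembly, weights)
better_score = joint_score(better_assembly, weights)
```

With these made-up weights, an assembly that is average on every metric scores exactly 0.5, and one with a higher N50 and fewer contigs scores higher.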

Won't Do

Example JSON Files

(Updated 2022-10-13): Please note that they're saved as txt in order to upload to GitHub, but they're all assembly_info.json files.

staph_aureus.txt ERR234657.txt campy.txt

emarinier commented 2 years ago

@sciguy I think I've made all the requested changes, and I've also changed the JSON file to be camel case. Take a look and let me know if there's anything else missing you can think of, or anything you think needs to be changed.