KwanLab / Autometa

Autometa: Automated Extraction of Genomes from Shotgun Metagenomes
https://autometa.readthedocs.io

📝 Modify documentation for benchmark.rst #189

Closed · Sidduppal closed this issue 2 years ago

Sidduppal commented 3 years ago

- 📝 Add docs for classification and clustering-classification
- 📝 Add docs to run multiple results of the same community at once
- 🔥 Remove code to aggregate results

chasemc commented 3 years ago

I think it may be helpful to put an example into a Jupyter notebook and link to it (or see https://docs.readthedocs.io/en/stable/guides/jupyter.html), especially since the tutorial jumps between running things on the command line and in Python.

chasemc commented 3 years ago

I think the document would benefit from a bit of rearrangement, including:

Autometa Test Datasets should be combined with Downloading Test Datasets so that both live in a single location.

Headings:

Move the heading Example benchmarking with simulated communities to directly before the heading Benchmark clustering


Make the examples for each type of clustering clearer by presenting them in the same context:

The examples should be simple, so I would suggest presenting them only in the context of running on a single sample/community size.

An "Advanced" section could discuss how to handle running on multiple samples.


Generally, the commands/code need more description. As it stands, it's still unclear which commands to run in order to benchmark (is following the Benchmark clustering-classification section the same as running Benchmark clustering and Benchmark classification separately?). This could perhaps be addressed with more descriptive headings (e.g. Benchmark clustering becomes Benchmark binning results; Benchmark classification becomes Benchmark taxonomic assignments), etc.

The Aggregate results across simulated communities section seems to be just data handling. If so, it makes the documentation less clear; remove it, or move it to an "Advanced" section (or similar) at the end.
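To make the data-handling point concrete, a hedged sketch of the aggregation step, assuming each community's benchmark run writes a per-community results table (the file names and the added community column are hypothetical):

```python
import pandas as pd

# Hypothetical per-community benchmark outputs; file names are illustrative.
frames = []
for community in ["78Mbp", "156Mbp", "312Mbp"]:
    df = pd.read_csv(f"{community}_benchmarks.tsv", sep="\t")
    df["community"] = community  # tag each table with its community of origin
    frames.append(df)

# Aggregating across communities is a one-line concatenation.
pd.concat(frames, ignore_index=True).to_csv(
    "aggregated_benchmarks.tsv", sep="\t", index=False
)
```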

Flags such as --output-long and --output-classification-reports aren't described or defined. Inputs aren't described either, e.g. how are the input files for --predictions, --reference, etc. supposed to be structured?
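For example, the docs could spell out the expected table layout. The sketch below is a guess at plausible tab-delimited, contig-indexed inputs; the column names and values are hypothetical, not confirmed by the Autometa docs:

```python
import pandas as pd

# Hypothetical --predictions input: one row per contig with its assigned bin.
pd.DataFrame({
    "contig": ["NODE_1", "NODE_2", "NODE_3"],
    "cluster": ["bin_001", "bin_001", "bin_002"],
}).to_csv("binning_predictions.tsv", sep="\t", index=False)

# Hypothetical --reference input: the ground-truth genome each contig came from.
pd.DataFrame({
    "contig": ["NODE_1", "NODE_2", "NODE_3"],
    "reference_genome": ["genome_A", "genome_A", "genome_B"],
}).to_csv("reference_assignments.tsv", sep="\t", index=False)
```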

The documentation is missing a discussion of what autometa-benchmark actually does and of the results it produces.
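As a starting point for that discussion, a minimal sketch of the kind of comparison clustering benchmarking presumably performs: scoring predicted bins against reference genome assignments with standard external clustering metrics (here via scikit-learn; which metrics autometa-benchmark actually reports is not specified in this thread):

```python
import pandas as pd
from sklearn.metrics import adjusted_rand_score, homogeneity_completeness_v_measure

# Align predictions with the reference on contig ID (files from the sketch above).
merged = pd.merge(
    pd.read_csv("binning_predictions.tsv", sep="\t"),
    pd.read_csv("reference_assignments.tsv", sep="\t"),
    on="contig",
)

# Standard external clustering metrics: ground truth first, predictions second.
ari = adjusted_rand_score(merged["reference_genome"], merged["cluster"])
h, c, v = homogeneity_completeness_v_measure(merged["reference_genome"], merged["cluster"])
print(f"ARI={ari:.3f} homogeneity={h:.3f} completeness={c:.3f} V-measure={v:.3f}")
```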

evanroyrees commented 2 years ago

Closing this for now. Please submit a new PR from a new KwanLab/Autometa branch with any changes that are still necessary.

A few of these comments were addressed in https://github.com/KwanLab/Autometa/pull/215.