Closed — Sidduppal closed this issue 2 years ago
I think it may be helpful to put an example into a Jupyter notebook and link to it (or to https://docs.readthedocs.io/en/stable/guides/jupyter.html), especially since the tutorial jumps around between running things on the command line and in Python.
I think the document would benefit from a bit of rearrangement, including:

- `Autometa Test Datasets` should be combined with `Downloading Test Datasets` and should be in the same location as `Downloading Test Datasets`.
Titles:

- Move the heading `Example benchmarking with simulated communities` to directly before the heading `Benchmark clustering`.
Make the examples for each type of clustering clearer by presenting them in the same context:

- `Benchmark clustering` is presented in the context of for-looping over community sizes.
- `Benchmark classification` is presented in the context of for-looping over community sizes, but the for-loop isn't shown.
- `Benchmark clustering-classification` is presented in the context of running only on a single community size.

The examples should be simple, so I would suggest presenting each only in the context of running on a single sample/community size. An "Advanced" section could discuss how to handle running on multiple samples.
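To make the suggestion concrete, the simple single-sample example and the "Advanced" multi-sample loop could look something like the sketch below. The `--predictions`/`--reference` flags are the ones named in this issue; the sample names and directory layout are hypothetical, and `echo` is used so the loop structure is visible without actually running the tool:

```shell
# Simple case: benchmark one sample/community size.
# Hypothetical paths; remove the leading echo to actually run the command.
echo autometa-benchmark \
    --predictions "community_A/predictions.tsv" \
    --reference "community_A/reference.tsv"

# "Advanced" section: the same command for-looped over multiple communities.
for community in community_A community_B community_C; do
    echo autometa-benchmark \
        --predictions "${community}/predictions.tsv" \
        --reference "${community}/reference.tsv"
done
```

Presenting the single invocation first, then the loop as a minimal extension of it, would make clear that the loop adds nothing conceptually new.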
Generally, the commands/code need more description. As it stands, it is still unclear which commands to run in order to benchmark. (Is following the section `Benchmark clustering-classification` the same as running `Benchmark clustering` and `Benchmark classification` separately?) This could possibly be addressed with more descriptive headings, e.g. `Benchmark clustering` becomes `Benchmark Binning Results`; `Benchmark classification` becomes `Benchmark taxonomic assignments`; etc.
The section `Aggregate results across simulated communities` seems to be just data handling. If so, it makes the documentation less clear; remove it, or place it in an "Advanced" section (or similar) at the end.
Flags such as `--output-long` and `--output-classification-reports` aren't described/defined.
Inputs aren't described, e.g. how are the input files for `--predictions`, `--reference`, etc. supposed to be structured?
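For instance, the docs could include a small worked example of the input files. The two-column, tab-separated contig-to-bin layout and the column names below are assumptions (not taken from the docs) and would need to be confirmed against `autometa-benchmark` itself:

```shell
# Hypothetical input layout: tab-separated, header row, one contig per line.
# Column names ("cluster", "reference_genome") are assumptions.
printf 'contig\tcluster\ncontig_1\tbin_0001\ncontig_2\tbin_0001\ncontig_3\tbin_0002\n' > predictions.tsv
printf 'contig\treference_genome\ncontig_1\tgenome_A\ncontig_2\tgenome_A\ncontig_3\tgenome_B\n' > reference.tsv

cat predictions.tsv
cat reference.tsv
```

Even a short example like this would answer the structure question at a glance.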
The documentation is missing a discussion of what `autometa-benchmark` actually does and the results it produces.
Closing this for now. Please submit a new PR from a new KwanLab/Autometa branch with any changes still necessary.
Addressed a few comments in https://github.com/KwanLab/Autometa/pull/215
- 📝 Add docs for classification and clustering-classification
- 📝 Add docs to run multiple results of the same community at once
- 🔥 Remove code to aggregate results