fedarko / strainFlye

Pipeline for analyzing (rare) mutations in metagenome-assembled genomes
BSD 3-Clause "New" or "Revised" License

Fully worked out tutorial #10

Closed fedarko closed 1 year ago

fedarko commented 2 years ago

Either a Markdown file or a Jupyter notebook would probably be best. Maybe it could walk through the SheepGut dataset (or a small subset of it)?

It should showcase pretty much everything in the pipeline, starting with GFA --> FASTA conversion, then alignment, then calling, ...
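For readers unfamiliar with the first step, here is a rough sketch of what a GFA --> FASTA conversion involves. This is not strainFlye's own code; the function name `gfa_to_fasta` and its parameters are made up for illustration. It only relies on the GFA convention that segment lines start with `S`, followed by a tab-separated name and sequence.

```python
import io


def gfa_to_fasta(gfa_lines, out, line_width=70):
    """Write each GFA segment (S line) as a FASTA record; return the count.

    Hypothetical helper for illustration, not part of strainFlye.
    """
    written = 0
    for line in gfa_lines:
        # Only segment lines carry sequences; skip headers, links, paths, etc.
        if not line.startswith("S\t"):
            continue
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # malformed segment line
        name, seq = parts[1], parts[2]
        if seq == "*":
            continue  # the sequence is not stored in this GFA file
        out.write(f">{name}\n")
        # Wrap the sequence so lines are at most line_width characters long.
        for i in range(0, len(seq), line_width):
            out.write(seq[i:i + line_width] + "\n")
        written += 1
    return written


# Tiny made-up example (not SheepGut data).
example_gfa = [
    "H\tVN:Z:1.0\n",
    "S\tedge_1\tACGTACGT\n",
    "S\tedge_2\t*\tLN:i:100\n",
    "L\tedge_1\t+\tedge_2\t+\t0M\n",
]
buffer = io.StringIO()
n_written = gfa_to_fasta(example_gfa, buffer)
print(n_written)
print(buffer.getvalue(), end="")
```

In practice the input would be an open file handle for the assembly graph rather than a list of strings; the function accepts any iterable of lines.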

fedarko commented 2 years ago

Time taken by each command on the full SheepGut dataset

These are informal benchmarks -- they just give an idea of the order of magnitude of time that each step takes. Will continue filling in this list as things get done.
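One simple way to collect informal timings like these is to wrap each command in a wall-clock timer. The sketch below is only an assumption about how that could be done, not how the numbers in this thread were measured; `time_command` is a hypothetical helper, and the command being timed is a placeholder rather than a real pipeline command.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (elapsed_seconds, exit_code).

    Hypothetical helper for illustration; measures wall-clock time only.
    """
    start = time.perf_counter()
    result = subprocess.run(cmd, check=False)
    elapsed = time.perf_counter() - start
    return elapsed, result.returncode


# Placeholder command: substitute the pipeline command you want to time.
elapsed, exit_code = time_command([sys.executable, "-c", "pass"])
print(f"{elapsed:.2f} s (exit code {exit_code})")
```

Wall-clock time is enough for order-of-magnitude comparisons; it does not separate CPU time from time spent waiting on disk.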

fedarko commented 2 years ago

Alternatively, add a step to the tutorial (after alignment, before calling) that filters the FASTA file to just long / high-coverage / high-CheckM-quality contigs? That would make the tutorial go faster, and it would be a more realistic representation of what probably gets done in practice.
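A filtering step like this could be sketched as follows. None of this is strainFlye code: `read_fasta`, `filter_contigs`, the thresholds, and the idea of passing coverage in as a plain dict are all assumptions made for illustration. CheckM quality is not handled here, although it could be passed in the same way as coverage.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the name.
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_contigs(records, min_length, coverage=None, min_coverage=0.0):
    """Yield only records that pass the length and (optional) coverage cutoffs.

    coverage is an assumed dict mapping contig name to mean coverage;
    contigs missing from it are treated as having zero coverage.
    """
    for name, seq in records:
        if len(seq) < min_length:
            continue
        if coverage is not None and coverage.get(name, 0.0) < min_coverage:
            continue
        yield name, seq


# Made-up example data and thresholds, not values from SheepGut.
fasta_lines = [">a some description", "ACGT", "ACGT", ">b", "AC", ">c", "ACGTACGTAC"]
mean_coverage = {"a": 50.0, "b": 80.0, "c": 2.0}
kept = list(filter_contigs(read_fasta(fasta_lines), min_length=8,
                           coverage=mean_coverage, min_coverage=10.0))
print(kept)
```

Both functions are generators, so a large FASTA file can be filtered without loading every contig into memory at once.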