At the moment, MicroHapulator users are expected to invoke the entire 4+ stage analysis pipeline in a step-wise fashion. While it will always be important to support step-wise execution, it would be very helpful to provide a standard automated pipeline that can be invoked with a single command. The current docs and demo provide the basis of that pipeline, but some additional QA/QC is warranted as well. Here is what I see as the contours of that pipeline:
- [x] Snakemake workflow implementation
- [x] validate marker references and marker definitions
- [x] core pipeline: merge, map, type, filter
- [x] QA/QC
  - [x] plot read length distribution of unmerged paired-end reads (#125)
  - [x] merging rates
  - [x] plot read length distribution of merged reads (#125)
  - [x] mapping rates vs MH references
  - [x] mapping rates vs chromosome (GRCh38) references (#116, #124)
  - [x] typing rates (#126)
  - [x] plot interlocus balance
  - [x] plot heterozygote balance
- [x] Sample name driven FASTQ file discovery a la MAnaT
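
To make the last item concrete, here is a minimal sketch of what sample-name-driven FASTQ discovery could look like. It is not MAnaT's or MicroHapulator's actual implementation: the function name `discover_fastqs`, the flat directory layout, the accepted file extensions, and the rule that a sample name must be followed by `_`, `.`, or `-` are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumed set of FASTQ extensions; adjust to the lab's naming conventions.
FASTQ_SUFFIXES = (".fastq", ".fastq.gz", ".fq", ".fq.gz")


def discover_fastqs(sample_names, seq_dir):
    """Map each sample name to its pair of FASTQ paths found in seq_dir.

    Hypothetical helper: looks only at files directly inside seq_dir and
    expects exactly two matching files (paired-end reads) per sample.
    """
    # Sort so that, for typical naming, the R1 file precedes the R2 file.
    candidates = sorted(
        p for p in Path(seq_dir).iterdir()
        if p.is_file() and p.name.endswith(FASTQ_SUFFIXES)
    )
    found = {}
    for sample in sample_names:
        # Require a delimiter after the name so "s1" does not match "s10".
        pattern = re.compile(rf"^{re.escape(sample)}[_.\-]")
        hits = [p for p in candidates if pattern.match(p.name)]
        if len(hits) != 2:
            raise FileNotFoundError(
                f"sample {sample!r}: expected 2 FASTQ files, found {len(hits)}"
            )
        found[sample] = tuple(hits)
    return found
```

A workflow could call something like this once up front, fail fast when a sample has missing or ambiguous files, and pass the resulting paths to the merge step.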