bioforensics / yeat

YEAT: Your Everyday Assembly Tool

Improved auto downsampling with custom coverage and avg read length #23

Closed danejo3 closed 1 year ago

danejo3 commented 1 year ago

This PR resolves #17 and further improves the downsample rule, building on #21.

In a recent PR (#21), a downsample flag was added so that users can supply their own downsample number instead of the value auto-calculated by YEAT.

In this PR, a new coverage flag was added so that users can specify their desired coverage. The default is C=150 (150x).

In addition to the new flag, this PR removes the hard-coded average read length. Previously, YEAT hard-coded the average read length to 250. Read lengths vary for reasons such as trimming, so a fixed value was inappropriate. Because YEAT already runs fastp on all FASTQ files to preprocess the reads, the average read length for each file, both before and after filtering, can be read from fastp.json.
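As a rough illustration of the idea (not the actual code in this PR), the sketch below derives the mean read length from a parsed fastp.json report and plugs it into the usual coverage relation, reads = genome size x coverage / read length. The function names, the `genome_size` input, and the halving for paired-end data are assumptions made for this example. The `summary`, `after_filtering`, `total_reads`, and `total_bases` keys follow fastp's JSON report layout.

```python
import json


def mean_read_length(fastp_report, stage="after_filtering"):
    """Mean read length for one stage of a parsed fastp.json report.

    stage is "before_filtering" or "after_filtering".
    """
    summary = fastp_report["summary"][stage]
    return summary["total_bases"] / summary["total_reads"]


def reads_to_keep(genome_size, coverage, avg_read_length, paired=True):
    """Number of reads (or read pairs) needed to reach the target coverage.

    Hypothetical helper: assumes genome_size is already known or estimated.
    """
    total_reads = genome_size * coverage / avg_read_length
    if paired:
        # Assumption: each pair contributes two reads of similar length.
        total_reads /= 2
    return int(total_reads)


def load_fastp_report(path):
    """Read a fastp.json file from disk."""
    with open(path) as handle:
        return json.load(handle)
```

For example, with a 5 Mbp genome, the default coverage of 150, and an average read length of 250, `reads_to_keep(5_000_000, 150, 250)` gives 1,500,000 read pairs.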

danejo3 commented 1 year ago

The PR is ready for review. A new coverage flag was added, and read lengths are no longer hard-coded. Let me know if you have any questions or concerns. Thanks!

danejo3 commented 1 year ago

Edits and comments were made. Let me know what you guys think. Thanks!