bioforensics / yeat

YEAT: Your Everyday Assembly Tool

Providing genome size for each sample and clarify downsample flag #75

Closed danejo3 closed 2 months ago

danejo3 commented 3 months ago

YEAT has flags to customize the downsampling step, one of which is --genome-size. When a config file contains multiple samples, users cannot supply each sample's known genome size; the single value passed with --genome-size is applied to every sample. This is a problem when the known sizes differ between samples. One way to resolve this is to add a genome-size key to each sample in the config.
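A minimal sketch of how a per-sample key could be resolved, assuming the config is loaded into a dict keyed by sample name. The function name `resolve_genome_size`, the dict layout, and the use of `None` to mean "no size known" are hypothetical and are not taken from YEAT's code:

```python
def resolve_genome_size(sample_entry, cli_genome_size=None):
    """Pick the genome size for one sample (hypothetical helper).

    Assumption: a per-sample "genome-size" key takes priority over the
    global --genome-size value; None means no size is known.
    """
    per_sample = sample_entry.get("genome-size")
    if per_sample is not None:
        return int(per_sample)
    return cli_genome_size


# Hypothetical config: two samples with known sizes, one without.
config = {
    "sample1": {"genome-size": 2_800_000},
    "sample2": {"genome-size": 5_100_000},
    "sample3": {},
}

sizes = {
    name: resolve_genome_size(entry, cli_genome_size=4_000_000)
    for name, entry in config.items()
}
```

With this layout, samples that carry their own key keep their own size, and the global value only acts as a fallback for the rest.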

On the same topic, the help message for --genome-size and --downsample needs clarification. If a user provides a sample's genome size, the downsampling value still has to be calculated from it. However, if the downsampling value is given directly with the --downsample flag, there is no need to use the genome size calculated by mash to recalculate the downsampling value. This logic is coded correctly in the Shared snakemake file under rule downsample, but YEAT's help message does not make it clear. It should probably state that users shouldn't combine --downsample and --genome-size, because --downsample will disregard --genome-size.
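The precedence described above could be sketched roughly as follows. This is not the code from rule downsample: the function name, its parameters, and the formula `genome_size * coverage / avg_read_length` are assumptions made for illustration, and `estimate_size` stands in for the mash-based estimate:

```python
def target_read_count(downsample, genome_size, coverage, avg_read_length, estimate_size):
    """Decide how many reads to keep (hypothetical sketch).

    Assumed precedence: an explicit downsample value wins; otherwise use the
    given genome size; otherwise fall back to an estimated genome size.
    """
    if downsample is not None:
        # Explicit value: genome size is ignored and never estimated.
        return downsample
    if genome_size is None:
        genome_size = estimate_size()
    # Assumed formula, for illustration only.
    return int(genome_size * coverage / avg_read_length)


estimator_calls = []

def fake_estimate():
    """Stand-in for a mash-based estimate; records that it was called."""
    estimator_calls.append(1)
    return 3_000_000

explicit = target_read_count(1000, 5_000_000, 150, 150, fake_estimate)
from_size = target_read_count(None, 5_000_000, 150, 150, fake_estimate)
calls_before_estimate = len(estimator_calls)
from_estimate = target_read_count(None, None, 150, 150, fake_estimate)
```

In this sketch, passing both values silently drops the genome size, which is the behavior the help message should warn about.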