CGATOxford / UMI-tools

Tools for handling Unique Molecular Identifiers in NGS data sets
MIT License

Minor confusion regarding help message #572

Closed: kalavattam closed this 8 months ago

kalavattam commented 1 year ago

Thanks for developing and maintaining umi_tools, which is really useful for my research.

A minor note/question: the help text at, for example, line 888 of dedup.py lists 'correct' as an option. Is this option actually acted on by the code? As far as I can tell, it's not (I discovered this while carefully studying all the parameters in order to understand how to call the tool correctly for my work). I see 'correct' listed for some other parameters beyond this example too. Admittedly, I haven't run a subcommand to check this directly. Thanks.

TomSmithCGAT commented 1 year ago

Hi Kris,

The help text for these three arguments has a typo! The available options are ("discard", "use", "output"), but the help messages all say "Options are 'discard', 'use' or 'correct'". See https://github.com/CGATOxford/UMI-tools/blob/c3ead0792ad590822ca72239ef01b8e559802da9/umi_tools/Utilities.py#L888-L907
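For readers who want to catch this kind of drift automatically, here is a minimal sketch of a check that compares the quoted words in a help string against the declared choices. It assumes an argparse-style parser, which may not match how UMI-tools actually builds its options, and the option name `--example-reads` and the helper `unknown_help_choices` are hypothetical, made up for illustration.

```python
import argparse
import re


def unknown_help_choices(action):
    """Return quoted words in the help text that are not valid choices.

    Hypothetical helper: it assumes the help text lists options in
    single quotes, as in the message quoted in this thread.
    """
    if not action.choices or not action.help:
        return []
    quoted = set(re.findall(r"'([^']+)'", action.help))
    return sorted(quoted - set(action.choices))


parser = argparse.ArgumentParser(prog="demo")

# add_argument returns the Action object, so we can inspect it directly.
# "--example-reads" is a made-up option name, not a real umi_tools flag.
example_action = parser.add_argument(
    "--example-reads",
    choices=("discard", "use", "output"),
    default="use",
    help="Options are 'discard', 'use' or 'correct'",
)

print(unknown_help_choices(example_action))
```

A check like this could run in a test suite over every option that declares choices, so a help string naming a value the parser would reject fails early.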

As it happens, the output option for these three arguments is only available with group. That's stated in the online help (https://umi-tools.readthedocs.io/en/latest/reference/dedup.html), but not in the command line help for dedup. It's a quirk of the arguments being added alongside other arguments which are common to dedup and group.

On reflection, these arguments should be added separately for dedup and group so that the help text is clearer. I'll add that to the to-do for whenever the heck we get around to another release...
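As an illustration of the approach described above, here is a rough sketch of registering the option separately per command, with the help text generated from the choices so the two cannot disagree. It uses argparse as an assumption, not necessarily what UMI-tools uses. The flag `--example-reads`, the default value, and both helper functions are hypothetical.

```python
import argparse


def format_choices(choices):
    """Render choices as: 'a', 'b' or 'c'."""
    quoted = ["'%s'" % c for c in choices]
    if len(quoted) == 1:
        return quoted[0]
    return ", ".join(quoted[:-1]) + " or " + quoted[-1]


def add_read_handling_option(parser, flag, choices):
    """Add a choice option whose help text is derived from its choices."""
    parser.add_argument(
        flag,
        choices=choices,
        default="use",  # assumed default, for illustration only
        help="Options are " + format_choices(choices),
    )


# Hypothetical per-command parsers: only "group" accepts 'output'.
dedup_parser = argparse.ArgumentParser(prog="dedup")
add_read_handling_option(dedup_parser, "--example-reads", ("discard", "use"))

group_parser = argparse.ArgumentParser(prog="group")
add_read_handling_option(
    group_parser, "--example-reads", ("discard", "use", "output")
)

print(format_choices(("discard", "use", "output")))
```

With this layout each command's help lists only the values that command accepts, and passing 'output' to the dedup parser is rejected at parse time rather than silently ignored.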