reneshbedre / bioinfokit

Bioinformatics data analysis and visualization toolkit
MIT License
334 stars 77 forks source link

The VCF combine chromosomes operation should be callled "concat", not "merge" #9

Closed RamRS closed 4 years ago

RamRS commented 4 years ago

It is common convention across popular tools such as the VCFtools PERL library and bcftools to differentiate between

  1. combining chromosome-specific VCF files into a single VCF file; and
  2. combining single sample VCF files into one multi-sample VCF file

The former is called concat and the latter, merge. However, in your tookit, you call the former merge, which is misleading. Plus, given that your README does not describe what the operation actually does, one needs to dig deep to understand what's going on.

Please rename the operation and add a line in the README addressing this.

reneshbedre commented 4 years ago

Hi @RamRS,

Thank you for your recommendations. I would consider changing the function name from merge to concat to keep with the conventional names.

reneshbedre commented 4 years ago

This issue has been fixed in v0.9.4

RamRS commented 4 years ago

That was really quick! If you consider switching to pysam to read high throughput files (BAM, VCF, etc), let me know - I'm willing to contribute.

reneshbedre commented 4 years ago

@RamRS ,

Thank you for showing an interest in bioinfokit. It would be a good idea to use pysam, but it is highly dependent on other non-Python tools. If you have something that is in Python and adds value to bioinfokit, you are always welcome to contribute.

Thank you.