samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

[bcftools mpileup --samples-file] how to rename old sample names if they are not unique? #2022

mlin2017 closed 1 year ago

mlin2017 commented 1 year ago

Hi @pd3,

Does bcftools mpileup support renaming old samples when the sample names are not unique?

It seems that --read-groups allows such a change when the BAM file name is included, using the format "read_group_id file_name sample_name". Could the same strategy be adapted to --samples-file so that it can handle non-unique old sample names? Not sure if that makes sense. Meanwhile, are there any workarounds other than changing the SM names in the BAM files first?

Thanks!

pd3 commented 1 year ago

I don't completely understand; why not use --read-groups, as it already supports sample renaming? Or is it not working? The documentation describes how to use it: http://samtools.github.io/bcftools/bcftools.html#mpileup
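For illustration, a minimal sketch of a read-groups file in the "read_group_id file_name sample_name" format quoted above, where two BAM files share a read group ID and an SM name. The read group ID `rg1`, the file names `a.bam` and `b.bam`, the new sample names, and `ref.fa` are all placeholders. It is an assumption that the file name column has to match the path given to mpileup.

```shell
# Write a hypothetical read-groups file; every name below is a placeholder.
# Columns: read_group_id  file_name  new_sample_name (whitespace-separated).
cat > rg.txt <<'EOF'
rg1 a.bam sampleA
rg1 b.bam sampleB
EOF

# Assumed invocation (not run here; requires bcftools and real input files):
#   bcftools mpileup -G rg.txt -f ref.fa a.bam b.bam
```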

mlin2017 commented 1 year ago

Thanks @pd3! It should work; it just requires an additional step to get all read groups from each bam file first. And it would be more convenient to provide a file for -S, --samples-file in a format similar to that in -G, --read-groups, with three columns to differentiate duplicate sample names.

pd3 commented 1 year ago

It is trivial to obtain the list of read groups and sample names from a BAM file; it can be done with a simple shell one-liner. I am afraid this is not something we want to support: the benefit in user convenience does not outweigh the amount of work required and the increased code complexity.
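One possible sketch of such a one-liner, not necessarily the one meant here: it reads a BAM header on stdin and prints one "read_group_id file_name sample_name" row per `@RG` line. The helper name `rg_rows` and all file and sample names are hypothetical, and the usage lines assume samtools is on the PATH.

```shell
# Hypothetical helper: convert @RG header lines (read on stdin) into
# "read_group_id file_name new_sample_name" rows for a read-groups file.
#   $1 = BAM file name to write in column 2
#   $2 = new sample name to write in column 3
rg_rows() {
    awk -v bam="$1" -v sample="$2" -v OFS='\t' -F'\t' '
        $1 == "@RG" {
            # The ID tag may appear in any position after @RG.
            for (i = 2; i <= NF; i++)
                if ($i ~ /^ID:/) print substr($i, 4), bam, sample
        }'
}

# Usage (placeholders; requires samtools and real BAM files):
#   samtools view -H a.bam | rg_rows a.bam sampleA >  rg.txt
#   samtools view -H b.bam | rg_rows b.bam sampleB >> rg.txt
```

The resulting `rg.txt` could then be passed to `bcftools mpileup -G`, which avoids editing the SM tags in the BAM files themselves.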