samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

[mpileup] 1 samples in 12 input files #2115

Closed: xiadawei123 closed this issue 6 months ago

xiadawei123 commented 7 months ago

Hi, I am using bcftools for SNP calling. Why does it report only one sample when I give it 12 BAM files, as shown below?

$ bcftools mpileup -a AD,ADF,ADR,DP,SP,INFO/AD,INFO/ADF,INFO/ADR -b specieal.txt -f ../../Bins_modified.fasta -q 20 -Q 20 -O u --threads 20 | bcftools call -c -v -O z --threads 20 -o chloroplast.vcf.gz
Note: none of --samples-file, --ploidy or --ploidy-file given, assuming all sites are diploid
[mpileup] 1 samples in 12 input files
[mpileup] maximum number of reads per input file set to -d 250

pd3 commented 6 months ago

The sample names are inferred from the information contained in the BAM files, specifically read groups (see the @RG header lines and SM subfield in the SAM specification https://samtools.github.io/hts-specs/SAMv1.pdf).
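To see which sample names mpileup would infer, it can help to list the ID and SM fields of each @RG header line. The following is a minimal sketch, not part of bcftools: the helper name `rg_samples` is hypothetical, and it only assumes the tab-separated @RG layout described in the SAM specification.

```shell
# Print "ID<TAB>SM" for every @RG line of a SAM header read from stdin.
# rg_samples is a hypothetical helper name, not a bcftools/samtools command.
rg_samples() {
    awk -F'\t' '$1 == "@RG" {
        id = ""; sm = ""
        # @RG fields are TAG:VALUE pairs; pick out ID and SM
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^ID:/) id = substr($i, 4)
            if ($i ~ /^SM:/) sm = substr($i, 4)
        }
        print id "\t" sm
    }'
}

# Usage, assuming samtools is installed and specieal.txt lists one BAM per line:
# while IFS= read -r bam; do
#     printf '%s\n' "$bam"
#     samtools view -H "$bam" | rg_samples
# done < specieal.txt
```

If every BAM prints the same SM value, that would be consistent with the "1 samples in 12 input files" message.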

One sample can be contained in multiple BAMs and one BAM can contain multiple samples; the program attempts to match them all correctly by read group and SM name. The behavior can be fine-tuned with the bcftools mpileup -G option; see the description in the manual page http://samtools.github.io/bcftools/bcftools.html#mpileup

*-G, --read-groups* [^]'FILE'::
    list of read groups to include or exclude if prefixed with "^".
    One read group per line.  This file can also be used to assign new sample
    names to read groups by giving the new sample name as a second
    white-space-separated field, like this: "read_group_id new_sample_name".
    If the read group name is not unique, also the bam file name can
    be included: "read_group_id file_name sample_name".  If all
    reads from the alignment file should be treated as a single sample, the
    asterisk symbol can be used: "* file_name sample_name". Alignments without
    a read group ID can be matched with "?". *NOTE:* The meaning of *bcftools mpileup -G*
    is the opposite of *samtools mpileup -G*.
----
    RG_ID_1
    RG_ID_2  SAMPLE_A
    RG_ID_3  SAMPLE_A
    RG_ID_4  SAMPLE_B
    RG_ID_5  FILE_1.bam  SAMPLE_A
    RG_ID_6  FILE_2.bam  SAMPLE_A
    *        FILE_3.bam  SAMPLE_C
    ?        FILE_3.bam  SAMPLE_D
----
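
For the situation in this issue, one possible approach is to build a read-groups file that uses the "* file_name sample_name" form so each BAM is treated as its own sample. The sketch below is illustrative only: the helper name `make_rg_file`, the output name `rg2sample.txt`, and the choice of deriving the sample name from the BAM base name are all assumptions. Whether file_name must match the path exactly as written in the -b list should be checked against the manual.

```shell
# Emit one "* file_name sample_name" line per BAM path read from stdin.
# make_rg_file is a hypothetical helper; the sample name is taken from the
# file's base name without ".bam", which is an assumption made for this sketch.
make_rg_file() {
    while IFS= read -r bam || [ -n "$bam" ]; do
        [ -n "$bam" ] || continue            # skip blank lines
        sample=$(basename "$bam" .bam)
        printf '*\t%s\t%s\n' "$bam" "$sample"
    done
}

# Usage (rg2sample.txt is a hypothetical file name):
# make_rg_file < specieal.txt > rg2sample.txt
# then add "-G rg2sample.txt" to the bcftools mpileup command from the question
```

Fixing the SM tags in the BAM headers themselves is the alternative; the -G file only changes how mpileup groups reads at call time.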