Hi, I'm working with a list of VCF files from patients as the starting data for a project. I would like to combine (concatenate) all these files into one VCF, but I'm facing two problems.
Half of the files were generated with Mutect2, and the last two columns contain different sample IDs for each patient. I need to change these IDs to 'NORMAL' and 'TUMOUR' for each file. I'm having trouble figuring out the command to accomplish this.
I would also like to be able to identify which patient each mutation in the combined VCF file comes from. I read that I can achieve this by adding an INFO tag, but I'm struggling to understand how to implement this.
For both cases, I intend to use bcftools annotate.
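One possible way to add such an INFO tag with bcftools annotate, sketched here under assumptions rather than as a confirmed recipe for these files: build a small tab-delimited annotation file from the VCF itself, then copy its third column into a new INFO field. The function name tag_patient, the tag name PATIENT, and the scratch file names are made up for illustration; the input is assumed to be a sorted, bgzip-compressed VCF, and bgzip and tabix are assumed to be installed.

```shell
# Hypothetical helper: add INFO/PATIENT=<id> to every record of one VCF.
# Usage: tag_patient <patient_id> <input.vcf.gz> <output.vcf.gz>
tag_patient() {
    patient=$1
    src=$2
    dst=$3
    annot="$dst.annot.tsv.gz"   # assumed scratch file names
    hdr="$dst.hdr.txt"

    # Header line declaring the new tag (PATIENT is a made-up tag name).
    printf '##INFO=<ID=PATIENT,Number=1,Type=String,Description="Patient of origin">\n' > "$hdr"

    # One annotation row per record: CHROM, POS, patient ID.
    bcftools query -f "%CHROM\t%POS\t$patient\n" "$src" | bgzip -c > "$annot" || return 1
    tabix -s1 -b2 -e2 "$annot" || return 1

    # Copy the third column into INFO/PATIENT on records at matching positions.
    bcftools annotate -a "$annot" -h "$hdr" -c CHROM,POS,PATIENT -Oz -o "$dst" "$src"
    status=$?

    # Remove the scratch files whether or not annotate succeeded.
    rm -f "$annot" "$annot.tbi" "$hdr"
    return $status
}
```

Running this once per patient file, with a different ID each time and before combining the files, should leave every record carrying its patient of origin in the INFO column.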
Update:
For the first problem I'm using the command
bcftools reheader -s new_samples.txt "$out_dir/$output_vcf" -o "$out_dir/$output_vcf"
It does the job, but later, when I try to manipulate these files, I get this error:
[E::bgzf_read_block] Invalid BGZF header at offset 36076
index: failed to create index for ...
The new_samples.txt file is only this:
NORMAL
TUMOUR
And when checking the modified file, it appears to be all right:
gzip: APGI-AU_DO32825_gatk-mutect2.vcf.gz: decompression OK, trailing garbage ignored
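A possible cause, offered as an assumption rather than a confirmed diagnosis: the reheader command reads from and writes to the same path, so the output may overwrite the input while it is still being read. That would be consistent with both the "Invalid BGZF header" error and the "trailing garbage" message from gzip. A minimal sketch that writes to a temporary file first and replaces the original only afterwards; the function name reheader_in_place and the .tmp suffix are made up for illustration.

```shell
# Hypothetical helper: rename samples without reading and writing the same file.
# Usage: reheader_in_place <new_samples.txt> <file.vcf.gz>
reheader_in_place() {
    samples=$1
    vcf=$2
    tmp="$vcf.tmp"   # assumed temporary name, placed next to the input

    # Write the reheadered copy to a different path than the input.
    bcftools reheader -s "$samples" -o "$tmp" "$vcf" || { rm -f "$tmp"; return 1; }

    # Replace the original only after reheader succeeded, then rebuild the index.
    mv "$tmp" "$vcf" && bcftools index -f -t "$vcf"
}
```

A file that was already damaged by the earlier command would need to be restored from the original Mutect2 output before running this. Also, with one name per line the new names are applied in column order, so this assumes the normal sample is the first sample column in every file; listing "old_name new_name" pairs in the samples file avoids that assumption.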