Closed jingydz closed 1 year ago
Hi, yes, CNVnator works with any genome in the .bam header. There is probably an error in the script that converts the output to VCF. The coordinates are the same as in the original BAM.
Mayo Clinic, 200 1st street SW, Harwick 7-91 Rochester, MN 55905 www.abyzovlab.org tel: +1-(507)-538-0978
"CNVnator works with any genome in the .bam header" Then why does the CNVnator documentation say "Valid genomes (-genome option) are: NCBI36, hg18, GRCh37, hg19, mm9"? There is no "hg38" in that list.
My bam header:
@RG ID:6725D LB:6725D SM:6725D PL:ILLUMINA
@PG ID:GATK IndelRealigner VN:nightly-2017-07-03-g1f763d5 CL:knownAlleles=[(RodBinding name=knownAlleles source=/home/work01/WGS/anno/hg38/v0/Mills_and_1000G_gold_standard.indels.hg38.vcf)] targetIntervals=/home/work01/WGS/gatkout/6725D/6725D.realignertargetc
@PG ID:MarkDuplicates VN:2.9.2-SNAPSHOT CL:picard.sam.markduplicates.MarkDuplicates INPUT=[/home/work01/WGS/gatkout/6725D/6725D.bam] OUTPUT=/home/work01/WGS/gatkout/6725D/6725D.marked.bam METRICS_FILE=/home/work01/WGS/gatkout/6725D/6725D.marked.metr
@PG ID:bwa PN:bwa VN:0.7.15-r1140 CL:/home/work01/tools/bwa-0.7.15/bwa mem -M -t 12 -R @RG\tID:6725D\tLB:6725D\tSM:6725D\tPL:ILLUMINA /home/work01/WGS/anno/hg38/v0/Homo_sapiens_assembly38.fasta.64 /home/work01/WGS/cleandata/6725D/6725D_R1_clean_paired.fq /hom
@PG ID:GATK PrintReads VN:nightly-2017-07-03-g1f763d5 CL:readGroup=null platform=null number=-1 sample_file=[] sample_name=[] simplify=false no_pg_tag=false
I am just very confused about why the header of my output VCF file says "1000GenomesPhase3_decoy-GRCh37". Which step caused this?
Hi, sorry for the confusion. When dealing with a SAM file, one has to specify the genome. For BAM files there is no need for that.
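To confirm which assembly a BAM file was actually aligned to, one option is to look at the `@SQ` lines of its full header (for example the text printed by `samtools view -H`). The snippet below is only a minimal sketch and is not part of CNVnator: the function names and the lookup table are hypothetical, and it assumes the header contains an `@SQ` record for chromosome 1 named either `chr1` or `1`. It compares that length against the known chr1 lengths of GRCh37/hg19 and GRCh38/hg38.

```python
# Known chr1 lengths for two human assemblies (illustrative lookup table).
CHR1_LENGTHS = {
    249250621: "GRCh37/hg19",
    248956422: "GRCh38/hg38",
}


def parse_sq_lines(header_text):
    """Return {sequence_name: length} from the @SQ lines of a SAM/BAM header."""
    lengths = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Fields after the record type are tab-separated TAG:VALUE pairs.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "SN" in fields and "LN" in fields:
            lengths[fields["SN"]] = int(fields["LN"])
    return lengths


def guess_assembly(header_text):
    """Guess the assembly from the chr1 length, or return 'unknown'."""
    lengths = parse_sq_lines(header_text)
    for name in ("chr1", "1"):
        if name in lengths:
            return CHR1_LENGTHS.get(lengths[name], "unknown")
    return "unknown"


if __name__ == "__main__":
    example = "@HD\tVN:1.5\tSO:coordinate\n@SQ\tSN:chr1\tLN:248956422\n"
    print(guess_assembly(example))
```

If the guess matches the assembly used for alignment, the coordinates in the CNVnator calls should refer to that assembly regardless of what the VCF header says.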
If I have already run CNVnator on my BAM file with the -genome hg38 parameter, was the -genome hg38 parameter simply ignored when producing the resulting file?
Hi, yes, for BAM files the -genome option is ignored.
Does CNVnator support hg38?
Command
Valid genomes (-genome option) are: NCBI36, hg18, GRCh37, hg19, mm9
Command
Output
sample.vcf
##fileformat=VCFv4.1
##fileDate=20230624
##reference=1000GenomesPhase3_decoy-GRCh37
##source=CNVnator
...
Question
Why is the header of my VCF file output as GRCh37 when I use hg38? Is it possible that only the header is incorrect and the actual data is still based on hg38?
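If, as the maintainer's reply suggests, only the `##reference` header line is wrong and the coordinates follow the original BAM, the header can be corrected after conversion. The following is a minimal sketch under that assumption. The function name `fix_reference_header` and the label `hg38` are hypothetical choices for illustration, not something CNVnator provides.

```python
def fix_reference_header(vcf_lines, reference_label):
    """Return a copy of vcf_lines with the ##reference header replaced.

    Only lines starting with '##reference=' are changed; all other header
    and record lines are passed through untouched.
    """
    fixed = []
    for line in vcf_lines:
        if line.startswith("##reference="):
            # Preserve a trailing newline if the input line had one.
            newline = "\n" if line.endswith("\n") else ""
            line = "##reference=" + reference_label + newline
        fixed.append(line)
    return fixed


if __name__ == "__main__":
    header = [
        "##fileformat=VCFv4.1",
        "##fileDate=20230624",
        "##reference=1000GenomesPhase3_decoy-GRCh37",
        "##source=CNVnator",
    ]
    for out_line in fix_reference_header(header, "hg38"):
        print(out_line)
```

For a file on disk, the same function could be applied to the lines read from the VCF and the result written to a new file, leaving the original output unchanged.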