honzee / RNAseqCNV

R package for large-scale CNV analysis from RNA-seq
MIT License

Incorrect vcf file format. No allele depth (AD) in FORMAT column #22

Closed JAYRJPT closed 1 year ago

JAYRJPT commented 1 year ago

Hello Jan, I generated VCF files with the GATK pipeline for RNAseqCNV analysis. However, one of my VCF files produces an error in RNAseqCNV. Here are my command and the error:

> RNAseqCNV_wrapper(config = "/media/deepak/jay_doc/MERCILENA/RNASEQCNV_F_G_H/RNASEQCNV/config_file", metadata = "/media/deepak/jay_doc/MERCILENA/RNASEQCNV_F_G_H/RNASEQCNV/snv_metadata.csv", snv_format = "vcf", batch = FALSE, genome_version = "hg38")
[1] "Analysis initiated"
[1] "Normalization for sample: F9 completed"
[1] "Preparing file with snv information for: F9"
Reading in vcf file..
Incorrect vcf file format. No allele depth (AD) in FORMAT column

I have checked the file, and it looks similar to the others that ran without any error in the tool. Kindly have a look at the VCF file:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  F9
chr1    14579   .   G   C,<NON_REF> 0.01    my_filter1  DP=72;MLEAC=0,0;MLEAF=NaN,NaN;RAW_MQandDP=259200,72 GT:PGT:PID:PS   .|.:0|1:14579_G_C:14579
chr1    14604   .   A   G,<NON_REF> 52.31   my_filter1  DP=2;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;RAW_MQandDP=7200,2 GT:AD:DP:GQ:PGT:PID:PL:PS:SB    1|1:0,2,0:2:6:0|1:14579_G_C:64,6,0,64,6,64:14579:0,0,2,0
chr1    14610   .   T   C,<NON_REF> 95.83   my_filter1  DP=4;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;RAW_MQandDP=14400,4    GT:AD:DP:GQ:PGT:PID:PL:PS:SB    1|1:0,3,0:3:9:0|1:14579_G_C:109,9,0,109,9,109:14579:0,0,3,0
chr1    14653   .   C   T,<NON_REF> 654.64  my_filter1  BaseQRankSum=0.021;DP=40;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.000;RAW_MQandDP=144000,40;ReadPosRankSum=-0.157    GT:AD:DP:GQ:PL:SB   0/1:14,26,0:40:99:662,0,250,704,327,1031:8,6,15,11
chr1    14677   .   G   A,<NON_REF> 68.64   my_filter1  BaseQRankSum=1.603;DP=67;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.000;RAW_MQandDP=241200,67;ReadPosRankSum=-0.026    GT:AD:DP:GQ:PL:SB   0/1:57,10,0:67:76:76,0,1520,247,1550,1797:30,27,6,4
chr1    15240   .   G   GGGGCCA,<NON_REF>   70.60   my_filter1  BaseQRankSum=0.000;DP=4;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.000;RAW_MQandDP=14400,4;ReadPosRankSum=-0.674   GT:AD:DP:GQ:PL:SB   0/1:2,2,0:4:78:78,0,78,84,84,168:1,1,1,1
chr1    15274   .   A   G,<NON_REF> 85.13   my_filter1  DP=4;ExcessHet=3.0103;MLEAC=2,0;MLEAF=1.00,0.00;RAW_MQandDP=14400,4 GT:AD:DP:GQ:PL:SB   1/1:0,4,0:4:12:99,12,0,99,12,99:0,0,2,2
chr1    16111   .   T   C,<NON_REF> 314.64  my_filter1  BaseQRankSum=4.451;DP=57;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.000;RAW_MQandDP=205200,57;ReadPosRankSum=-4.470    GT:AD:DP:GQ:PL:SB   0/1:39,14,0:53:99:322,0,899,438,941,1379:24,15,14,0
chr1    16124   .   C   T,<NON_REF> 0   my_filter1  BaseQRankSum=1.035;DP=59;ExcessHet=3.0103;MLEAC=0,0;MLEAF=0.00,0.00;MQRankSum=0.000;RAW_MQandDP=212400,59;ReadPosRankSum=-0.815 GT:AD:DP:GQ:PL:SB   0/0:53,4,0:57:39:0,39,2093,159,2105,2226:34,19,3,1

Thanks and Regards, Jay

Kaddea commented 1 year ago

Hey Jay, I've solved this issue by renaming the chromosomes via 'bcftools annotate --rename-chrs RenameTable.txt old.vcf > new.vcf'. The renaming table consists of just two space-separated columns, the old names and the new names, without a header:

chr1 1
chr2 2
etc ...

Hope it helps, Mathias
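For anyone who wants to generate the renaming table rather than type it by hand, here is a minimal Python sketch. The function name `build_rename_table` is hypothetical, and the sketch assumes the autosomes 1-22 plus X and Y with a plain "chr" prefix; the mitochondrial contig and any alternate contigs are left out on purpose, because their target names depend on the reference in use.

```python
def build_rename_table(prefix="chr"):
    """Return 'old new' pairs, one per line, for chromosomes 1-22, X and Y.

    Assumption: old names are the new names with `prefix` prepended
    (e.g. chr1 -> 1). Other contigs are not covered by this sketch.
    """
    names = [str(i) for i in range(1, 23)] + ["X", "Y"]
    # One space-separated pair per line, no header.
    return "".join(f"{prefix}{name} {name}\n" for name in names)
```

Writing the returned string to RenameTable.txt gives a file in the two-column layout described above, which can then be passed to 'bcftools annotate --rename-chrs'.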

JAYRJPT commented 1 year ago

Hi Mathias, thanks for your suggestion. My issue is resolved: I removed the first data line of the VCF file, which had no AD information.
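The fix described above can also be scripted. Below is a minimal Python sketch that drops every data record whose FORMAT column has no AD key, not only the first one. The function name `drop_records_without_ad` is hypothetical, and the sketch assumes an uncompressed, tab-delimited VCF with FORMAT as the ninth column. Whether RNAseqCNV inspects only the first record or all of them is not stated in this thread, so filtering every record is a cautious assumption.

```python
def drop_records_without_ad(lines):
    """Yield VCF lines, skipping data records whose FORMAT column lacks AD.

    Header lines (starting with '#') are passed through unchanged.
    Assumption: fields are tab-delimited and FORMAT is column 9.
    """
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # fields[8] is FORMAT, e.g. "GT:AD:DP:GQ:PL:SB"
        if len(fields) > 8 and "AD" in fields[8].split(":"):
            yield line
```

To use it, open the input and output files and write each yielded line to the output, then point the metadata file at the filtered VCF. In the excerpt posted above, only the record at chr1:14579 (FORMAT GT:PGT:PID:PS) would be dropped.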