chapmanb / bcbio.variation

Toolkit to analyze genomic variation data, built on the GATK with Clojure

bcbio.variation-0.2.1 standalone errors #26

Open ssaif opened 9 years ago

ssaif commented 9 years ago

I downloaded the most recent standalone version of bcbio.variation (https://github.com/chapmanb/bcbio.variation) to generate a summary of concordance between two call sets, and I am encountering some issues.

I ran the command following the usage shown at https://github.com/chapmanb/bcbio.variation and am getting errors that point to the VCF file formatting.

Command - java -jar /group/ngs/src/bcbio-utils/bcbio.variation-0.2.1-standalone.jar variant-utils comparetwo /ngs/oncology/analysis/dev/Dev_0071_IDT_RR_AA/141216_IDT_FFPE_16ng_5_AA/141216_D00443_0100_BHAJC1ADXX/bcbio/final/NA12878/var/NA12878-vardict.vcf /ngs/oncology/datasets/external/EXT_001_NA12878/GIAB/GiaB_NIST_v2.17/GiaB_NIST_v2.17_hg19.vcf /ngs/reference_data/genomes/Hsapiens/hg19/seq/hg19.fa /ngs/reference_data/genomes/Hsapiens/hg19/bed/Xgen-PanCancer.bed

Error - Exception in thread "main" htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: Your input file has a malformed header: We never saw the required CHROM header line (starting with one #) for the input VCF file, for input source: /ngs/oncology/datasets/external/EXT_001_NA12878/GIAB/GiaB_NIST_v2.17/GiaB_NIST_v2.17_hg19.vcf

I ran bcftools on this VCF (bcftools view /ngs/oncology/datasets/external/EXT_001_NA12878/GIAB/GiaB_NIST_v2.17/GiaB_NIST_v2.17_hg19.vcf) and got:

fileformat=VCFv4.1

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT

[bcf_sync] incorrect number of fields (0 != 5) at 0:0

I see the same for the vcf with my calls.

Thanks for your help, Sakina

chapmanb commented 9 years ago

Sakina; it sounds like something is wrong with your Genome in a Bottle file, since both GATK and bcftools are complaining about it. You could use the one we distribute with bcbio:

https://s3.amazonaws.com/bcbio/giab/GiaB_NIST_RTG_v0_2.vcf.gz
https://s3.amazonaws.com/bcbio/giab/GiaB_NIST_RTG_v0_2.vcf.gz.tbi
https://s3.amazonaws.com/bcbio/giab/GiaB_NIST_RTG_v0_2_regions.bed

to see if that helps at all with the issue. Hope this helps some.
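
Before rerunning the comparison, it may be worth checking each VCF for the header lines the error message refers to. The following is a minimal Python sketch, not part of bcbio.variation or GATK; the function name `check_vcf_header` and the wording of the reported problems are made up for illustration. It assumes the file is plain text or gzip/bgzip compressed, and it only looks for a `##fileformat=` meta line and a tab-delimited `#CHROM` column line. It does not validate the records themselves.

```python
import gzip

# The eight fixed columns that the VCF column header line must start with.
REQUIRED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def check_vcf_header(path):
    """Return a list of header problems found in a VCF file (empty if none)."""
    # Detect gzip/bgzip compression from the magic bytes rather than the extension.
    with open(path, "rb") as raw:
        is_gzipped = raw.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzipped else open

    problems = []
    saw_fileformat = False
    saw_chrom = False
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if line.startswith("##"):
                # Meta-information line; remember whether fileformat was declared.
                if line.startswith("##fileformat="):
                    saw_fileformat = True
                continue
            if line.startswith("#"):
                # A line with a single leading '#' is treated as the column header.
                saw_chrom = True
                if line.split("\t")[:8] != REQUIRED_COLUMNS:
                    problems.append(
                        "column header line is not tab-delimited "
                        "or has unexpected column names"
                    )
            # The header ends at the first line that is not a '##' meta line.
            break

    if not saw_fileformat:
        problems.insert(0, "no '##fileformat=' line found")
    if not saw_chrom:
        problems.append("no '#CHROM' column header line found")
    return problems
```

Calling `check_vcf_header` on the Genome in a Bottle file and on the VCF with your own calls returns an empty list when both header lines look as expected, and a list of short descriptions otherwise.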