Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

WARNING: Chromosome ? not found in annotation sources or synonyms on line ? #1662

Closed jooY-Jeong closed 2 months ago

jooY-Jeong commented 2 months ago

Describe the issue

Hello, I can't fix the error below. Is it caused by VEP not recognizing the chromosome names? I have spent two days trying to fix it without success, so I would appreciate any help.

Additional information

I ran the commands below from the VS Code terminal. All the files needed for the vcf2maf command are stored on the machine where I use that terminal, and VEP is in a Docker image, because VEP is not compatible with my VS Code terminal.

To resolve the error, I compared the chromosome names in the VCF file with those in the VEP cache data. The two look the same, as shown below.


jjy@vetbio3:/home/ext/Extended_02/JJY/WES/9_Genome_MuSic2/MAF/VEP_cache/canis_lupus_familiaris/104_CanFam3.1$ ls
1   8               AAEX03023213.1  AAEX03025054.1  JH373243.1  JH373344.1  JH373451.1  JH373599.1  JH373787.1  JH374072.1
10  9               AAEX03023233.1  AAEX03025101.1  JH373247.1  JH373345.1  JH373453.1  JH373600.1  JH373805.1  JH374073.1
11  AAEX03019307.1  AAEX03023275.1  AAEX03025102.1  JH373249.1  JH373351.1  JH373456.1  JH373601.1  JH373813.1  JH374076.1
12  AAEX03019491.1  AAEX03023286.1  AAEX03025105.1  JH373251.1  JH373362.1  JH373457.1  JH373607.1  JH373817.1  JH374077.1
13  AAEX03019691.1  AAEX03023357.1  AAEX03025132.1  JH373252.1  JH373363.1  JH373461.1  JH373616.1  JH373826.1  JH374091.1
14  AAEX03020054.1  AAEX03023381.1  AAEX03025136.1  JH373253.1  JH373368.1  JH373463.1  JH373619.1  JH373841.1  JH374093.1
⁝


(vetbio) jjy@vetbio3:/home/ext/Extended_02/JJY/WES/6_strelka2_result/merged_VCF$ grep -v "^#" 4_total.vcf | cut -f1 | uniq
38
X
JH373253.1
JH373297.1
JH373416.1
JH373435.1
AAEX03022391.1
JH374139.1
AAEX03025158.1
1
2
3
4
14
15
16
17
MT
JH373233.1
JH373234.1
JH373235.1
⁝
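The comparison above was done by eye; a minimal sketch of doing it programmatically follows. It assumes the cache directory holds one subdirectory per sequence name (as the `ls` listing suggests) and that the VCF is uncompressed. The function names are hypothetical helpers, not part of VEP or vcf2maf.

```python
from pathlib import Path


def vcf_contigs(vcf_path):
    """Return the set of values in column 1 of the non-header VCF lines."""
    names = set()
    with open(vcf_path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip header lines and blank lines
            names.add(line.rstrip("\n").split("\t", 1)[0])
    return names


def cache_contigs(cache_dir):
    """Return the names of the subdirectories of the cache directory.

    Assumption: each subdirectory is named after one sequence,
    as in the `ls` output of the issue.
    """
    return {p.name for p in Path(cache_dir).iterdir() if p.is_dir()}


def missing_from_cache(vcf_path, cache_dir):
    """Sorted VCF contig names that have no matching cache subdirectory."""
    return sorted(vcf_contigs(vcf_path) - cache_contigs(cache_dir))
```

Called as `missing_from_cache("4_total.vcf", "VEP_cache/canis_lupus_familiaris/104_CanFam3.1")` (paths shortened here for illustration), an empty list would indicate that every name in the VCF has a cache directory, which is what the manual comparison suggested.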


Next, I checked the chr_synonyms.txt file inside the VEP cache data folder.

NW_003727748.1	cont3.24343
AAEX03025784.1	NW_003729067.1
AAEX03013651.1	cont3.13651
AAEX03025432.1	NW_003728724.1
AAEX03018960.1	cont3.18960
AAEX03018800.1	cont3.18800
AAEX03019763.1	cont3.19763
JH373384.1	NW_003726302.1
AAEX03004528.1	cont3.4528
AAEX03010699.1	cont3.10699
AAEX03018406.1	cont3.18406
AAEX03014161.1	cont3.14161
AAEX03026331.1	cont3.26331
AAEX03014944.1	cont3.14944
AAEX03007066.1	cont3.7066
NW_003728169.1	cont3.24837
NW_003727695.1	cont3.24271
AAEX03012270.1	cont3.12270
AAEX03015426.1	cont3.15426
AAEX03009397.1	cont3.9397
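To check whether a name from the VCF could be reached through the synonyms file, something like the following sketch may help. It assumes each line of `chr_synonyms.txt` holds two whitespace-separated names, as in the excerpt, and it treats the pairing as symmetric; how VEP itself interprets the file is not confirmed here. `load_synonyms` and `resolves` are hypothetical helper names.

```python
def load_synonyms(path):
    """Map each name to the set of names paired with it in the file.

    Assumption: two whitespace-separated names per line; lines with
    fewer than two fields are skipped.
    """
    synonyms = {}
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2:
                continue
            first, second = fields[0], fields[1]
            synonyms.setdefault(first, set()).add(second)
            synonyms.setdefault(second, set()).add(first)
    return synonyms


def resolves(name, known_names, synonyms):
    """True if `name`, or one of its listed synonyms, is in `known_names`."""
    if name in known_names:
        return True
    return bool(synonyms.get(name, set()) & set(known_names))
```

Here `known_names` would be the set of sequence names present in the cache. A name such as `1` that matches a cache directory directly resolves without any synonym entry.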

System

Full VEP command line

docker run --rm \
  -e PERL5LIB=/opt/vep/src/ensembl-vep \
  -v /home/ext/Extended_02/JJY/vcf2maf-1.6.21:/data/scripts \
  -v /home/ext/Extended_02/JJY/WES:/data/WES \
  vep-with-samtools-104.0 perl /data/scripts/vcf2maf.pl \
  --input-vcf /data/WES/6_strelka2_result/merged_VCF/4_total.vcf \
  --output-maf /data/WES/9_Genome_MuSic2/MAF/MAF/4.maf \
  --tumor-id TUMOR --normal-id NORMAL \
  --ref-fasta /data/WES/1_CanFam3.1_dog_reference_genome/DNA_extract_fai_dict/Canis_lupus_familiaris.CanFam3.1.dna.toplevel.fa \
  --vep-path /opt/vep/src/ensembl-vep \
  --vep-data /data/WES/9_Genome_MuSic2/MAF/VEP_cache/canis_lupus_familiaris/104_CanFam3.1 \
  --species canis_lupus_familiaris \
  --ncbi-build CanFam3.1 \
  --cache-version 104 \
  --verbose

Full error message

STATUS: Preprocessing /data/WES/6_strelka2_result/merged_VCF/4_total.vcf: split SV breakpoints before passing to VEP...
STATUS: Pulling flanking reference bps for checks...
STATUS: Splitting loci into smaller chunks to run separately...
STATUS: Reporting any problems on variant loci and reference alleles...
STATUS: Running VEP and writing to: /data/WES/6_strelka2_result/merged_VCF/4_total.vep.vcf
STATUS: Running this VEP command:
/usr/bin/perl /opt/vep/src/ensembl-vep/vep --species \
canis_lupus_familiaris --assembly CanFam3.1 --no_stats --buffer_size \
5000 --ccds --uniprot --hgvs --symbol --numbers --domains \
--gene_phenotype --canonical --protein --biotype --uniprot --tsl \
--variant_class --shift_hgvs 1 --check_existing --total_length \
--allele_number --no_escape --xref_refseq --failed 1 --vcf \
--flag_pick_allele --pick_order canonical,tsl,biotype,rank,ccds,length \
--dir \
/data/WES/9_Genome_MuSic2/MAF/VEP_cache/canis_lupus_familiaris/104_CanFam3.1 \
--fasta \
/data/WES/1_CanFam3.1_dog_reference_genome/DNA_extract_fai_dict/Canis_lupus_familiaris.CanFam3.1.dna.toplevel.fa \
--format vcf --input_file \
/data/WES/6_strelka2_result/merged_VCF/4_total.vcf --output_file \
/data/WES/6_strelka2_result/merged_VCF/4_total.vep.vcf --offline \
--pubmed --fork 4 --cache_version 104 --regulatory
WARNING: Chromosome 1 not found in annotation sources or synonyms on line 1
WARNING: Chromosome 2 not found in annotation sources or synonyms on line 14
WARNING: Chromosome 3 not found in annotation sources or synonyms on line 23
WARNING: Chromosome 4 not found in annotation sources or synonyms on line 30
WARNING: Chromosome 5 not found in annotation sources or synonyms on line 33
WARNING: Chromosome 6 not found in annotation sources or synonyms on line 40
WARNING: Chromosome 7 not found in annotation sources or synonyms on line 49
WARNING: Chromosome 8 not found in annotation sources or synonyms on line 52
WARNING: Chromosome 9 not found in annotation sources or synonyms on line 65
WARNING: Chromosome 10 not found in annotation sources or synonyms on line 75
WARNING: Chromosome 11 not found in annotation sources or synonyms on line 85
WARNING: Chromosome 12 not found in annotation sources or synonyms on line 92
WARNING: Chromosome 13 not found in annotation sources or synonyms on line 97
WARNING: Chromosome 14 not found in annotation sources or synonyms on line 105
WARNING: Chromosome 15 not found in annotation sources or synonyms on line 107
WARNING: Chromosome 16 not found in annotation sources or synonyms on line 113
WARNING: Chromosome 17 not found in annotation sources or synonyms on line 120
WARNING: Chromosome 18 not found in annotation sources or synonyms on line 127
WARNING: Chromosome 19 not found in annotation sources or synonyms on line 140
WARNING: Chromosome 20 not found in annotation sources or synonyms on line 143
WARNING: Chromosome 21 not found in annotation sources or synonyms on line 146
WARNING: Chromosome 22 not found in annotation sources or synonyms on line 148
WARNING: Chromosome 23 not found in annotation sources or synonyms on line 149
WARNING: Chromosome 24 not found in annotation sources or synonyms on line 154
WARNING: Chromosome 25 not found in annotation sources or synonyms on line 161
WARNING: Chromosome 26 not found in annotation sources or synonyms on line 167
WARNING: Chromosome 27 not found in annotation sources or synonyms on line 172
WARNING: Chromosome 28 not found in annotation sources or synonyms on line 174
WARNING: Chromosome 29 not found in annotation sources or synonyms on line 179
WARNING: Chromosome 30 not found in annotation sources or synonyms on line 184
WARNING: Chromosome 31 not found in annotation sources or synonyms on line 189
WARNING: Chromosome 32 not found in annotation sources or synonyms on line 190
WARNING: Chromosome 33 not found in annotation sources or synonyms on line 191
WARNING: Chromosome 34 not found in annotation sources or synonyms on line 192
WARNING: Chromosome 35 not found in annotation sources or synonyms on line 194
WARNING: Chromosome 36 not found in annotation sources or synonyms on line 200
WARNING: Chromosome 37 not found in annotation sources or synonyms on line 204
WARNING: Chromosome 38 not found in annotation sources or synonyms on line 209
WARNING: Chromosome X not found in annotation sources or synonyms on line 212
WARNING: Chromosome JH373253.1 not found in annotation sources or synonyms on line 215
WARNING: Chromosome JH373297.1 not found in annotation sources or synonyms on line 217
WARNING: Chromosome JH373416.1 not found in annotation sources or synonyms on line 218
WARNING: Chromosome JH373435.1 not found in annotation sources or synonyms on line 219
WARNING: Chromosome AAEX03022391.1 not found in annotation sources or synonyms on line 220
WARNING: Chromosome JH374139.1 not found in annotation sources or synonyms on line 221
WARNING: Chromosome AAEX03025158.1 not found in annotation sources or synonyms on line 222
WARNING: Chromosome MT not found in annotation sources or synonyms on line 109577
WARNING: Chromosome JH373233.1 not found in annotation sources or synonyms on line 109612
WARNING: Chromosome JH373234.1 not found in annotation sources or synonyms on line 109687
WARNING: Chromosome JH373235.1 not found in annotation sources or synonyms on line 109728
WARNING: Chromosome JH373236.1 not found in annotation sources or synonyms on line 109766
⁝
⁝
WARNING: Chromosome AAEX03025927.1 not found in annotation sources or synonyms on line 111974
STATUS: Finished with vep...
STATUS: Parsing variants in annotated VCF...
STATUS: For any SVs, backfilling Fusion column with gene-pair names...
STATUS: Finished! Check results in /data/WES/9_Genome_MuSic2/MAF/MAF/4.maf
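With this many warnings it can help to pull the affected names out of the log. Below is a minimal sketch that matches the warning text exactly as it appears in the output above; the function name is hypothetical, and the pattern tolerates line wraps inside a warning.

```python
import re

# Matches the warning wording shown in the log; \s+ allows for wrapped lines.
WARNING_RE = re.compile(
    r"WARNING:\s+Chromosome\s+(\S+)\s+not\s+found\s+in\s+annotation\s+"
    r"sources\s+or\s+synonyms\s+on\s+line\s+(\d+)"
)


def missing_chromosomes(log_text):
    """Return (chromosome, input_line_number) pairs, in order of appearance."""
    return [(m.group(1), int(m.group(2))) for m in WARNING_RE.finditer(log_text)]
```

If the resulting names include ones that do exist as cache directories (for example `1`), that may point to how the cache is being located rather than to a naming mismatch, though this is only an inference from the log and not something the thread confirms.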

olaaustine commented 2 months ago

Hi @jooY-Jeong,

I hope this finds you well. The instructions for using Docker with VEP are here. To mount the cache resources volume, I suggest running it this way:

docker run -v $HOME/vep_data:/data ensemblorg/ensembl-vep \
  vep --cache --offline --format vcf --vcf --force_overwrite \
  --input_file input/my_input.vcf \
  --output_file output/my_output.vcf \
  --custom file=custom/my_extra_data.bed,short_name=BED_DATA,format=bed,type=exact,coords=1 \
  --plugin NMD

I also suggest running VEP in Docker first; its output can then be used with vcf2maf. Please let us know if this helps.

Thank you,
Ola.

jooY-Jeong commented 2 months ago

Thank you for helping me again.

I'll try it as you said, and if there's another error, I'll be back. Thank you!