Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

can not call method "seq" #1615

Open Marije179 opened 4 months ago

Marije179 commented 4 months ago

Hi, I am trying to run VEP, but I keep getting this issue (and output file is empty):

Can't call method "seq" on an undefined value at /home/WUR/steen179/miniforge3/share/ensembl-vep-111.0-0/Bio/EnsEMBL/Transcript.pm line 828, <__ANONIO__> line 379.

Does anyone know what this error means and how to solve it?

nuno-agostinho commented 4 months ago

Hi @Marije179,

Thank you for reporting this issue.

Could you please tell me what command you are using to run VEP? If possible, also send a sample of the input variants that reproduces the error. Thanks!

Best regards, Nuno

Marije179 commented 4 months ago

Hi Nuno,

Thank you for your fast response.

I am using this command to run VEP: vep -i syri.vcf --gff .gff3.gz --fasta .fasta -o syri_VEP.vcf

I use a vcf file generated with syri.

The GFF3 file is sorted and indexed with these commands (as provided on ensembl-vep):

grep -v "#" data.gff | sort -k1,1 -k4,4n -k5,5n -t$'\t' | bgzip -c > data.gff.gz
tabix -p gff data.gff.gz

The GFF3 file is a pseudo-annotation of an animal breed assembly FASTA I generated (specified in --fasta).

It is not possible to send the input variants.

Thank you in advance!

nuno-agostinho commented 4 months ago

Hey @Marije179,

Your command seems fine. Is the error message complete? Does it only show that?

From what I can understand, while running VEP with your files, there may be a transcript whose sequence cannot be retrieved. Please do the following:

  1. Check if the FASTA file is as expected for the coordinates of some transcripts specified in the GFF annotation.
  2. Check if the chromosome names are the same between the GFF and FASTA files.
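Both checks can be scripted. The following is a minimal Python sketch, not part of VEP: the function names are hypothetical, and it assumes that transcripts are annotated as `transcript` or `mRNA` features and that the sequence name is the first word of each FASTA header. It lists features whose sequence name is missing from the FASTA or whose coordinates fall outside the sequence.

```python
import gzip


def open_text(path):
    """Open a plain or gzip/bgzip-compressed text file for reading."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def fasta_lengths(fasta_path):
    """Return {sequence name: length}; the name is the first word after '>'."""
    lengths = {}
    name = None
    with open_text(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                parts = line[1:].split()
                name = parts[0] if parts else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def check_gff_against_fasta(gff_path, fasta_path,
                            feature_types=("transcript", "mRNA")):
    """List (line number, seqid, reason) for features that cannot map to the FASTA."""
    lengths = fasta_lengths(fasta_path)
    problems = []
    with open_text(gff_path) as handle:
        for line_no, line in enumerate(handle, start=1):
            if line.startswith("#") or not line.strip():
                continue
            cols = line.rstrip("\n").split("\t")
            # Assumption: only the feature types listed above are of interest.
            if len(cols) < 9 or cols[2] not in feature_types:
                continue
            seqid, start, end = cols[0], int(cols[3]), int(cols[4])
            if seqid not in lengths:
                problems.append((line_no, seqid, "seqid not in FASTA"))
            elif start < 1 or end > lengths[seqid]:
                problems.append((line_no, seqid, "coordinates outside sequence"))
    return problems
```

An empty list from `check_gff_against_fasta(...)` only means that names and coordinate ranges are consistent; it does not rule out other causes of the error.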

Please report your findings to me, and let me know if there is anything else to add. Thanks!

Cheers, Nuno

Marije179 commented 4 months ago

Hi Nuno,

It also gives me these warning messages (a few more warning messages for some types that are not supported, but just to show you the short version):

WARNING: line 309 skipped (Chr location type N . PASS END=...): HDR type is not supported
WARNING: line 310 skipped (Chr location type N . PASS END=...): SYNAL type is not supported
WARNING: line 311 skipped (Chr location type N . PASS END=...): NOTAL type is not supported
WARNING: Ignoring non-supported 'three_prime_UTR' feature_type from NAMEFILE.gff3.gz
WARNING: Ignoring non-supported 'five_prime_UTR' feature_type from NAMEFILE.gff3.gz
Can't call method "seq" on an undefined value at /home/WUR/steen179/miniforge3/share/ensembl-vep-111.0-0/Bio/EnsEMBL/Transcript.pm line 828, <__ANONIO__> line 379.

I checked the things you suggested: the sequences in the reference FASTA match those at the GFF3 coordinates, and the chromosome names are the same in the FASTA and GFF3 files.

nuno-agostinho commented 4 months ago

Hi @Marije179,

That's okay. I suppose you are running a list of variants as input (at least 311 from what I can see in your warnings).

Could you try creating small files and running VEP with them? This would allow you to narrow down which variant is problematic. You can split your input file using split -l 50 syri.vcf syri.vcf_, which will split your input into small files of 50 lines each.
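Note that `split -l` counts lines, so any `#` header lines end up only in the first chunk. As an alternative, here is a minimal Python sketch (the function name `split_vcf` and the chunk naming scheme are hypothetical) that writes at most 50 records per chunk and copies the header lines into every chunk:

```python
from pathlib import Path


def split_vcf(vcf_path, records_per_chunk=50, out_prefix=None):
    """Write chunks of at most `records_per_chunk` records, each with the full header.

    Returns the list of chunk paths. The default prefix mirrors the
    `syri.vcf_` prefix used with `split` in this thread.
    """
    vcf_path = Path(vcf_path)
    prefix = out_prefix or str(vcf_path) + "_"
    header, records = [], []
    with open(vcf_path) as handle:
        for line in handle:
            if not line.strip():
                continue  # ignore blank lines
            (header if line.startswith("#") else records).append(line)

    chunk_paths = []
    for i in range(0, len(records), records_per_chunk):
        chunk_path = Path(f"{prefix}{i // records_per_chunk:04d}.vcf")
        with open(chunk_path, "w") as out:
            out.writelines(header)
            out.writelines(records[i:i + records_per_chunk])
        chunk_paths.append(chunk_path)
    return chunk_paths
```

Each chunk can then be passed to VEP with the same options as the full file; the chunk that reproduces the error contains at least one problematic record.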

Unfortunately, I cannot do much more without knowing more about what is wrong, so do tell me if you find something interesting.

Cheers, Nuno

Marije179 commented 4 months ago

Thank you.

I see that this variant is causing the problem:

Chr1 location INV338 N . PASS END=location;ChrB=Chr1;StartB=location;EndB=location;Parent=.;VarType=SR;DupType=-

When I then remove that variant from the vcf file, it runs fine, but when I add more variants to the VCF file, another variant is causing problems:

Chr1 location DUP696 N . PASS END=location;ChrB=Chr1;StartB=location;EndB=location;Parent=.;VarType=SR;DupType=copygain

The same happens here: when I remove this variant and run VEP on the same VCF file, it runs fine, but when I add more variants, yet another variant disrupts the annotation...
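Since removing one failing record only exposes the next, it may help to list all failing records in one pass. The sketch below is one hedged way to do that in Python: it runs each record on its own and collects those for which the run fails. The function names are hypothetical, `vep_runs_ok` reuses only the options from the command posted earlier in this thread, and it assumes that VEP exits with a non-zero status when it hits this error.

```python
import subprocess
import tempfile
from pathlib import Path


def vep_runs_ok(vcf_path, gff, fasta):
    """Run VEP on one file; True if the exit status is 0 (assumed to mean success)."""
    out_path = str(vcf_path) + ".vep.out"  # unique per input file
    cmd = ["vep", "-i", str(vcf_path), "--gff", gff, "--fasta", fasta, "-o", out_path]
    return subprocess.run(cmd, capture_output=True).returncode == 0


def find_failing_records(vcf_path, runs_ok):
    """Return (line number, record) for every record that fails when run alone.

    `runs_ok` is any callable that takes a VCF path and returns True or False.
    """
    header, records = [], []
    with open(vcf_path) as handle:
        for line_no, line in enumerate(handle, start=1):
            if line.startswith("#"):
                header.append(line)
            elif line.strip():
                records.append((line_no, line))

    failing = []
    with tempfile.TemporaryDirectory() as tmp:
        for line_no, record in records:
            single = Path(tmp) / f"record_{line_no}.vcf"
            single.write_text("".join(header) + record)
            if not runs_ok(single):
                failing.append((line_no, record.rstrip("\n")))
    return failing
```

A call could look like `find_failing_records("syri.vcf", lambda p: vep_runs_ok(p, "data.gff.gz", "genome.fasta"))`, where the GFF and FASTA file names are placeholders. Launching VEP once per record is slow, so for a large VCF it may be more practical to apply this to the chunks that fail rather than to the whole file.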