Closed Heredia-Maria closed 1 year ago
Hi @Heredia-Maria,
Thank you for your query.
Please can you confirm what release version of VEP you are using and the assembly?
Thank you very much, Ola.
Hi @olaaustine.
I'm using release 109 and GRCh38. Thanks for your quick response.
María.
Hi @Heredia-Maria,
Thank you for your response.
From release 109, you can prioritize mane_plus_clinical by changing the pick order.
According to our documentation here, --pick_order can be used to customize the criteria when using VEP:
--pick --pick_order mane_plus_clinical,mane_select,canonical,appris,tsl,biotype,ccds,rank,length
Let us know if this solves the problem. Thank you, Ola.
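To illustrate why the position of a criterion in the list matters, here is a minimal Python sketch of ordered tie-breaking. It is a simplified model for intuition only, not VEP's actual implementation: the transcript records, the `pick_transcript` helper, and the reduced criteria list are all hypothetical.

```python
# Simplified model of ordered tie-breaking; NOT VEP's real selection code.
# Criteria earlier in the list are compared first; later ones only break ties.
PICK_ORDER = ["mane_plus_clinical", "mane_select", "canonical", "length"]


def pick_transcript(transcripts, pick_order=PICK_ORDER):
    """Return the transcript that wins under the ordered criteria."""
    def sort_key(transcript):
        key = []
        for criterion in pick_order:
            if criterion == "length":
                # Longer transcripts rank first, so negate the length.
                key.append(-transcript.get("length", 0))
            else:
                # A transcript carrying the flag ranks ahead of one without it.
                key.append(0 if transcript.get(criterion) else 1)
        return tuple(key)

    return min(transcripts, key=sort_key)


# Hypothetical transcripts, invented for illustration.
example = [
    {"id": "tx_A", "mane_select": True, "canonical": True, "length": 2000},
    {"id": "tx_B", "mane_plus_clinical": True, "length": 1500},
    {"id": "tx_C", "length": 3000},
]

print(pick_transcript(example)["id"])
print(pick_transcript(example, ["mane_select", "canonical", "length"])["id"])
```

With `mane_plus_clinical` listed first, the hypothetical `tx_B` is chosen; with it absent from the order, `tx_A` wins on `mane_select`.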
Hi Ola. I will try as soon as possible and give you feedback.
Thank you, María.
Hi Ola.
The prioritization is now working for most genes! However, I still have some issues. For example, in the case of NOX1 "ENSG00000007952", which has two gold isoforms in Ensembl, the one prioritized is not the gold MANE isoform but the other one. Why is this happening? Finally, my main objective would be to annotate the variants on the same sequences that are collected in UniProt, but I am not succeeding.
Thank you very much, María.
Hi @Heredia-Maria,
Thank you very much for letting us know.
To better investigate the issue, which transcripts are the two gold isoforms in Ensembl?
The gene NOX1 in the example has the transcript ENST00000372966.
Thank you very much, Ola.
Hi @olaaustine,
The two gold isoforms in Ensembl are: ENST00000372966.8 (MANE) and ENST00000217885.5. The isoforms on which my variants have been annotated are: ENST00000217885 (16 variants) and ENST00000372960 (2 variants). The MANE Select is not among them...
Thank you very much. María.
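A tally like the one above can be produced directly from VEP's tab-delimited output. The sketch below assumes the usual `--tab` layout: metadata lines starting with `##`, a header line starting with `#Uploaded_variation`, and a `Feature` column holding the transcript ID. The `count_features` helper and the sample rows are made up for illustration.

```python
import csv
from collections import Counter


def count_features(lines, column="Feature"):
    """Count how many output rows were annotated on each transcript."""
    # Skip "##" metadata lines; the single-"#" header line is kept.
    rows = (ln.rstrip("\n") for ln in lines if not ln.startswith("##"))
    reader = csv.reader(rows, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader, None)
    if header is None:
        return Counter()
    header[0] = header[0].lstrip("#")  # "#Uploaded_variation" -> "Uploaded_variation"
    idx = header.index(column)
    return Counter(row[idx] for row in reader if len(row) > idx)


# Made-up rows in the assumed layout (not real VEP output).
sample = [
    "## ENSEMBL VARIANT EFFECT PREDICTOR\n",
    "#Uploaded_variation\tLocation\tFeature\n",
    "var1\tX:100\tENST00000217885\n",
    "var2\tX:200\tENST00000217885\n",
    "var3\tX:300\tENST00000372960\n",
]

for transcript, n in count_features(sample).most_common():
    print(transcript, n)
```

For a real run, an open file handle can be passed instead of the list, for example `count_features(open("NMD_20230213_output.txt"))`.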
Hi @Heredia-Maria,
Thank you very much for your response.
To try to recreate the issue, could you please share the command used and an example variant for this case, if that's possible?
Thank you very much, Ola.
Hi @olaaustine
The command used is the following: ./vep --offline --cache --fa Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz --format vcf --tab -i ./../Escritorio/DataBasesData/clinvar_20230213.vcf --show_ref_allele --total_length --mane --verbose --variant_class --force_overwrite --hgvs --symbol --uniprot --gencode_basic --canonical --biotype --exclude_predicted --no_intergenic --protein --shift_3prime 1 --pick_order biotype,ccds,rank --pick --custom clinvar.vcf.gz,ClinVar,vcf,exact,0,CLNSIG,CLNDN -o NMD_20230213_output.txt -plugin Downstream -plugin ProteinSeqs,references.fa,mutated.fa -plugin NMD
The input file is the full set of variants downloaded from ClinVar. I attach here a VCF with the variants just for NOX1: NOX1.txt
Thank you very much, María.
Hi @Heredia-Maria,
Looking at your command, the --pick_order should be --pick_order mane_plus_clinical,mane_select,canonical,appris,tsl,biotype,ccds,rank,length, which prioritizes MANE.
More about the gold isoforms in Ensembl can be seen here. Please let us know if this helps. Thank you, Ola.
Hi @olaaustine,
I pasted an older version of my command here, but I ran the correct one in the terminal:
./vep --offline --cache --fa Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz --format vcf --tab -i ./../Escritorio/DataBasesData/clinvar_20230213.vcf --show_ref_allele --total_length --mane --verbose --variant_class --force_overwrite --hgvs --symbol --uniprot --gencode_basic --canonical --biotype --exclude_predicted --no_intergenic --protein --shift_3prime 1 --pick --pick_order mane_plus_clinical,mane_select,canonical,appris,tsl,ccds,rank,length --custom clinvar.vcf.gz,ClinVar,vcf,exact,0,CLNSIG,CLNDN -o NMD_20230213_output_prioritization.txt -plugin Downstream -plugin ProteinSeqs,references.fa,mutated.fa -plugin NMD
I noticed I forgot the "biotype" criterion. I'm going to repeat the process with it included and see...
Sorry for the inconvenience. Thank you, María.
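A quick way to catch this kind of slip is to extract the `--pick_order` value from a command string and compare it with the order suggested earlier in this thread. The following is a rough sketch: `pick_order_from_command` and `missing_criteria` are hypothetical helpers, and the command strings are abbreviated versions of the ones posted above.

```python
import shlex

# Order suggested earlier in this thread.
RECOMMENDED = (
    "mane_plus_clinical,mane_select,canonical,appris,tsl,"
    "biotype,ccds,rank,length"
).split(",")


def pick_order_from_command(command):
    """Return the --pick_order criteria found in a command line, or None."""
    tokens = shlex.split(command)
    for i, token in enumerate(tokens):
        if token == "--pick_order" and i + 1 < len(tokens):
            return tokens[i + 1].split(",")
        if token.startswith("--pick_order="):
            return token.split("=", 1)[1].split(",")
    return None


def missing_criteria(command, recommended=RECOMMENDED):
    """List recommended criteria that the command's pick order leaves out."""
    used = pick_order_from_command(command) or []
    return [c for c in recommended if c not in used]


# Abbreviated forms of the two commands posted in this thread.
old_cmd = "./vep --offline --cache --pick_order biotype,ccds,rank --pick"
new_cmd = (
    "./vep --offline --cache --pick --pick_order "
    "mane_plus_clinical,mane_select,canonical,appris,tsl,ccds,rank,length"
)

print(missing_criteria(old_cmd))
print(missing_criteria(new_cmd))
```

The check only compares names against the suggested list; it does not validate them against the criteria VEP itself accepts.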
Hi @olaaustine
I was also working with the old output!! Everything is working properly, even without the biotype criterion. Anyway, I will try using it and compare both outputs.
Thank you so much for your help and your patience. Kind regards, María.
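For comparing two runs, one option is to load each tab-delimited output into a mapping from variant to picked transcript and list the variants whose transcript differs. This is only a sketch: it assumes `--tab` output with `Uploaded_variation` and `Feature` columns and one row per variant (as with `--pick`), and the helper names and sample rows are invented for illustration.

```python
import csv


def feature_by_variant(lines, key="Uploaded_variation", column="Feature"):
    """Map each variant identifier to the transcript it was annotated on."""
    # Skip "##" metadata lines; the single-"#" header line is kept.
    rows = (ln.rstrip("\n") for ln in lines if not ln.startswith("##"))
    reader = csv.reader(rows, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader, None)
    if header is None:
        return {}
    header[0] = header[0].lstrip("#")
    k, c = header.index(key), header.index(column)
    return {row[k]: row[c] for row in reader if len(row) > max(k, c)}


def changed_picks(old_lines, new_lines):
    """Return {variant: (old_transcript, new_transcript)} where the pick changed."""
    old = feature_by_variant(old_lines)
    new = feature_by_variant(new_lines)
    shared = sorted(old.keys() & new.keys())
    return {v: (old[v], new[v]) for v in shared if old[v] != new[v]}


# Made-up rows in the assumed layout (not real VEP output).
old_run = [
    "#Uploaded_variation\tLocation\tFeature\n",
    "var1\tX:100\tENST00000217885\n",
    "var2\tX:200\tENST00000372960\n",
]
new_run = [
    "#Uploaded_variation\tLocation\tFeature\n",
    "var1\tX:100\tENST00000372966\n",
    "var2\tX:200\tENST00000372960\n",
]

print(changed_picks(old_run, new_run))
```

With real files, open handles can be passed in place of the lists, for example the `NMD_20230213_output.txt` and `NMD_20230213_output_prioritization.txt` files named in the commands above.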
Hi @Heredia-Maria,
Thank you very much for letting us know.
I will close this ticket now. Please feel free to open another ticket or reopen this if you have another query.
Thank you. Ola.
Dear Ensembl team: I came across an issue during the analysis of my output. I would like to prioritize variants in MANE Plus Clinical transcripts, when available, ahead of MANE isoforms. I have seen the pick_order flag in your VEP documentation. However, I cannot see any specific way to prioritize this Plus Clinical field. Valid criteria are: [ canonical appris tsl biotype ccds rank length mane ]. e.g.:
I would like to know if there is any way to do this, please.
Thanks in advance. Kind regards,
María.