Ensembl / VEP_plugins

Plugins for the Ensembl Variant Effect Predictor (VEP)
Apache License 2.0

Paralogues plugin: support 'matches' annotation #723

Closed nuno-agostinho closed 2 months ago

nuno-agostinho commented 5 months ago

ENSVAR-6290: The Paralogues plugin will now support a Tabix-indexed TSV file with pre-computed matches between genomic regions and paralogue variants.

Requires https://github.com/Ensembl/ensembl-variation/pull/1099

UPDATE 16b9915: the ensembl-variation functions were moved back into this PR, as it is too complex to separate them in the current design of both repos.

Creating a matches file

  1. Download a ClinVar VCF file.
  2. Run VEP with the Paralogues plugin options regions=1,min_perc_cov=0,min_perc_pos=0,clnsig=ignore and enable the --vcf option to return VCF output.
  3. Run the following command on the VEP output: perl -e "use Paralogues; Paralogues::prepare_matches_file('variant_effect_output.txt', 'paralogue_matches.tsv')"
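
As a rough sketch of step 2, the VEP call could be assembled as below. Only the plugin options and the --vcf flag come from the steps above. The input file name clinvar.vcf.gz, the helper names, and the assumption that a `vep` executable is on the PATH are illustrative, and any cache or database options your installation needs are left out.

```python
import shlex

# Plugin options taken from step 2 of the list above.
MATCHES_PREP_OPTIONS = {
    "regions": "1",
    "min_perc_cov": "0",
    "min_perc_pos": "0",
    "clnsig": "ignore",
}


def plugin_argument(name, options):
    """Join a plugin name and its key=value options with commas."""
    parts = [name] + [f"{key}={value}" for key, value in options.items()]
    return ",".join(parts)


def build_vep_command(input_vcf, output_file):
    """Return the VEP argument list for step 2 (hypothetical helper)."""
    return [
        "vep",  # assumes the VEP executable is on the PATH
        "--input_file", input_vcf,
        "--output_file", output_file,
        "--vcf",  # step 2 asks for VCF output
        "--plugin", plugin_argument("Paralogues", MATCHES_PREP_OPTIONS),
    ]


# clinvar.vcf.gz is a placeholder name for the downloaded ClinVar file.
command = build_vep_command("clinvar.vcf.gz", "variant_effect_output.txt")
print(shlex.join(command))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(command, check=True)`. The same `plugin_argument` helper also builds the option string for using a finished matches file.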

The resulting file will be a bgzipped, Tabix-indexed TSV matches file that can be passed to the Paralogues plugin via the matches parameter.

Testing

Try creating a matches file based on a ClinVar VCF file and using the Paralogues plugin with the resulting file.

Example plugin options in VEP:

--plugin Paralogues,clnsig=ignore,matches=/path/to/matches/file.tsv.gz

Example input variants:

1       939112  .       G       A
1       44827194        .       GT      AA
1       8871910 .       C       T
2       166073606       .       G       T
17      4833586 .       G       T
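
The column spacing in the listing above may not survive copy and paste, and VCF-style input is tab-delimited. One way to prepare a test file is sketched below, reading the five columns as the first five VCF fields (chromosome, position, ID, REF, ALT). The output file name and the helper names are hypothetical.

```python
import os
import tempfile

# The five example variants above: chromosome, position, ID, REF, ALT.
EXAMPLE_VARIANTS = """\
1       939112  .       G       A
1       44827194        .       GT      AA
1       8871910 .       C       T
2       166073606       .       G       T
17      4833586 .       G       T
"""


def to_tab_delimited(text):
    """Split each non-empty line on whitespace and rejoin the fields with tabs."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    for row in rows:
        if len(row) != 5:
            raise ValueError(f"expected 5 fields, got {len(row)}: {row}")
    return "".join("\t".join(row) + "\n" for row in rows)


def write_example_input(path):
    """Write the example variants to `path` and return the number of rows."""
    content = to_tab_delimited(EXAMPLE_VARIANTS)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    return content.count("\n")


# example_variants.vcf is a placeholder name in a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "example_variants.vcf")
n_rows = write_example_input(out_path)
print(f"wrote {n_rows} variants to {out_path}")
```

The resulting file can then be given to VEP as input together with the plugin options shown above.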

nakib103 commented 2 months ago

merged to main and release/113