eastgenomics / eggd_generate_variant_workbook

DNAnexus app for generating xlsx variant workbooks

Decipher link #121

Closed kjwinfield closed 1 year ago

kjwinfield commented 1 year ago

Summary

DECIPHER link

DECIPHER is a database of rare disease variants (https://www.deciphergenomics.org/). This update adds a link to the DECIPHER variant page for each variant reported in the workbooks. When the --decipher input is specified, an extra column is added and populated with these links. Links are only generated for build 38 variants, as DECIPHER only stores build 38 data.

Example output: DECIPHER column with links added to spreadsheet
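
As a rough illustration of the behaviour described above, the following Python sketch builds an Excel hyperlink formula for a build 38 variant and returns nothing for other builds. The function name make_decipher_link, its parameters, and the sequence-variant URL pattern are assumptions made for illustration; they are not taken from the app's code.

```python
from typing import Optional

# Assumed URL pattern for a DECIPHER variant page (not confirmed by this PR)
DECIPHER_URL = "https://www.deciphergenomics.org/sequence-variant"


def make_decipher_link(
    chrom: str, pos: int, ref: str, alt: str, build: int
) -> Optional[str]:
    """Return an Excel HYPERLINK formula for a build 38 variant, else None."""
    if build != 38:
        # DECIPHER only stores build 38 data, so no link is generated
        return None

    # Drop a leading "chr" prefix so chr1 and 1 give the same link
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]

    url = f"{DECIPHER_URL}/{chrom}-{pos}-{ref}-{alt}"
    return f'=HYPERLINK("{url}", "{url}")'
```

Returning None for build 37 lets the caller decide whether to skip the column entirely or leave the cell empty.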

Testing on DNAnexus

This version of the app was built as an applet on DNAnexus for testing. The input .vcf and the applet can be found in the project 004_220720_eggd_generate_variant_workbook_v2.2.0_testing:/decipher_link.

Two jobs were run: one with the --decipher input and one without.

with --decipher input

dx run applet-GFyK74842Zj5QJX94jxP2bGP -ivcfs=file-GFyK8yj42ZjGg1QK4vfqQQfq -idecipher=true

Job ID: job-GFyKG6042Zj7b71g4gYY2bZY

Output Excel file: (screenshot from 2022-08-17 14-19-20)

without --decipher input

dx run applet-GFyK74842Zj5QJX94jxP2bGP -ivcfs=file-GFyK8yj42ZjGg1QK4vfqQQfq

Job ID: job-GFyKf9Q42Zj47j2B4x2ZJ1fb

Output Excel file: (screenshot from 2022-08-17 14-29-23)

Tests

Tests have been added to the tests.vcf file to check:

  1. That the function to add a DECIPHER column works.
  2. That this column is only added when the --decipher input is given.
  3. That the DECIPHER column and links are not generated for build 37 VCFs, as DECIPHER only stores build 38 data.
  4. That the hyperlinks for the variants in DECIPHER are assembled correctly.
  5. That the hyperlinks for variants in gnomAD (https://gnomad.broadinstitute.org/) are correct, for both build 37 and build 38 variants.
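
A link-assembly check like the one in item 5 might look something like the pytest-style sketch below. The helper make_gnomad_link, the dataset identifiers and the example variant are hypothetical and only stand in for whatever the app actually uses.

```python
# Assumed gnomAD URL layout and per-build dataset names (illustrative only)
GNOMAD_URL = "https://gnomad.broadinstitute.org/variant"
GNOMAD_DATASETS = {37: "gnomad_r2_1", 38: "gnomad_r3"}


def make_gnomad_link(chrom: str, pos: int, ref: str, alt: str, build: int) -> str:
    """Hypothetical helper: assemble a gnomAD variant URL for a given build."""
    dataset = GNOMAD_DATASETS[build]
    return f"{GNOMAD_URL}/{chrom}-{pos}-{ref}-{alt}?dataset={dataset}"


def test_gnomad_link_build_37():
    # Build 37 variants should point at the build 37 dataset
    link = make_gnomad_link("1", 12345, "A", "G", build=37)
    assert link == f"{GNOMAD_URL}/1-12345-A-G?dataset=gnomad_r2_1"


def test_gnomad_link_build_38():
    # Build 38 variants should point at the build 38 dataset
    link = make_gnomad_link("1", 12345, "A", "G", build=38)
    assert link == f"{GNOMAD_URL}/1-12345-A-G?dataset=gnomad_r3"
```

Comparing against the full expected URL, rather than checking only a substring, catches mistakes in both the variant identifier and the dataset parameter.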

Unit tests

$ pytest
============================================================================ test session starts =============================================================================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/katherine/eggd_generate_variant_workbook
collected 42 items                                                                                                                                                           

resources/home/dnanexus/generate_workbook/tests/test_columns.py ...........                                                                                            [ 26%]
resources/home/dnanexus/generate_workbook/tests/test_filters.py .......                                                                                                [ 42%]
resources/home/dnanexus/generate_workbook/tests/test_utils.py ........                                                                                                 [ 61%]
resources/home/dnanexus/generate_workbook/tests/test_vcf.py ................                                                                                           [100%]
============================================================================ 42 passed in 30.01s =============================================================================


pep8speaks commented 1 year ago

Hello @kjwinfield! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 362:80: E501 line too long (83 > 79 characters)

Line 50:80: E501 line too long (93 > 79 characters)

Line 51:80: E501 line too long (93 > 79 characters)

Comment last updated at 2022-08-17 12:45:56 UTC