bihealth / auto-acmg

Automatic classification of sequence variants and CNVs according to ACMG criteria.
GNU General Public License v3.0

Finish `AutoPM4BP3` #120

Closed by gromdimon 2 months ago

gromdimon commented 3 months ago

Is your feature request related to a problem? Please describe.

We have had `AutoPM4BP3` since the implementation of #74. However, all the methods in this class still need to be properly implemented and then tested.

Describe the solution you'd like

Describe alternatives you've considered

N/A

Additional context

Some info on PM4 and BP3

PM4 (protein length)

Original Definition

Protein length changes due to in-frame deletions/insertions in a non-repeat region or stop-loss variants.

-- Richards et al. (2015); Table 4

Preconditions / Precomputations

Implemented Criterion

User Report

Literature

N/A

Caveats

BP3

BP3 (in-frame repetitive)

.. note::

   - We do not have proper UniProt data yet (domain / repeat).
   - Similar to RepeatMasker.
   - Probably the same for phyloP100way?

Original Definition

In-frame deletions/insertions in a repetitive region without a known function.

-- Richards et al. (2015); Table 4

Preconditions / Precomputations

Implemented Criterion

User Report

Literature

Caveats

InterVar

PM4 and BP3 by Automated Scoring

Indels and stop losses can change the length of proteins and disrupt protein function. We annotated the repeat region by using the “rmsk” database from the UCSC Genome Browser. This database was created by the RepeatMasker program, which screens DNA sequences for interspersed repeats and low-complexity DNA sequences. When the variants are “non-frameshift insertion,” “non-frameshift deletion” in the non-repeat region, or stop-loss variants, PM4 will be applied. If the variants are “non-frameshift insertion” or “non-frameshift deletion” in the repeat region, BP3 will be applied.
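The InterVar rule quoted above can be sketched as a small decision function. This is only an illustration of the quoted logic, not the actual `AutoPM4BP3` implementation: the `Variant` and `Consequence` types, the `in_repeat_region` helper, and the `predict_pm4_bp3` name are all hypothetical, and the repeat regions are assumed to be already loaded as `(chrom, start, end)` tuples with 1-based inclusive coordinates (UCSC tables such as rmsk store 0-based half-open coordinates, so a conversion would be needed first).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Consequence(Enum):
    # Hypothetical consequence labels, mirroring the terms in the InterVar text.
    INFRAME_INSERTION = "non-frameshift insertion"
    INFRAME_DELETION = "non-frameshift deletion"
    STOP_LOSS = "stop-loss"
    OTHER = "other"


@dataclass(frozen=True)
class Variant:
    # Hypothetical minimal variant record; coordinates are 1-based, inclusive.
    chrom: str
    start: int
    end: int
    consequence: Consequence


# Assumed repeat annotation format: (chrom, start, end), 1-based, inclusive.
RepeatRegion = Tuple[str, int, int]


def in_repeat_region(variant: Variant, repeats: List[RepeatRegion]) -> bool:
    """Return True if the variant overlaps any annotated repeat region."""
    return any(
        chrom == variant.chrom and start <= variant.end and variant.start <= end
        for chrom, start, end in repeats
    )


def predict_pm4_bp3(variant: Variant, repeats: List[RepeatRegion]) -> Tuple[bool, bool]:
    """Return (pm4, bp3) following the rule described in the InterVar text."""
    # Stop-loss variants get PM4 regardless of repeat status.
    if variant.consequence is Consequence.STOP_LOSS:
        return True, False
    # In-frame indels: BP3 inside a repeat region, PM4 outside.
    if variant.consequence in (
        Consequence.INFRAME_INSERTION,
        Consequence.INFRAME_DELETION,
    ):
        if in_repeat_region(variant, repeats):
            return False, True
        return True, False
    # Neither criterion applies to other variant types.
    return False, False
```

For example, with `repeats = [("chr1", 100, 200)]`, an in-frame deletion at `chr1:150-152` would yield BP3, while the same deletion at `chr1:300-302` would yield PM4. A linear scan over the repeat list is used here for clarity; for a genome-wide annotation an interval index would likely be a better fit.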