monarch-initiative / gpsea

A Python library for discovery of genotype-phenotype associations
https://monarch-initiative.github.io/gpsea/stable
MIT License

Erroneous error while loading individuals to cohort. Proteins are not expected to have coordinates for splice variants #108

Closed pnrobinson closed 10 months ago

pnrobinson commented 10 months ago

This is expected. I think we need to slightly refactor the way the analysis starts. We need to give the analysis a reference transcript (not a reference protein). It also does not make sense to test multiple proteins (or transcripts) at once. For any conceivable clinical analysis, people focus on one transcript, which in almost all cases is (or should be) the MANE or ClinVar transcript. Having this much choice will engender confusion and not bring benefits!

Missing start/end coordinate for NM_001330437.2:c.1093-1G>T on protein NP_001317366.1
Missing start/end coordinate for NM_001374625.1:c.1090-1G>T on protein NP_001361554.1
Missing start/end coordinate for NM_002834.5:c.1093-1G>T on protein NP_002825.3
Missing start/end coordinate for NM_080601.3:c.1093-1G>T on protein NP_542168.1
Missing start/end coordinate for NM_001330437.2:c.643-2A>C on protein NP_001317366.1

ielis commented 10 months ago

Look into variant_effects list and check if we have a splice effect. If yes, then we do not log a warning regarding missing protein coordinates.
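
A minimal sketch of that check, assuming a simplified annotation object. The `VariantEffect` enum, the `TranscriptAnnotation` dataclass, the `SPLICE_EFFECTS` set and both function names are hypothetical stand-ins written for this illustration, not the library's actual classes, and the set of effects is only an example of what "a splice effect" could cover:

```python
import logging
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional, Sequence

logger = logging.getLogger(__name__)


class VariantEffect(Enum):
    # Hypothetical stand-in enum; the real project may name these differently.
    MISSENSE_VARIANT = auto()
    SPLICE_ACCEPTOR_VARIANT = auto()
    SPLICE_DONOR_VARIANT = auto()


# Assumption: effects for which missing protein coordinates are acceptable.
SPLICE_EFFECTS = frozenset({
    VariantEffect.SPLICE_ACCEPTOR_VARIANT,
    VariantEffect.SPLICE_DONOR_VARIANT,
})


@dataclass(frozen=True)
class TranscriptAnnotation:
    # Hypothetical container for one variant annotated on one transcript.
    hgvs: str
    protein_id: str
    variant_effects: Sequence[VariantEffect]
    protein_start: Optional[int] = None
    protein_end: Optional[int] = None


def should_warn_missing_protein_coordinates(ann: TranscriptAnnotation) -> bool:
    """Return True only if coordinates are missing and no splice effect explains it."""
    if ann.protein_start is not None and ann.protein_end is not None:
        return False  # coordinates are present, nothing to report
    # Coordinates are missing: stay silent if any effect is a splice effect.
    return not any(effect in SPLICE_EFFECTS for effect in ann.variant_effects)


def check_annotation(ann: TranscriptAnnotation) -> None:
    """Log the warning from the report above, but only when it is warranted."""
    if should_warn_missing_protein_coordinates(ann):
        logger.warning(
            "Missing start/end coordinate for %s on protein %s",
            ann.hgvs,
            ann.protein_id,
        )
```

With this shape, a variant such as `NM_002834.5:c.1093-1G>T` annotated with a splice acceptor effect would pass through silently, while a coding variant that unexpectedly lacks protein coordinates would still be logged.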

We don't mind missing the protein coordinates for these effects: