Closed Juanmaria-rr closed 2 months ago
Looking at the ETL code, the value for variantEffect is based on variantFunctionalConsequenceFromQtlId, a value generated during the OTG evidence generation.
@remo87 @Juanmaria-rr Can you confirm this is true? If so, it's up to us to filter out sQTLs in the logic and we'd have it fixed by next release.
Yes @ireneisdoomed, I think that would be the step to include to remove the sQTL effect sizes from the DoE assessment.
It's here. So, the data team can update the parsers by filtering out sQTLs from the aggregation.
In #3282, @addramir and @Daniel-Considine suggested changing the logic of the aggregation. Instead of using the largest effect size, which could be problematic for poorly powered studies, use the lowest p-value. I would try to introduce this change in the same script as well.
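To make the difference between the two aggregation rules concrete, here is a minimal sketch in plain Python. It is not the parser's actual code: the `QtlRecord` class, its field names, and the example values are hypothetical, and it assumes "largest effect size" means the largest absolute beta.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QtlRecord:
    # Hypothetical record layout, for illustration only.
    qtl_id: str
    beta: float
    pval: float


def pick_by_largest_effect(records):
    """Current rule (assumed): keep the QTL with the largest absolute beta."""
    return max(records, key=lambda r: abs(r.beta))


def pick_by_lowest_pvalue(records):
    """Proposed rule: keep the QTL with the lowest p-value."""
    return min(records, key=lambda r: r.pval)


# Made-up values: a poorly powered study with a large beta,
# and a well-powered study with a small beta of opposite sign.
records = [
    QtlRecord("study_A", beta=-1.8, pval=1e-3),
    QtlRecord("study_B", beta=0.4, pval=1e-12),
]

by_effect = pick_by_largest_effect(records)
by_pvalue = pick_by_lowest_pvalue(records)
```

With these made-up values the two rules pick different records with opposite signs, which is the situation where the inferred direction would flip.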
@Juanmaria-rr will have some estimates on the impact of this change.
@buniello This will imply changing a tooltip and documentation.
The parser and the data are updated. The data is here:
gs://otar000-evidence_input/Genetics_portal/json/genetics-portal-evidence-2024-04-16.json.gz
Questions:
Answers:
Background
The current interpretation of splicing QTL (sQTL) beta coefficients in the direction of effect assessment, where negative betas are interpreted as loss of function and positive ones as gain of function, is not informative.
For now, we propose to remove them from the direction of effect (DoE) assessment, so we will need to filter them out when building the evidence.
This issue also addresses point 1 of the issue raised by @Daniel-Considine here.
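As a rough illustration of the proposal, the sketch below maps a QTL beta to a direction label and returns no label for sQTLs. The function name, the `qtl_type` argument, and the `"sqtl"` value are assumptions made for this example, not names taken from the parser.

```python
from typing import Optional


def direction_from_beta(qtl_type: str, beta: float) -> Optional[str]:
    """Return a direction label for a QTL beta, or None if uninformative.

    Hypothetical helper: sQTL betas are skipped because their sign
    does not translate into loss or gain of function.
    """
    if qtl_type.lower() == "sqtl":
        return None
    if beta < 0:
        return "loss of function"
    if beta > 0:
        return "gain of function"
    # A beta of exactly zero carries no direction.
    return None


eqtl_label = direction_from_beta("eqtl", -0.7)
sqtl_label = direction_from_beta("sQTL", -2.1)
```

Here the sQTL yields no label even though its beta is the larger of the two in absolute value.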
Tasks
Filter out sQTLs (get_biggest_effect function) in the generation of the variantFunctionalConsequenceFromQtlId evidence column: https://github.com/opentargets/evidence_datasource_parsers/blob/ea3008a6143fd1347a3629d5869fd50438004abc/modules/GeneticsPortal.py#L84
Acceptance tests
When direction of effect data shows no directionality from sQTL evidence.
How do we know the task is complete?
In Lymphocyte count associated with TYK2 (link), for the variant 19_10351837_C_T, the DoE is loss of function | Risk, where the loss of function comes from the Decreased gene product level derived from the highest effect size of sQTLs. After removing sQTLs, this should appear as a gain of function | Risk.