Closed ireneisdoomed closed 2 years ago
I can confirm that we can implement the logic in the ETL, but I can't confirm that we can do it in the 22.02 release. According to the release planning, the pipeline freeze is today.
@ireneisdoomed I think `Success` should be a 1 as well. The meaning of the stop reason `Success` is that the study has stopped because of a successful result. Very often this is done to save time and conclude things early, in order to advance in the clinical pipeline.
I see no reason to prioritise the `Success` stop reason over a completed study with no stop reason. It also adds risk, because you could start having scores above 1 (e.g. Phase III + Success).
@JarrodBaker happy to move this to the COULD bucket in the release intentions. Not urgent cc @ktsirigos
@d0choa A completed phase III can fail if the efficacy has not been proven. For example, we are exposing NCT01870778 to account for the relationship between Serelaxin and heart failure. In the results you will see that the p values for the main endpoints do not show efficacy.
This is the rationale behind the distinction.
In any case, you make a very good point: there will be evidence with a score > 1. @JarrodBaker and I just had a chat about it, and in the expression you can indicate that the score is capped at a maximum value, as we do for Project Score.
Tbf as I said the impact is minimal. I won't object to keeping it like it is if you think it overcomplicates things.
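For reference, the capping idea mentioned above could look roughly like the sketch below. The function name, the cap of 1.0 and the example numbers are assumptions for illustration only; they are not taken from the ETL or from the Project Score configuration.

```python
def capped_score(base_score: float, weight: float, cap: float = 1.0) -> float:
    """Apply a multiplicative weight to a base score and clamp the result to `cap`."""
    return min(base_score * weight, cap)


# Hypothetical numbers, for illustration only.
upweighted = capped_score(0.7, 2.0)    # 1.4 before the cap, so the cap applies
downweighted = capped_score(0.7, 0.5)  # stays below the cap, so it is left unchanged
```

With a cap in place, an upweighted record can never score higher than the maximum, whatever weight is applied.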
After discussing it with @d0choa, we won't upweight the success records. This is because ChEMBL lacks many records of Phase IV trials, so if we were to upweight these trials, we would give more importance to a "successful" phase III trial than to one that was followed up to a phase IV trial that we have no record of.
I've updated the table above to reflect this.
@ireneisdoomed In the case of the following record, would the expected score be weighted by a factor of `0.25` or `0.5`?
clinicalStatus | Terminated
datasourceId | chembl
datatypeId | known_drug
diseaseFromSource | Castrate-Resistant Prostate Cancer
diseaseFromSourceMappedId | MONDO_0008315
drugId | CHEMBL92
studyStartDate | 2010-03-01
studyStopReason | Unable to enroll due to criteria for stable baseline pain
studyStopReasonCategories | [Safety or side effects, Negative]
targetFromSource | CHEMBL2095182
targetFromSourceId | Q13509
urls | [{ClinicalTrials, https://clinicaltrials.gov/search?id=%22NCT01083615%22}]
size | 2
s | 0.175
e | 0.25
That is, do I just take a minimum of the mapped `studyStopReasonCategories` or calculate their product to find a weight?
@JarrodBaker We just want to use the minimum, so evidence will be downweighted by at most half.
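A minimal sketch of the two options discussed here, assuming both categories in the record above map to a weight of 0.5 (which is what would make the product 0.25 and the minimum 0.5). The mapping, function names and default are hypothetical and not taken from the ETL.

```python
import math

# Assumed weights, inferred from the 0.25 vs 0.5 question above; not the real config.
CATEGORY_WEIGHTS = {
    "Negative": 0.5,
    "Safety or side effects": 0.5,
}


def min_weight(categories, weights=CATEGORY_WEIGHTS, default=1.0):
    """Smallest weight among the mapped categories; `default` if none are mapped."""
    mapped = [weights[c] for c in (categories or []) if c in weights]
    return min(mapped, default=default)


def product_weight(categories, weights=CATEGORY_WEIGHTS):
    """Product of the mapped category weights (the alternative that was not chosen)."""
    return math.prod(weights[c] for c in (categories or []) if c in weights)


record_categories = ["Safety or side effects", "Negative"]
chosen = min_weight(record_categories)        # minimum of the mapped weights
rejected = product_weight(record_categories)  # product of the mapped weights
```

Taking the minimum means a record with several downweighting categories is penalised only once, so it is never pushed below the lowest single-category weight.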
Thanks!
As a result of the work described in #1878, we want to change the scoring for ChEMBL evidence so that it captures more of the information available in that evidence.
Current scoring is based on the phase of the clinical trial
New scoring will modulate the scoring above based on the `studyStopReasonCategories`
The impact
*Success is a very small group. Mostly evidence on phase 2 (183) and phase 3 (152).
Tasks