Open ireneisdoomed opened 1 year ago
An update on the situation after the meeting with ChEMBL on 28/09:
The automatic Dailymed pipeline is not production-ready. The primary challenge is the grounding of entities (both drugs and indications).
Potential solution: EMA's public assessment reports. This seems promising because, for drugs like Nivolumab, they have successfully identified all 13 indications reported in Dailymed. This will greatly help fill the gap of missing indications with a high-quality resource. However, it is important to note that approvals from the FDA (as reported in Dailymed) and the EMA may differ. We expect to see this take effect in the upcoming ChEMBL 34 release.
A user from the Community has reported that Nivolumab is missing approved indications compared with the information in Dailymed. Specifically, we only cover 4 of the 13 references we could be extracting from Dailymed.
Observed behaviour
We collate information about clinical precedence through ChEMBL from 2 main sources:
There is a mismatch between the data we have and what is reported in Dailymed. Specifically, the user reports that Nivolumab has been approved for colorectal cancer, head and neck cancer, and mesothelioma; however, we only display information for these indications up to phase III clinical trials.
This mismatch very likely affects other drugs as well.
Expected behaviour
Dailymed's nivolumab label does include these approvals. The indications dataset combines manual curation with an automated pipeline that extracts drug indications from Dailymed. We want to parse Dailymed automatically and regularly to keep the mismatch between Dailymed and Open Targets to a minimum.
Tasks
Acceptance tests
How do we know the task is complete?
When I query the indications dataset and filter for CHEMBL2108738, I see that EFO_0000181, EFO_0000182, MONDO_0007576, EFO_1001480, MONDO_0001056, EFO_0000478, EFO_0008528, and EFO_0000770 have been added to the approvedIndications array.
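The acceptance test above could be automated with something like the following. This is only a minimal sketch: it assumes each row of the indications dataset is available as a dictionary with an `id` field (the ChEMBL ID) and an `approvedIndications` list of disease IDs. The function name `missing_approved_indications` and the way rows are loaded are hypothetical and not part of the existing pipeline.

```python
from typing import Iterable, Mapping, Set

# Disease IDs named in the acceptance test for Nivolumab (CHEMBL2108738).
EXPECTED_APPROVED = {
    "EFO_0000181", "EFO_0000182", "MONDO_0007576", "EFO_1001480",
    "MONDO_0001056", "EFO_0000478", "EFO_0008528", "EFO_0000770",
}


def missing_approved_indications(
    rows: Iterable[Mapping],
    drug_id: str,
    expected: Iterable[str],
) -> Set[str]:
    """Return the expected disease IDs that are absent from the drug's
    approvedIndications array (hypothetical helper; assumed row layout)."""
    expected_set = set(expected)
    for row in rows:
        if row.get("id") == drug_id:
            # A missing or null array is treated as "no approved indications".
            approved = set(row.get("approvedIndications") or [])
            return expected_set - approved
    # Drug not present in the dataset: every expected ID is missing.
    return expected_set


if __name__ == "__main__":
    # Illustrative in-memory row, not real data; in practice the rows would
    # come from the released indications dataset.
    rows = [{"id": "CHEMBL2108738", "approvedIndications": ["EFO_0000181"]}]
    missing = missing_approved_indications(rows, "CHEMBL2108738", EXPECTED_APPROVED)
    print(f"{len(missing)} expected indications still missing: {sorted(missing)}")
```

The task would be considered complete when the returned set is empty for CHEMBL2108738. Returning the missing IDs instead of a boolean makes it easier to see which approvals are still absent.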