Open-source search for approved drugs for treatment of Coronavirus
Given the scale and spread of the coronavirus outbreak, there is an immediate and urgent need for medicines, which do not yet exist for this virus. Besides Gilead's clinical trial of its drug, BenevolentAI recently published an article in The Lancet. They used the company's own knowledge graph, originally curated from scientific literature, to identify baricitinib as a potential treatment.
But clinical biologists are very reserved about this. One obvious operational issue is that clinical studies must take place before this approved drug can be repurposed to treat coronavirus. Another issue is that, as a private company, BenevolentAI is not going to open its database and code. Convincing clinicians to run a clinical study on a black-box result is unlikely to succeed. End-to-end model transparency and interpretability are needed.
We would like to reproduce a knowledge graph that recommends approved drugs, in a completely open-source manner (GNU General Public License v3.0), including the dataset, pipeline, algorithm and code. Throughout this process, we will continually seek feedback from clinicians and biotech professionals so that the work stays communicative and relevant to coronavirus treatment.
To start constructing the complex relationships between compounds, genes, diseases and proteins, we need to identify sources of raw data.
Grakn, an open-source graph database, uses the entity-relationship model to group each concept into an entity, an attribute, or a relationship. This means that all we have to do is map each concept to a schema concept type and recognize the relationships between them.
Reference https://blog.grakn.ai/drug-discovery-knowledge-graphs-46db4212777c
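The entity-relationship mapping described above can be sketched in plain Python before committing to a schema. This is only an illustration of the idea, not Grakn's schema language or client API: the class names, the relation name compound-protein-interaction, and the protein PROTEIN_X are all hypothetical placeholders. Only the four entity types and baricitinib come from the text.

```python
from dataclasses import dataclass, field

# Entity types taken from the text; everything else here is illustrative.
ENTITY_TYPES = {"compound", "gene", "disease", "protein"}


@dataclass(frozen=True)
class Entity:
    type: str
    name: str  # stored as an attribute of the entity


@dataclass(frozen=True)
class Relation:
    type: str
    role_players: tuple  # ((role_name, Entity), ...)


@dataclass
class KnowledgeGraph:
    entities: set = field(default_factory=set)
    relations: list = field(default_factory=list)

    def add_entity(self, type_, name):
        """Create an entity, rejecting types that are not in the schema."""
        if type_ not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {type_}")
        entity = Entity(type_, name)
        self.entities.add(entity)
        return entity

    def relate(self, relation_type, **role_players):
        """Link existing entities; keyword names are the roles they play."""
        for player in role_players.values():
            if player not in self.entities:
                raise ValueError(f"{player} is not in the graph")
        relation = Relation(relation_type, tuple(role_players.items()))
        self.relations.append(relation)
        return relation

    def neighbours(self, entity):
        """Return entities that share at least one relation with `entity`."""
        found = set()
        for relation in self.relations:
            players = [p for _, p in relation.role_players]
            if entity in players:
                found.update(p for p in players if p != entity)
        return found


graph = KnowledgeGraph()
drug = graph.add_entity("compound", "baricitinib")
target = graph.add_entity("protein", "PROTEIN_X")  # hypothetical placeholder
graph.relate("compound-protein-interaction", compound=drug, protein=target)
```

Calling graph.neighbours(drug) then returns the set of entities directly linked to the compound, which is the kind of lookup a drug recommendation query would build on.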
TODO list
Curate a corpus from PubMed abstracts (reporting to Alex Li): identify keywords for extracting PubMed articles (proteins mentioned in BenevolentAI's graph..., anything else?) and download the abstracts using the PubTator API (sample code will be uploaded soon)
Search for a compound-compound similarity database and work out how to ingest it (reporting to Lan Gao)
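For the corpus item in the list above, one step that can be prototyped offline is turning downloaded abstracts into records. The sketch below is not the project's sample code. It assumes the abstracts are saved in PubTator's plain-text export format (PMID|t|title and PMID|a|abstract lines, followed by tab-separated annotation lines). The download call itself is left out because the endpoint is not specified here. The function names and the sample record, including its PMID and identifier, are made up for illustration.

```python
import re

# Matches the title ("t") and abstract ("a") lines of a PubTator record.
HEADER = re.compile(r"^(\d+)\|([ta])\|(.*)$")

# Made-up record for illustration only; not a real PubMed entry.
SAMPLE_EXPORT = (
    "12345|t|Example title about baricitinib\n"
    "12345|a|A made-up abstract used only to exercise the parser.\n"
    "12345\t20\t31\tbaricitinib\tChemical\tID:PLACEHOLDER\n"
    "\n"
)


def parse_pubtator(text):
    """Parse PubTator-format text into {pmid: {title, abstract, annotations}}."""
    docs = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate records
        match = HEADER.match(line)
        if match:
            pmid, kind, content = match.groups()
            doc = docs.setdefault(
                pmid, {"title": "", "abstract": "", "annotations": []}
            )
            doc["title" if kind == "t" else "abstract"] = content
            continue
        fields = line.split("\t")
        # Annotation lines: pmid, start, end, mention, type, [identifier]
        if len(fields) < 5 or fields[0] not in docs:
            continue
        pmid, start, end, mention, entity_type = fields[:5]
        if not (start.isdigit() and end.isdigit()):
            continue  # skip lines that are not span annotations
        docs[pmid]["annotations"].append({
            "start": int(start),
            "end": int(end),
            "mention": mention,
            "type": entity_type,
            "identifier": fields[5] if len(fields) > 5 else None,
        })
    return docs


def mentions_of_type(docs, entity_type):
    """Collect the distinct mentions of one annotation type, sorted."""
    return sorted({
        ann["mention"]
        for doc in docs.values()
        for ann in doc["annotations"]
        if ann["type"] == entity_type
    })


corpus = parse_pubtator(SAMPLE_EXPORT)
chemicals = mentions_of_type(corpus, "Chemical")
```

Collecting mentions by type in this way is one possible route to the keyword list: the same function could be pointed at protein or gene annotations once real abstracts are available.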
For anyone looking at this page: if you see anything missing or would like to help, please do! You can submit a pull request or raise an issue.