A Pharmacophore Competition - thoughts/input gratefully received.

OSM want to launch a competition to build a pharmacophore model for Series Four. It will be based on existing open data, and the models will be tested on a test dataset that will be available later in the year. Key links can be found in the project wiki (please feel free to edit as well as use).

I've opened this issue so that we can iron out how this might work. I've sketched out a rough proposal below, but would really welcome any input on how to improve the competition and some answers to specific questions. Once we've developed the competition guidelines as a community, I will close this issue and repost the edited guidelines when we launch the competition for real.

Outline

We need to build a predictive pharmacophore model for PfATP4. PfATP4 is a sodium pump found in the membrane of the malaria parasite. A number of promising antimalarial compounds, with distinct and diverse chemical structures, have been found to be active in an ion regulation assay developed by Kiaran Kirk's lab at ANU. A number of publications have indicated that this pump looks to be an important new target for malaria medicines. It seems that PfATP4-active compounds disrupt the pump and cause the rapid influx of Na+ into the parasite, leading to its demise. The structure of PfATP4 is not known. Simulations based on docking of PfATP4 actives have used a homology model developed by Joseph DeRisi's laboratory. OSM want to build a predictive pharmacophore model to assist in the design and synthesis of new Series Four compounds and, of course, to help others working on other compound series.

The first attempt

@murrayfold had a quick, informal go (here and here) at developing a pharmacophore model using known actives and inactives from the Malaria Box. At the time Kiaran Kirk's paper was under embargo, but Murray has since written up his work. This initial attempt was unsuccessful (i.e. not predictive: see image below, where the "P Model predictions" correlate poorly with what was found in the ion regulation assay), possibly because the model did not allow for overlapping binding sites or take compound chirality into consideration.

[Image: compounds sent to Kirk and Fidock]

The Competition

We need a predictive in silico model. The best* model will win the prize**.

How will it work?

OSM will provide:

a CSV or SDF file containing details of active and inactive compounds. NB: this list is more substantial than the original dataset used by Murray, so you could also use the new data as a 'test set' for any model developed.
details of the known mutations associated with resistance
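As a rough illustration of how an entrant might read such a file, the sketch below parses a CSV with Python's standard library and separates actives from inactives. The column names (`compound_id`, `smiles`, `active`) and the inline sample rows are assumptions made up for this example; the actual layout of the OSM file is not specified here.

```python
import csv
import io

# Hypothetical layout: column names and rows are placeholders, not OSM data.
SAMPLE_CSV = """compound_id,smiles,active
cmpd-1,CCO,1
cmpd-2,c1ccccc1,0
cmpd-3,CC(=O)O,1
"""


def load_compounds(handle):
    """Read CSV rows and return (actives, inactives) as lists of dicts."""
    actives, inactives = [], []
    for row in csv.DictReader(handle):
        # Assumption: activity is encoded as "1" (active) or "0" (inactive).
        if row["active"].strip() == "1":
            actives.append(row)
        else:
            inactives.append(row)
    return actives, inactives


actives, inactives = load_compounds(io.StringIO(SAMPLE_CSV))
print(len(actives), "actives;", len(inactives), "inactives")
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file handle.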

Submission Rules

all entries to be submitted to GitHub and shared openly
entrants can work individually or in teams (no limit to team size)
entrants must work openly. This doesn't necessarily mean that inputs have to be logged in realtime (although that would be great), but entries that have not openly deposited data prior to the deadline will not be accepted.
entrants must agree to their work's incorporation into a future OSM journal publication(s)
competition winner(s) will be authors on any relevant future paper(s)
any valid*** entries will at least be acknowledged on any relevant future paper(s) and, if the contribution is significant, entrants may be authors.

How will entries be assessed?

*The model will be tested using the 'test set' of PfATP4 molecules. These molecules are not currently in the public domain as they are from diverse closed/unpublished projects.
Using a statistical test (what would this be, dear cheminformaticians?), the judges will find the winning model.
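One candidate, offered only as a suggestion for discussion: if each model gives a binary active/inactive call for every test compound, the Matthews correlation coefficient (MCC) could be used to rank entries, since it stays informative when actives are much rarer than inactives. The sketch below is a minimal pure-Python version; the label lists are made-up placeholders, not assay results.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 means active."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_corrcoef(y_true, y_pred):
    """MCC ranges from -1 (inverse) through 0 (chance level) to +1 (perfect)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, report 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Placeholder labels for eight hypothetical test compounds.
assay_result = [1, 1, 1, 0, 0, 0, 0, 0]
model_call = [1, 1, 0, 0, 0, 0, 0, 1]
mcc = matthews_corrcoef(assay_result, model_call)
print(f"MCC = {mcc:.3f}")
```

If models output a continuous score instead of a yes/no call, a ranking measure such as ROC AUC would be the analogous choice.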

What's the prize?

**TBC, along with the opportunity to contribute to our understanding of a new class of antimalarials and authorship on a peer-reviewed publication.

What if none of the models are any good?

Good question. If all models fail said statistical test then it may not be possible to announce a 'winner'. All data will be collated and published at least in the form of a blog post, if not a paper... and then we will try again.
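One way to make "fail" concrete, again only as a suggestion, would be a label-permutation test: shuffle the assay labels many times and ask how often a model scores as well by chance as it did on the real labels. The sketch below assumes binary calls and uses plain accuracy as the score; the function names, the number of permutations and the example labels are all illustrative assumptions.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of compounds where the model call matches the assay result."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_p_value(y_true, y_pred, score_fn=accuracy,
                        n_permutations=2000, seed=0):
    """Estimate how often shuffled labels score at least as well as the real ones."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = score_fn(y_true, y_pred)
    shuffled = list(y_true)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if score_fn(shuffled, y_pred) >= observed:
            at_least_as_good += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_good + 1) / (n_permutations + 1)


# Placeholder data: 10 actives and 10 inactives, with 18 of 20 calls correct.
assay_result = [1] * 10 + [0] * 10
model_call = [1] * 9 + [0] + [0] * 9 + [1]
p_value = permutation_p_value(assay_result, model_call)
print(f"accuracy = {accuracy(assay_result, model_call):.2f}, p = {p_value:.4f}")
```

A model whose p-value sits above an agreed threshold (0.05 is a common choice) would then count as no better than chance.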

Deadline

TBC, but potentially end of August 2016

***A 'valid' entry is one that stands up to the rigour expected from published in silico models. Judges are entitled to use discretion in the case of unconventional entrants, for example those with no formal training, such as high school students.
