OpenSourceMalaria / OSM_To_Do_List

Action Items in the Open Source Malaria Consortium

Competition Results! #538

Open mattodd opened 7 years ago

mattodd commented 7 years ago

The competition (#421) is complete!

Winners:

It’s a TIE. James McCulloch (@kellerberrin) and Ho Leung Ng (@holeung) each predicted 2 compounds in the top 20, and both have one hit just outside the top 20. Both worked openly and engaged in discussion about their work as it was happening.

Congratulations!

We’d not considered the possibility of a tie. We’d allocated $500 to the winner. It’s logical that we split the pot and award $250, but that’s just not the OSM way, so $500 will be going to each winner.

Runners-up:

Vito Spadavecchio (@spadavec), Giovanni Cincilla (@gcincilla), Davy Guan (@IamDavyG), Johnathan Silva (@jon-c-silva). Honorable mention also for Murray’s (@murrayfold) original model, which successfully predicted 1 active.

…will each win an item of OSM merch. Please email @mattodd a shipping address, preferred item and details (e.g. T-shirt size and color).

Details of the original entries can be seen here. Pleasingly all the entries adopted different approaches.

Sorry again for the delay in finishing the judging. I think this would all have been a lot simpler if we could have run the competition openly, but the nature of the secret dataset meant we could not, and coordinating the judges then proved more difficult than expected.

There is a remaining awkwardness. The pathogen box dataset results (i.e. which compounds are active in the ion regulation assay) are still embargoed since the relevant paper has not yet been submitted for publication. Very sorry about this. This danger was prefigured when the competition was formulated. It means we can’t yet give detailed feedback (i.e. which model predicted which compound correctly), which I know is suboptimal. I’m sorry - I know all entrants will want to see this and iterate the models. We just need to wait a little longer. As soon as the paper is accepted, we can divulge/play with/learn from the data.

At that point, when we can openly discuss the results of the models in detail (I’ll certainly ping everyone and start a new discussion) we can do several important things:

1) Refine the models, with the Pathogen Box data added to the training set. Use improved models on datasets that have emerged in the meantime.

2) Use the models to screen commercially available compounds (start with the affordable ones) to identify new scaffolds to work on. Using more than one model could help to filter our selection and potentially increase the hit rate.

3) Loop in the homology model. Some submissions, like Ho’s, can be used to show docked poses, but others, like James’s, predict the activity purely as a number without considering any interaction with a protein.

4) Explain how all these different structures are binding the same target (the main point of interest to the med chemists and malaria people - see this old post for sample structures).

5) Write a paper together on all this. There will perhaps be quite a lot we can say on the models given that they all adopted different approaches.
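Point 2 above suggests that using more than one model could help filter the selection. A minimal sketch of one way to do that, assuming each model simply returns a list of compound IDs it predicts to be active: keep only the compounds that a minimum number of models agree on. The model names, compound IDs and the `min_votes` threshold are all hypothetical, and none of this reflects how the actual entries worked.

```python
from collections import Counter


def consensus_hits(predictions, min_votes=2):
    """Return compound IDs predicted active by at least `min_votes` models.

    `predictions` maps a model name to the compound IDs that model calls active.
    Results are ordered by number of votes (highest first), then by ID.
    """
    votes = Counter()
    for actives in predictions.values():
        votes.update(set(actives))  # each model votes at most once per compound
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [compound for compound, count in ranked if count >= min_votes]


# Hypothetical predictions from three hypothetical models.
predictions = {
    "model_a": ["cmpd-1", "cmpd-2", "cmpd-3"],
    "model_b": ["cmpd-2", "cmpd-3", "cmpd-4"],
    "model_c": ["cmpd-3", "cmpd-5"],
}

print(consensus_hits(predictions))               # ['cmpd-3', 'cmpd-2']
print(consensus_hits(predictions, min_votes=3))  # ['cmpd-3']
```

Raising `min_votes` trades coverage for agreement: fewer compounds survive, but each one is supported by more independent approaches.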

mcoster commented 7 years ago

> It’s logical that we split the pot and award $250, but that’s just not the OSM way, so $500 will be going to each winner.

@mattodd - you are the Oprah Winfrey of Open Science! :wink:

Congrats to the winners - what a great result for all!

alintheopen commented 7 years ago

You get a car, you get a car, you get a car :)

drc007 commented 7 years ago

Congratulations to all who took part.

kellerberrin commented 7 years ago

Thanks very much Matt.

I am thrilled to be named co-winner given the very high standard of the other competition participants.

In particular, I would like to thank @spadavec who posted his machine learning (TensorFlow) code early in the competition. Reading this code helped considerably in developing my own code base and ideas.

I guess that is the spirit of OSM collaboration.

BTW, I am currently re-writing Jeremy Horst's excellent MinorityReport software in C++ for increased performance (those SAM files are big). Genome sequencing is a very exciting technique for identifying drug action and associated resistance. I will post results to this forum in due course.

holeung commented 7 years ago

Fun! I'd be happy to donate my award as a scholarship or supplies for the high school students. I'll be happy with an OSM T-shirt. I look forward to analyzing the results when ready. Also will be happy to help coordinate the writing of an eventual modeling paper.

spadavec commented 7 years ago

@mattodd thanks for the results, and congrats everyone! just a quick question, is the 'hidden' dataset activity in the PfATP4 test, or just whole cell EC50?

gcincilla commented 6 years ago

Congratulations to everybody! Especially @kellerberrin and @holeung. Well done guys!

It is a pity that the Pathogen Box dataset results are still embargoed and we cannot know the details, but we’ll wait for that. I hope you can share them in this forum once that is possible. @mattodd, it would be good to know at least how many active molecules there were in the hidden set and which type of activity was used to tag them as “actives”, as @spadavec also asked (PfATP4 test, or whole cell EC50). Is that possible?

Apart from this, it would be good to know what the next steps will be and whether we can collaborate or be useful somehow. Cheers!

holeung commented 6 years ago

I can be a coordinator/co-coordinator to put together the modeling work section for the OSM Series 4 paper and, potentially, a second paper focused on modeling Series 4. I think we can also put together an interesting modeling paper for Series 3 after we get more experimental results. I am also open to co-writing grant applications with OSM team members.

mattodd commented 6 years ago

Hi all.

I'm speaking with Kiaran on Friday and will update people on how the dataset/publication is looking. Once the data are released (PfATP4 activity, @spadavec ) I think it's going to be exciting for us all to compare approaches to try to improve the models. Such activity would be towards a standalone paper that we can write together - thanks for the offer of coordinating some of this writing, Ho. Excited to do this and involve everyone.

Ho's mention of grant proposals is also relevant and timely. I think we should be doing this quite generally: leveraging what we're doing already to secure some distributed funds (in a way that is currently quite challenging). It strikes me that, with the current push by MMV to throw light on the MoA of candidate compounds (we just sent Series 3 for this, see #524, but it's part of a bigger program involving MMV, UCSD and Gates money), there might be a fair amount of data coming in on small molecules and their targets. That could form the basis for a considerable amount of work in trying to be more predictive about which molecules to make next across several projects. That is, after all, the aim of what we've been doing: paring down the number of candidates via a good predictive filter.

To more mundane matters, we still need to send out gifts/money to those of you for whom I have no email addresses. Could @kellerberrin @spadavec @IamDavyG and @jon-c-silva please send me a quick email so that we can be in touch with potentially confidential things such as bank details or shipping addresses? matthew.todd at sydney.edu.au

kellerberrin commented 6 years ago

Hi Matt,

My email is: james.duncan.mcculloch@gmail.com

If I can be of any assistance in writing the proposed paper please let me know.

I have been developing some genomics software and in initial testing I have noticed what could be an interesting series of SNP mutations in PfATP4 (gff3: PF3D7_1211900) from sequenced falciparum organisms (NCBI accession: PRJNA173723).

When I've stabilised the software and sent the PfATP4 variants off to I-TASSER, I will write up a report for OSM.

Best Regards,

James McCulloch.

jonjoncardoso commented 6 years ago

Hi all,

Congratulations to @kellerberrin and @holeung for winning the competition!

I have only just had time to properly check the feed here on GitHub. It is really interesting to do research in the open. I have learned a lot from this project already, and I am really excited to keep working with all of you on the new avenues of research once the embargo on the data set is lifted.

I will send you the e-mail now, Matt.

kellerberrin commented 6 years ago

Hi Fellow OSMers

On 31-October-17 I wrote this:

*I have been developing some genomics software and in initial testing I have noticed what could be an interesting series of SNP mutations in PfATP4 (gff3: PF3D7_1211900) from sequenced falciparum organisms (NCBI accession: PRJNA173723).*

It's now 31-March-18 and I finally have the results of that PfATP4 SNP screen (attached as a PDF).

It appears that there are small but significant populations of multidrug-resistant genotypes in Africa.

The mutation is G223R and it confers resistance to Spiroindolones (KAE609 & KAE678) and Aminopyrazoles (GNF-Pf4492).

Best Regards,

James McCulloch.

G223R_Mutation.pdf
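
For readers less familiar with the notation, G223R reads as: glycine (G) at residue 223 replaced by arginine (R). A minimal sketch of parsing such a code and applying it to a sequence follows. The helper names are hypothetical, and the five-residue sequence is a made-up toy, not PfATP4; this is not taken from the genomics software described above.

```python
import re


def parse_substitution(code):
    """Split a code like 'G223R' into (reference residue, 1-based position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a simple substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitution(protein, code):
    """Return `protein` with the substitution applied, checking the reference residue."""
    ref, pos, alt = parse_substitution(code)
    if not 1 <= pos <= len(protein):
        raise ValueError(f"position {pos} is outside the sequence")
    if protein[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {protein[pos - 1]}")
    return protein[: pos - 1] + alt + protein[pos:]


print(parse_substitution("G223R"))  # ('G', 223, 'R')

toy_protein = "MAGKL"  # made-up toy sequence, NOT PfATP4
print(apply_substitution(toy_protein, "G3R"))  # MARKL
```

Checking the reference residue before substituting catches off-by-one errors and mismatched reference sequences early.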