OpenSourceMalaria / Series4_PredictiveModel

Can we Predict Active Compounds in OSM Series 4?

COVID-19 Update/Finalisation of the Paper #26

Closed edwintse closed 4 years ago

edwintse commented 4 years ago

Hi all,

In light of the COVID-19 situation, most places are going into closedown and people are starting to work from home (including UCL from this Friday). Even though I had finished making Exscientia 1b (#22), Dundee has also closed down today so we won't be able to get that tested any time soon.

The paper is in good shape (#3) and our current plan is to submit as is (i.e. with the 4 original suggested compounds) as soon as possible and go from there, with the possibility of sneaking in the remaining two compounds later on. This means we can begin to finalise everything for the paper. Considering the topical nature of this work, we are thinking of shooting for Nature Comms as a first go.

Here's what remains:

Once all these things are done, I can finalise all the formatting things for submission.

I hope that everyone stays safe over the next few months and everything will get back to normal in the not too distant future.

btatsis commented 4 years ago

@edwintse hi, Laksh just added the refs in the experimental section of the article.

jonjoncardoso commented 4 years ago

Hi @edwintse, I have started adding the details of my model.

Sorry for the delay, I have been opening the GDoc for weeks but never quite found a peaceful moment to edit it. I'm reserving some time later today to focus on finishing it!

jonjoncardoso commented 4 years ago

I finished my part on the paper.

Hope everyone is safe and taking care of themselves!

edwintse commented 4 years ago

Thanks @jonjoncardoso!

edwintse commented 4 years ago

Hi @BenedictIrwin, I just noticed in the paper you provided a ref marker for [Hunt2020] but it's not in the reference list at the end. Do you have the full reference for this?

BenedictIrwin commented 4 years ago

Hi @edwintse, the review process for the Hunt2020 paper has been delayed beyond expectation, so we'd better drop the reference for now if you need to move forward with submission etc.

BenedictIrwin commented 4 years ago

If you can include partial information, or cite it as a submitted paper, then the details are:

Title: "Predicting pKa Using a Combination of Semi-Empirical Quantum Mechanics and Radial Basis Function Methods"
Author(s): Hunt, Peter; Hosseini-Gerami, Layla; Chrien, Tomas; Plante, Jeffrey; Ponting, David; Segall, Matthew

mattodd commented 4 years ago

Hi @spadavec - I think the final final final thing we need for submission of this paper is some details of your method for the SI. Any chance you're able to write something or ping @edwintse with some deets?

spadavec commented 4 years ago

@mattodd sorry for delays--life has been very hectic lately. Will be able to update @edwintse by tomorrow evening.

mattodd commented 4 years ago

Hi @spadavec - Any chance we can access your last updates here? Keen to get this in!

BenedictIrwin commented 4 years ago

Hunt2020 now has a DOI: http://dx.doi.org/10.1021/acs.jcim.0c00105

edwintse commented 4 years ago

@BenedictIrwin Great, I'll add it back in

mattodd commented 4 years ago

Hi @spadavec - I need to forward you the final version of the paper. Are you perhaps able to connect by email? Alternatively we can post on Github, no problem, but we'll need your final input and approval to submit.

spadavec commented 4 years ago

hi @mattodd sorry for the delays--have been very ill in hospital recently. I get home tonight and should be able to gather everything either tonight or early tomorrow morning. Again, massive apologies for delays.

spadavec commented 4 years ago

Hello @mattodd

1) I've updated the google doc with my name/info, updated Table 2 with the one-sentence summary, and added a summary of methods (highlighted in orange).

2) I'm still a little confused about the scoring of the models/predictions. In the announcement, the point was made that the goal was to identify "active" molecules (IC50 < 1 uM):

https://github.com/OpenSourceMalaria/Series4_PredictiveModel/issues/1#issuecomment-521657658

Indeed, the winning entry/model was a classification model. However, my model seems to have been graded on the regression prediction alone (which was supposed to be just one component of my ensemble model). I didn't make this clear in my submission, so I apologize for that. However, when my model is compared to the results provided (labeling as "active" those test compounds with activity < 1 uM), my accuracy is actually 79%, not 36% (MCC of 0.42). I've summarized these results here: https://docs.google.com/spreadsheets/d/173Ib9jCfwpV9656yo0wKqrakpdY0TyYRdzDcxqYz2U8/edit#gid=0

Can this be reflected in the results/summary? Thanks!
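
For anyone following the scoring question above, here is a minimal sketch of how accuracy and MCC could be computed once IC50 values are thresholded at 1 uM. It is not the scoring script used for the competition or the paper. The function names are hypothetical, and the IC50 values are made-up toy numbers for illustration only, not data from the test set.

```python
import math

# "Active" means IC50 < 1 uM, as stated in the competition announcement.
ACTIVE_THRESHOLD_UM = 1.0


def to_active(ic50_um):
    """Convert IC50 values (in uM) to boolean active/inactive labels."""
    return [value < ACTIVE_THRESHOLD_UM for value in ic50_um]


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for two equal-length lists of booleans."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of compounds whose active/inactive label was predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy values (uM), NOT real assay results or real predictions.
measured_ic50 = [0.2, 0.8, 5.0, 12.0, 0.5, 3.0]
predicted_ic50 = [0.4, 2.0, 7.0, 0.9, 0.1, 4.0]

counts = confusion_counts(to_active(measured_ic50), to_active(predicted_ic50))
print("tp, tn, fp, fn:", counts)
print("accuracy:", round(accuracy(*counts), 3))
print("MCC:", round(mcc(*counts), 3))
```

The point of the sketch is that a regression output scored as a classifier only needs to land on the correct side of the 1 uM cutoff, so it can look much better (or worse) than when it is graded on the predicted value itself.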

edwintse commented 4 years ago

Hi @spadavec, thanks for adding the details in. I'm having a look at addressing this in the table but can't access your google sheet. Are you able to share?

spadavec commented 4 years ago

Hi @edwintse sorry--thought I had opened it for access to all. I just adjusted it. Please let me know if you need help accessing or understanding the sheet.

edwintse commented 4 years ago

Hi @spadavec, to address your model, I've changed your result to 79% and added two footnotes to Table 2, which you can see in the google docs version. One is on the precision column heading, stating that the percentages are based on regression predictions. The other is on your model and Molomics' model, stating that they are based on classification predictions. Is this ok?

spadavec commented 4 years ago

@edwintse Yes, I think that works! Much appreciated, and let me know if you need anything else.

mattodd commented 4 years ago

Thanks for the input, everyone. Closing this Issue since we're progressing things in #28.