carpentries-incubator / ml4bio-workshop

Materials for a workshop introducing machine learning to biologists
https://carpentries-incubator.github.io/ml4bio-workshop/

Adding research and motivation examples #63

Open cmilica opened 4 years ago

cmilica commented 4 years ago

If you find a really cool, fairly mainstream example of ML in biology, please post it here!

cmilica commented 4 years ago

I added the two articles from the previous presentation and then this one. Would this one work?

agitter commented 4 years ago

One paper we can consider is "Supervised classification enables rapid annotation of cell atlases". They use a very simple classifier (multi-class logistic regression) for the topical biological problem of cell type classification. There are many competing methods for this problem, and they have been benchmarked.

"Hidden Stratification Causes Clinically Meaningful Failures in Machine Learning for Medical Imaging" is the preprint I mentioned today that discusses confounding in medical imaging. One finding was that

pneumothorax cases without chest drains were highly prevalent (i.e., enriched) in the false negative class

So the presence of the chest drain is incorrectly influencing the predictions. It's a nice cautionary tale, but I'm not certain we want to use too many medical imaging examples for this biology-centered workshop.
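
If we do use it, one way to make the point concrete might be to compute recall separately for each subgroup rather than only overall. Below is a minimal Python sketch under stated assumptions: the records are made-up toy values and the `recall_by_subgroup` helper is hypothetical, so neither reflects the preprint's data or code.

```python
from collections import defaultdict


def recall_by_subgroup(records):
    """Return {subgroup: recall}, computed over positive cases only.

    Each record is (true_label, predicted_label, subgroup) with 0/1 labels.
    """
    hits = defaultdict(int)
    positives = defaultdict(int)
    for y_true, y_pred, subgroup in records:
        if y_true == 1:
            positives[subgroup] += 1
            hits[subgroup] += int(y_pred == 1)
    return {group: hits[group] / positives[group] for group in positives}


# Made-up toy records for illustration, NOT data from the preprint.
records = (
    [(1, 1, "drain")] * 9 + [(1, 0, "drain")] * 1          # positives with a chest drain
    + [(1, 1, "no_drain")] * 4 + [(1, 0, "no_drain")] * 6  # positives without a drain
    + [(0, 0, "none")] * 20                                 # negatives
)

# Overall recall hides the gap between the two positive subgroups.
overall = sum(p for t, p, _ in records if t == 1) / sum(t for t, _, _ in records)
per_group = recall_by_subgroup(records)

print(f"overall recall: {overall:.2f}")
for group, recall in sorted(per_group.items()):
    print(f"{group} recall: {recall:.2f}")
```

In this toy setup the overall recall looks acceptable while the subgroup without the drain does much worse, which is the kind of gap an aggregate metric can hide.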

agitter commented 4 years ago

@cmilica we discussed how it was somewhat difficult to find example applications that use decision trees instead of random forests. This paper suggests that at least some new work may be decision tree-based:

PgpRules: a decision tree based prediction server for P-glycoprotein substrates and inhibitors https://doi.org/10.1093/bioinformatics/btz213

I haven't actually read it to see whether it is a good example or to confirm that they are not ensembling the trees into forests.

cmilica commented 4 years ago

https://aispace2.github.io/AISpace2/index.html

http://immersivemath.com/ila/ch01_introduction/ch01.html

agitter commented 4 years ago

I found this paper to be pretty exciting because it is the same field our group is working in. I think most people can understand the need for new classes of antibiotics.

A Deep Learning Approach to Antibiotic Discovery https://doi.org/10.1016/j.cell.2020.01.021

Due to the rapid emergence of antibiotic-resistant bacteria, there is a growing need to discover new antibiotics. To address this challenge, we trained a deep neural network capable of predicting molecules with antibacterial activity. We performed predictions on multiple chemical libraries and discovered a molecule from the Drug Repurposing Hub—halicin—that is structurally divergent from conventional antibiotics and displays bactericidal activity against a wide phylogenetic spectrum of pathogens including Mycobacterium tuberculosis and carbapenem-resistant Enterobacteriaceae. Halicin also effectively treated Clostridioides difficile and pan-resistant Acinetobacter baumannii infections in murine models. Additionally, from a discrete set of 23 empirically tested predictions from >107 million molecules curated from the ZINC15 database, our model identified eight antibacterial compounds that are structurally distant from known antibiotics. This work highlights the utility of deep learning approaches to expand our antibiotic arsenal through the discovery of structurally distinct antibacterial molecules.

cmilica commented 4 years ago

https://github.com/data-ppf/data-ppf.github.io/wiki/Syllabus