Title: [DL Edition] T035: SMILES based property prediction
Original authors: Azat Tagirdzhanov
Reviewer(s): XXX
Date of review: DD-MM-YYYY
Content
One line summary: SMILES based property prediction using recurrent neural networks
Potential labels or categories (e.g. machine learning, small molecules, online APIs): XXX
Time it took to execute (approx.):
[x] I have used the talktorial template and followed the content and formatting suggestions there
[x] Packages must be open source and should be installable from conda-forge. If you are adding new packages to the TeachOpenCADD environment, please check whether already installed packages can provide the same functionality and, if not, leave a sentence explaining why the new addition is needed. If a new package is not on conda-forge, please list it and its intended usage here.
numpy, pandas, matplotlib: Already in TeachOpenCADD
pytorch 1.12.1 (conda-forge): I use it for implementing neural networks
[x] Data must be publicly available, preferably accessible via a webserver or downloadable via a URL. Please list the data resources that you use and how to access them:
QM9 dataset: Added to the repository (12M)
Content style
[x] Talktorial includes cross-references to other talktorials if applicable
[ ] The table of contents reflects the talktorial story-line; order of #, ##, ### headers is correct
[x] URLs are linked with meaningful words, instead of pasting the URL directly or linking words like here.
[ ] I have spell-checked the notebook
[ ] Images have enough resolution to be rendered with quality, without being too heavy.
[x] All figures have a description
[x] Markdown cell content is still in-line with code cell output (whenever results are discussed)
[x] I have checked that cell outputs are not incredibly long (this applies also to DataFrames)
[x] Formatting looks correct on the Sphinx render (bold, italics, figure placing)
Code style
[x] Variable and function names follow snake case rules (e.g. a_variable_name vs aVariableName)
[ ] Spacing follows PEP8 (run Black on the code cells if needed)
[ ] Code lines are under 99 characters each (run black-nb -l 99)
[ ] Comments are useful and well placed
[x] There are no unpythonic idioms like for i in range(len(list)) (see slides)
[x] All 3rd party dependencies are listed at the top of the notebook
[ ] I have marked all code cells with output referenced in markdown cells with the label # NBVAL_CHECK_OUTPUT
[ ] I have identified potential candidates for a code refactor / useful functions
[ ] All import ... lines are in the top cell of the practice part, ordered by standard library / 3rd party packages / our own (teachopencadd.*)
[x] I have used absolute paths instead of relative paths
Talktorial review
Website
We present our talktorials on our TeachOpenCADD website (https://projects.volkamerlab.org/teachopencadd/), so we also have to check that the Jupyter notebook renders correctly there.
Generate the nblink file by running python generate_nblinks.py from within the directory teachopencadd/docs/talktorials.