Dlux804 / McQuade-Chem-ML

Development of easy to use and reproducible ML scripts for chemistry.

Suppress SMILES Parse Error in ingest.py #11

Closed Dlux804 closed 4 years ago

Dlux804 commented 4 years ago

The SMILES Parse Error messages are neither caught nor suppressed by the `try`/`except` clause below.


```python
try:
    pd.DataFrame(list(map(Chem.MolFromSmiles, csv[i])))
    smiles_col = csv[i]
#                molob_col = pd.DataFrame(molob, columns = 'molobj')
except:  # TODO: suppress these SMILES Parse Error
    pass
```
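A likely reason the `except` branch never runs: `Chem.MolFromSmiles` does not raise on an unparsable string. It logs the error and returns `None`. Below is a minimal sketch of column detection that tests for `None` instead of relying on an exception. The names `find_smiles_column`, `rdkit_parse`, and the `parse` parameter are hypothetical and not part of `ingest.py`.

```python
def find_smiles_column(columns, parse):
    """Return the name of the first column whose values all parse, else None.

    columns: mapping of column name -> iterable of cell values.
    parse:   callable that returns None for an invalid SMILES string
             (Chem.MolFromSmiles behaves this way).
    """
    for name, values in columns.items():
        values = list(values)
        # Non-string cells (numbers, NaN) cannot be SMILES, so reject them early.
        if values and all(isinstance(v, str) and parse(v) is not None for v in values):
            return name
    return None


def rdkit_parse(smiles):
    # Imported lazily so the helper above can be exercised without RDKit installed.
    from rdkit import Chem
    return Chem.MolFromSmiles(smiles)
```

With a pandas DataFrame named `csv`, the call would look like `find_smiles_column({col: csv[col] for col in csv.columns}, rdkit_parse)`. This only selects the column. RDKit still prints its parse errors unless logging is disabled separately.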
Dlux804 commented 4 years ago

@qle2 Try running models.py from the command line in PyCharm to reproduce.

python models.py

I get the following

```
[11:59:17] SMILES Parse Error: syntax error while parsing: pyrrolidine
. . . 
[11:59:17] SMILES Parse Error: Failed parsing SMILES 'pyrrolidine' for input: 'pyrrolidine'
[11:59:17] SMILES Parse Error: syntax error while parsing: 4-hydroxybenzaldehyde
[11:59:17] SMILES Parse Error: Failed parsing SMILES '4-hydroxybenzaldehyde' for input: '4-hydroxybenzaldehyde'
[11:59:17] SMILES Parse Error: syntax error while parsing: 1-chloroheptane
[11:59:17] SMILES Parse Error: Failed parsing SMILES '1-chloroheptane' for input: '1-chloroheptane'
[11:59:17] SMILES Parse Error: syntax error while parsing: 1,4-dioxane
[11:59:17] SMILES Parse Error: Failed parsing SMILES '1,4-dioxane' for input: '1,4-dioxane'
You have selected the following featurizations:    rdkit2d
Calculating features... Done.
```
andreshyer commented 4 years ago

This has been addressed in the most recent PR. To avoid this in the future, add `from rdkit import RDLogger` followed by `RDLogger.DisableLog('rdApp.*')` wherever RDKit is imported.
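A minimal sketch of how that suggestion might be applied while still keeping track of which inputs failed, since disabling the log hides the only visible sign of a bad SMILES string. The function name `parse_quietly` and its `parse` and `rdlogger` parameters are hypothetical. They default to RDKit's `Chem.MolFromSmiles` and `RDLogger`, and exist only so the logic can be tried without RDKit installed.

```python
def parse_quietly(values, parse=None, rdlogger=None):
    """Parse SMILES strings with RDKit logging off; return (mols, rejected)."""
    if rdlogger is None:
        from rdkit import RDLogger as rdlogger  # lazy import of the real logger
    if parse is None:
        from rdkit import Chem
        parse = Chem.MolFromSmiles

    # Turn off every rdApp.* log channel, as suggested in the comment above.
    rdlogger.DisableLog("rdApp.*")

    mols, rejected = [], []
    for value in values:
        mol = parse(value)
        if mol is None:
            # The parser signals failure with None, not an exception.
            rejected.append(value)
        else:
            mols.append(mol)
    return mols, rejected
```

Called as `parse_quietly(csv[i])`, this would return the parsed molecules plus the list of strings RDKit rejected, which can be reported or dropped as needed. Note that `DisableLog` stays in effect for the rest of the process.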