MobleyLab / chemper

Repository for Chemical Perception Sampling Tools
MIT License

[WIP] Add nbval to track jupyter notebooks #18

Closed bannanc closed 6 years ago

bannanc commented 6 years ago

I'm adding the nbval package so I can test the Jupyter notebooks in examples. For now I'm using the option py.test --nbval-lax, which only checks that the notebook runs without error. The normal call, py.test --nbval, also checks that the output doesn't change when the notebook is rerun. However, I'm using dictionaries whose printed output isn't always consistent, so for now I'm not checking output.

With the --nbval-lax call, you can add #NBVAL_CHECK_OUTPUT to the cells where you want to check that the output matches what is stored in the notebook.
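As a rough sketch of how a cell could be made safe for output checking, assuming the inconsistency comes only from dictionary ordering: print the items in sorted key order and mark the cell with the check-output comment. The helper name stable_repr and the example data are hypothetical placeholders, not taken from the chemper notebooks.

```python
# NBVAL_CHECK_OUTPUT
# Hypothetical data: the labels and atom indices are placeholders.
label_dict = {"c2": [{1: 1}], "c1": [{1: 0}]}


def stable_repr(d):
    """Return the items of d as a string, ordered by sorted key."""
    return ", ".join(f"{key}: {d[key]}" for key in sorted(d))


# The printed line no longer depends on the order the keys were inserted in.
print(stable_repr(label_dict))
```

Sorting before printing keeps the stored output reproducible without changing the dictionary itself.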

codecov-io commented 6 years ago

Codecov Report

Merging #18 into master will decrease coverage by 21.1%. The diff coverage is n/a.

bannanc commented 6 years ago

I initially did some reorganizing on this branch assuming the tests would pass, but I've moved most of that rearrangement to PR #20.

The call py.test -v -s --nbval-lax --cov=chemper/ passes locally so I'm still trying to figure out what is missing from the environment here.

bannanc commented 6 years ago

Just for the record, the cell that is timing out is this:

mol = mol_toolkit.MolFromSmiles('CC')
atom_index_list = get_smirks_dict(mol, smirks_list)
# ethane only has two matching SMIRKS patterns
print(atom_index_list.keys())

Here is the code for get_smirks_dict; it is a couple of cells up:

def get_smirks_dict(mol, smirks_list):
    """
    mol - chemper Mol object
    smirks_list - list of tuples (SMIRKS, label)

    Returns a dictionary of lists
    {label: [ {smirks_index: atom_index} ] }
    """
    temp_dict = dict()
    for smirks, label in smirks_list:
        for dic in mol.smirks_search(smirks):
            atom_tuple = tuple([dic[i+1].get_index() for i in range(len(dic))])
            temp_dict[atom_tuple] = label

    label_dict = dict()
    for atom_tuple, label in temp_dict.items():
        if label not in label_dict:
            label_dict[label] = list()

        label_dict[label].append({i+1: atom_idx for i, atom_idx in enumerate(atom_tuple) })

    return label_dict

This function might not be the MOST efficient way to do what I'm trying for, but it does not take 10 minutes; it finishes within a couple of seconds when I run it locally.
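To illustrate the structure this function returns without needing a toolkit installed, here is a minimal sketch that runs the same logic against stand-in objects. StubAtom and StubMol are hypothetical classes written only for this example; they mimic just the smirks_search and get_index calls used above and are not part of chemper. The SMIRKS strings are used only as lookup keys and are never parsed.

```python
class StubAtom:
    """Hypothetical stand-in for an atom object exposing get_index()."""

    def __init__(self, index):
        self._index = index

    def get_index(self):
        return self._index


class StubMol:
    """Hypothetical stand-in for a Mol; smirks_search returns canned matches."""

    def __init__(self, matches):
        # matches maps a SMIRKS string to a list of atom-index tuples
        self._matches = matches

    def smirks_search(self, smirks):
        return [
            {i + 1: StubAtom(idx) for i, idx in enumerate(match)}
            for match in self._matches.get(smirks, [])
        ]


def get_smirks_dict(mol, smirks_list):
    """Same logic as the notebook function, using setdefault for brevity."""
    temp_dict = {}
    for smirks, label in smirks_list:
        for dic in mol.smirks_search(smirks):
            atom_tuple = tuple(dic[i + 1].get_index() for i in range(len(dic)))
            # A later pattern overwrites the label assigned by an earlier one.
            temp_dict[atom_tuple] = label

    label_dict = {}
    for atom_tuple, label in temp_dict.items():
        entry = {i + 1: atom_idx for i, atom_idx in enumerate(atom_tuple)}
        label_dict.setdefault(label, []).append(entry)
    return label_dict


# Canned matches: a generic pattern hits atoms 0-2, a carbon pattern hits 0-1.
mol = StubMol({"[*:1]": [(0,), (1,), (2,)], "[#6:1]": [(0,), (1,)]})
smirks_list = [("[*:1]", "any"), ("[#6:1]", "carbon")]
result = get_smirks_dict(mol, smirks_list)
print(result)
```

Because each atom tuple keeps only the last label assigned to it, atoms 0 and 1 end up under "carbon" and only atom 2 remains under "any".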