MobleyLab / chemper

Repository for Chemical Perception Sampling Tools
MIT License
19 stars 10 forks source link

SMIRKSifier update to make removing decorators more efficient #45

Closed bannanc closed 5 years ago

bannanc commented 5 years ago

In this pull request I'm updating the SMIRKSifier code to make changes in decorators more efficient. This addresses issue #40 by extending the types of options for removing decorators and or atoms. After creating the SMIRKS patterns we use the ChemicalEnvironments way of splitting up decorators on atoms and bonds. That is the SMIRKS decorators on atoms and bonds are split into ORtypes and ANDtypes. ORtypes are further split into a base and sub-decorators.

Here is a specific example I'll use to show the types of moves: "[#6X4+0,#7X3;A;!r:1]-,=;!@[#6]" Is split up inside ChemicalEnvironments: atom 1

There are three layers of decisions:

  1. The first choice is which atom or bond to change.
  2. Remove ANDtypes, ORtypes, or an atom
  3. Specific types of change to the decorators. I will describe each step here:

1. chose an atom or bond

Before, I did this by just setting the probability of choosing a bond to be really low. Now, the number of decorators on the atom or bond are used to chose the type of move. The idea being that the more decorators currently on an atom or bond the more likely you should try taking one off. This also means that a fully generic atom ([*]) or bond (~) will never be picked.

2. Change ANDtypes, ORtypes, or an atom

These options are only considered if the atom or bond has the decorator type being considered for changing AND or OR types. In order to try removing an atom, the atom needs to be unindexed (that is it doesn't have a :n in the SMIRKS) and it must have only one neighbor. That is an unindexed atom that connects two atoms will never be removed.

3. Specific types of removal

I will split this section up into the three categories above a) remove an atom, b) change AND types and c) change OR types.

3a. remove an atom:

Remove the unindexed atom

3b. Remove some AND types

There are two choices here:

3c. Remove some OR types

For a bond

For an atom

I have added tests for most of these sub options that pass locally. I just need to update the relevant jupyter notebook.

codecov[bot] commented 5 years ago

Codecov Report

Merging #45 into master will decrease coverage by 0.12%. The diff coverage is 95.2%.

bannanc commented 5 years ago

I am worried about the number of open PRs that build on each other. I am going to merge them without review. Any suggested changes from reviews will be added in the next round of PRs.

bannanc commented 5 years ago

For the record I'm putting 👍 when I've added a comment to issue #55 for addressing in my next PR