Closed jthorton closed 2 weeks ago
Some code I have been using in testing while validating the method
from kartograf import SmallMoleculeComponent
from kartograf.atom_aligner import align_mol_shape
from openff.toolkit import Molecule
from rdkit import Chem
# Preprocessing from Smiles - Here you can add your Input!
smiles = ["c1ccccc1-c2ccccc2", "C1Cc2cc(-c3ccccc3)ccc2C1"]
rdmols = [Chem.MolFromSmiles(s) for s in smiles]
rdmols = [Chem.AddHs(m, addCoords=True) for m in rdmols]
[Chem.rdDistGeom.EmbedMolecule(m, useRandomCoords=False, randomSeed = 0) for m in rdmols]
# Build Small Molecule Components
molA, molB = [SmallMoleculeComponent.from_rdkit(m) for m in rdmols]
# Align the mols first - this might not needed, depends on input.
a_molB = align_mol_shape(molB, ref_mol=molA)
from kartograf import KartografAtomMapper
# Build Kartograf Atom Mapper
mapper = KartografAtomMapper(atom_map_hydrogens=True)
# Get Mapping
kartograf_mapping = next(mapper.suggest_mappings(molA, a_molB))
print(kartograf_mapping.componentA_to_componentB) #print mapping
kartograf_mapping
with the changes in this PR we get the following mapping compared with the mapping from the main branch
my main concern with this change is that it breaks all of the tests which check the naphtalene->benzene mapping which seems like a small regression so would we want a flag to turn this extra filtering on?
Hi @jthorton my oppinion is: to not make this new filter a standard thing, as it is rather harsh on limiting the atom mapping regions, and I would consider this (depending on your use case) is not always necessary. I think either adding a flag or just leave the filter in the filter collection, so Users (@JenkeScheen) that want to use this filter, can use it, is the way to go :)
@IAlibay what do you think? :)
Re: new filter - I agree, this would be the best way to do a breaking change without removing the old behaviour completely.
Re: default behaviour - this might need more discussion. I can see both viewpoints here. For the current protocols we have, we would want to avoid ring breaks because it's undefined behaviour, but I know @hannahbaumann and I had discussed that there were some cases where it might be worth having an incorrect transformation than none at all.
Main things I'd really away here are: 1) Let's make this a new filter 2) Let's discuss that we want the user experience to be
Thanks for the discussion @RiesBen and @IAlibay, I agree that this does limit the number of mappings but I think this now follows the best practice advice and might be a sensible default in future. An extra filter and an option in the init to use this filter would make sense for now I'll add those. For some more context I looked into theLomapAtomMapper
and PersesAtomMapper
and found that lomap gave the wrong behaviour and broke the rings
and that perses gave the expected behaviour depending on if the allow_ring_breaking
flag was set:
allow_ring_breaking=True
allow_ring_breaking=False
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 96.60%. Comparing base (
2112a89
) to head (9d21f25
). Report is 11 commits behind head on main.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@IAlibay @hannahbaumann @RiesBen I think this is now ready for a final review, please check that the docs I added make sense to provide more information on this change.
An attempt at fixing #44 by enforcing that all rings an atom is present in must be present in the final mapping which should mean that fused rings are handled correctly.