gruenewald-lab / CGsmiles

Coarse-Grained Smiles (CGsmiles) for representing abitrarily complex molecules using a compact line notation
4 stars 1 forks source link

Resolution layering #13

Closed fgrunewald closed 2 months ago

fgrunewald commented 2 months ago

implements the resolution layering; this is branched off rings_new

pckroon commented 2 months ago

Code looks good to me. I suggest getting rid of the fragment_strs as quickly as possible, and supplementing the given fragment_dicts with it. This simplifies the __init__ as well as resolve a bit. For example:

if len(fragment_dicts) != len(fragment_strs): raise IndexError
self.fragment_dicts = []
for f_dict, f_str in zip(fragment_dicts, fragment_strs):
    f_dict = copy(f_dict)
    f_dict.update(read_fragments(f_str))
    self.fragment_dicts.append(f_dict)
fgrunewald commented 2 months ago

In the end, I think it makes the most sense to allow fragments to be provided as dict or str. Mixing both leads to all sorts of sanity checks we can avoid. It is so easy to split the string and populate a fragment dict together with some recycled fragments that I don't think this feature is needed. Hence the error message

fgrunewald commented 2 months ago

@pckroon possibly but I think it is not worth doing at the moment. 99% of applications will be resolving a cgsmiles string where both fragments and the meta_graph are in the string. For those applications putting stuff in the init is sufficiently easy and takes little python knowledge.

The "advanced" API functions of providing a meta_graph or fragment list are mostly for polyply/vermouth. Even there I don't have a real use case yet. Perhaps at some point mapping files can be replaced and then it would make sense having an advanced API. That's quite a bit in the future in my opinion.

fgrunewald commented 2 months ago

@fgrunewald I refactored the API as suggested using constructors. That is pretty neat. Additionally, one can have more advanced workflows by subclassing and adding a new constructor. I think that is actually even more handy. The only thing I'm now considering is having a simple resolve function as a wrapper like so:

def resolve(cgsmiles_str, all_atom):
      _, molecule = MoleculeResolver.from_string(cgsmiles_str, last_all_atom=all_atom).resolve_all()
     return molecule