MaterSim / pyocse

Organic Simulation Toolkit
MIT License
1 stars 2 forks source link

Problem in co-crystal system #10

Open qzhu2017 opened 7 months ago

qzhu2017 commented 7 months ago

@shinnosukehattori

If we are dealing with a co-crystal, we used the following codes to store the molecular topology information https://github.com/MaterSim/OST/blob/332976973ea0ccf818245c1a1f0e7975e391f204/ost/parameters.py#L1917-L1924

Here you used the reduce function, this is fine for the case of gaff. However, for the case of openff, there maybe an issue. E.g., In the following example, I have two molecules,

If we use the reduce function, it seems that only 1 C1 will be left. Do you know if there is a solution to avoid such merge?

In the following example, I expect to see 17+11 = 24 atom types. But it shows 24 after merge.

>>> from pyxtal.db import database
>>> from ost.parameters import ForceFieldParameters
>>> db = database("../HT-OCSP/benchmarks/test.db")
>>> style = 'openff'
>>> xtal = db.get_pyxtal("XATJOT")
>>> smiles = [mol.smile for mol in xtal.molecules]
>>> params = ForceFieldParameters(smiles, style=style)
>>> len(params.ff.molecules[0].parameterset.atom_types)
17
>>> len(params.ff.molecules[1].parameterset.atom_types)
11
>>> for at in params.ff.molecules[0].parameterset.atom_types: print(at)
... 
N1
C2
C3
C4
C5
C6
C7
C8
N9
C10
H11
H12
H13
H14
H15
H16
H17
>>> for at in params.ff.molecules[1].parameterset.atom_types: print(at)
... 
O1
C2
O3
C4
C5
C6
O7
O8
H9
H10
H11
>>> from functools import reduce
>>> from operator import add
>>> mols = reduce(add, params.ff.molecules)
>>> for at in mols.parameterset.atom_types: print(at)
... 
N1
C2
C3
C4
C5
C6
C7
C8
N9
C10
H11
H12
H13
H14
H15
H16
H17
O1
O3
O7
O8
H9
H10
H11D
>>> len(mols.parameterset.atom_types)
24
shinnosukehattori commented 7 months ago

thanks, it can aviold with adding the unique suffix to label. i will update

shinnosukehattori commented 7 months ago

Because of the change in the specification of parmed and the addition of a feature called deduplicate (paramed/structure.py), identical label atoms of basically different molecular species are marked with a D (see H11D), essentially avoiding confusion. The one exception is that if all parameters of the atomic type are identical, adding D is not happen.

Therefore, there is basically no problem using current settings for co-crystal. If you want to completely separate the atomic types between different molecules, you can do so by adding the following code

        for i, molecule in enumerate(self.ff.molecules):
            for at in molecule.atoms:
                at.atom_type.number = i

Currently the atom_type.number is not used. If this value is different for different molecules, D will always be given if the atom_type does not match and the label is the same.