Isra3l / ligpargen

MIT License

Limitations on the max size of a molecule? #27

Open iGulitch opened 3 weeks ago

iGulitch commented 3 weeks ago

The web page of the web-based Ligpargen service states: "Maximum ligand size allowed is 200 atoms". This is understandable. However, I naively assumed that the standalone version of Ligpargen did not have this limitation. Unfortunately, that is not the case: submitting a molecule that is too large to Ligpargen leads to an error:

Traceback (most recent call last):
  File "/opt/conda/envs/ligpargen/bin/ligpargen", line 33, in <module>
    sys.exit(load_entry_point('LigPargen', 'console_scripts', 'ligpargen')())
  File "/ligpargen/ligpargen/ligpargen.py", line 504, in main
    molname=args.molname, workdir= args.path, debug= args.debug)
  File "<string>", line 17, in __init__
  File "/ligpargen/ligpargen/ligpargen.py", line 192, in __post_init__
    moleculeA = Molecule.fromBOSS(zmatName, outFile, pdbFile, moleculeA.shiftX, moleculeA.shiftY, moleculeA.shiftZ)
  File "/ligpargen/ligpargen/topology/Molecule.py", line 159, in fromBOSS
    atoms, numberOfStructuralDummyAtoms = cls._getAtoms(cls, zmatData, pdbfile)
  File "/ligpargen/ligpargen/topology/Molecule.py", line 496, in _getAtoms
    atomsLines = re.search(r'BOSS(.*?)Geometry', zmatData, re.DOTALL).group().splitlines()[1:-1]
AttributeError: 'NoneType' object has no attribute 'group'

I had to dig into the Ligpargen code to figure out that the failure happens at the stage where BOSS is involved. Having read the BOSS manual, I gathered that it imposes some limits on molecule size, but I could not work out exactly what they are.
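For anyone hitting the same traceback: the last frame shows that `re.search(r'BOSS(.*?)Geometry', ...)` returned `None`, i.e. the z-matrix text did not contain a `BOSS ... Geometry` block at all. Below is a minimal sketch of a guard that turns this into a readable error. The function name `extract_atom_lines` and the sample text are hypothetical, not part of Ligpargen; only the regular expression and the slicing are taken from the traceback. Why the block is missing (for example, BOSS rejecting the molecule) is an assumption, not something the traceback proves.

```python
import re
from typing import List


def extract_atom_lines(zmat_data: str) -> List[str]:
    """Return the lines between the 'BOSS' header line and the 'Geometry' line.

    Hypothetical helper: it mirrors the expression in the traceback, but
    checks the match before calling .group() on it.
    """
    match = re.search(r'BOSS(.*?)Geometry', zmat_data, re.DOTALL)
    if match is None:
        # re.search returns None when the pattern is not found anywhere.
        raise ValueError(
            "No 'BOSS ... Geometry' block found in the z-matrix text. "
            "The z-matrix may be empty or incomplete."
        )
    # Drop the first line (header) and the last line (the 'Geometry' marker).
    return match.group().splitlines()[1:-1]


if __name__ == "__main__":
    # Made-up text, only shaped like a z-matrix for illustration.
    sample = "BOSS header\n atom1\n atom2\n Geometry Variations follow\n"
    print(extract_atom_lines(sample))
```

With a guard like this, an empty or truncated z-matrix would raise a `ValueError` with an explanation instead of the `AttributeError` above.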

Could you clarify the internal limits of BOSS and of Ligpargen (if there are any) on the maximum number of atoms in a molecule?

atomrq commented 3 weeks ago

I have the same problem. The offline version of Ligpargen cannot process large molecules with more than 200 atoms. I also wonder whether this is a hard limit in the code.

iGulitch commented 3 weeks ago

@atomrq, I couldn't find any limits embedded in the Ligpargen code itself. What I figured out is the following:

  1. Ligpargen relies on RDKit and OpenBabel for molecule processing, e.g. reconstructing a molecule from its SMILES string. For instance, check utilities.py. RDKit and OpenBabel themselves might struggle with processing large molecules.
  2. Ligpargen calls BOSS at some point, which has its own internal limits (see the BOSS manual).
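Until the actual limits are confirmed, a quick pre-check of the atom count before running Ligpargen may save some time. Here is a minimal sketch, assuming the input is a PDB file. The names `count_pdb_atoms`, `check_size` and `ASSUMED_MAX_ATOMS` are hypothetical, and the value 200 is simply the figure quoted on the web server page; whether the same number applies to the standalone version or to BOSS is an assumption.

```python
from typing import Tuple

# Assumption: 200 is the limit quoted for the web server, not a confirmed
# limit of the standalone Ligpargen or of BOSS.
ASSUMED_MAX_ATOMS = 200


def count_pdb_atoms(pdb_text: str) -> int:
    """Count ATOM and HETATM records in PDB-formatted text."""
    return sum(
        1
        for line in pdb_text.splitlines()
        if line.startswith(("ATOM", "HETATM"))
    )


def check_size(pdb_text: str, max_atoms: int = ASSUMED_MAX_ATOMS) -> Tuple[int, bool]:
    """Return (atom count, whether the count is within max_atoms)."""
    n_atoms = count_pdb_atoms(pdb_text)
    return n_atoms, n_atoms <= max_atoms


if __name__ == "__main__":
    # Made-up three-atom fragment, for illustration only.
    sample = (
        "ATOM      1  C1  LIG A   1       0.000   0.000   0.000\n"
        "ATOM      2  C2  LIG A   1       1.500   0.000   0.000\n"
        "HETATM    3  O1  LIG A   1       2.200   1.100   0.000\n"
        "CONECT    1    2\n"
        "END\n"
    )
    n_atoms, ok = check_size(sample)
    print(f"{n_atoms} atoms, within assumed limit: {ok}")
```

The count includes hydrogens only if they are present in the file, so a structure without explicit hydrogens may end up larger once they are added.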

I'm looking forward to a reply from the developer(s) :)