choderalab / openmoltools

An open set of tools for automating tasks relating to small molecules
MIT License

Change forcefield_generators.generateOEMolFromTopologyResidue to use molecule_to_mol2 #284

Open jchodera opened 5 years ago

jchodera commented 5 years ago

openmoltools.forcefield_generators.generateOEMolFromTopologyResidue currently uses the default OpenEye mol2 writer, which writes <0> as the substructure name. This causes problems with antechamber, such as

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbb in position 35: invalid start byte

To fix this, we need to change this section to instead use

    from openmoltools.openeye import molecule_to_mol2
    molecule_to_mol2(molecule, mol2_input_filename)

jchodera commented 5 years ago

Wait, that's not correct. We have to use the low-level writer, since the resulting mol2 file is then fed into antechamber for bond order perception (which isn't really a good idea anyway).

Instead, I think we need to rewrite the substructure name <0> with

    # Replace <0> substructure names with valid text (the residue name).
    with open(mol2_input_filename, 'r') as infile:
        lines = infile.readlines()
    newlines = [line.replace('<0>', residue.name) for line in lines]
    with open(mol2_input_filename, 'w') as outfile:
        outfile.writelines(newlines)

though I note this still gives me

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfb in position 226: invalid start byte

which seems to be because the Tripos atom types are incorrect:

@<TRIPOS>ATOM
      1 C1          0.0000    0.0000    0.0000           1 MOL         0.0000
      2 C3          0.0000    0.0000    0.0000           1 MOL         0.0000
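
To narrow down where bytes like 0xbb or 0xfb come from, a small diagnostic along these lines might help. This is only a sketch and not part of openmoltools: `find_non_ascii_bytes` and `describe_non_ascii` are hypothetical helper names, and it assumes the offending data (the mol2 file, or captured antechamber output) is available as raw bytes.

```python
def find_non_ascii_bytes(data):
    """Return (offset, byte value) pairs for every byte outside 7-bit ASCII."""
    return [(offset, byte) for offset, byte in enumerate(data) if byte > 0x7F]


def describe_non_ascii(data):
    """Return one readable line per non-ASCII byte, with its 1-based line number."""
    return [
        'offset %d (line %d): 0x%02x'
        % (offset, data.count(b'\n', 0, offset) + 1, byte)
        for offset, byte in find_non_ascii_bytes(data)
    ]
```

For a file, read it in binary mode first (for example `open(path, 'rb').read()`) and pass the result to `describe_non_ascii`, so nothing is decoded before the bad bytes are located.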

jchodera commented 5 years ago

I think this function just isn't a good idea at all. We should probably just deprecate it.
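
If we go that route, one common pattern is to emit a DeprecationWarning on every call while keeping the old behavior intact. A minimal sketch, assuming a hypothetical `deprecated` decorator and a stand-in function `legacy_function` (neither exists in openmoltools; the real function would be decorated the same way):

```python
import functools
import warnings


def deprecated(reason):
    """Return a decorator that emits a DeprecationWarning on each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                '%s is deprecated: %s' % (func.__name__, reason),
                category=DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated('see issue #284')
def legacy_function(x):
    # Stand-in body for illustration only.
    return 2 * x
```

Using `stacklevel=2` makes the warning point at the calling code rather than at the wrapper, which makes it easier for users to find the call sites they need to change.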