Acellera / htmd

HTMD: Programming Environment for Molecular Discovery
https://software.acellera.com/docs/latest/htmd/index.html

Support request for HTMD #1027

Closed RaulFD-creator closed 2 years ago

RaulFD-creator commented 2 years ago

To the support team of Acellera,

I am trying to use the HTMD module for creating a voxelized representation of proteins for subsequent DCNN training. I have looked at the documentation but I still have some problems that I was hoping you could help me bypass:

  1. I am trying to remove surfactants, ligands, and other non-protein molecules as part of the preprocessing. To do so I am using the molecule.remove() function, but this means I have to create an extensive list of every substance that may not belong to the protein. Is there a less laborious method?
  2. For some proteins, HTMD indicates that TYR or ASN residues do not belong to the protein. I do not know how to handle this type of problem, as I am trying to create a pipeline to process a large database in bulk and cannot handle structures individually.
  3. I want to select only 6 of the 8 possible channels for the voxelization. I am using the method suggested in the documentation (a userchannels array of shape (mol.natoms, nchannels), with the channels I want set to 1 and those I do not want set to 0), but the output still has 8 channels. I am not sure whether I am selecting the channels properly or whether there is a simpler way.

Thank you for your time,

stefdoerr commented 2 years ago

Hi,

  1. mol.remove("not protein") should remove everything that is not protein.
  2. This probably indicates a broken structure. Check that those residues have the backbone atoms N, CA, C and O (named exactly like that).
  3. Just run the whole voxelization and then select only the channels you want from the voxels by indexing the array, e.g. voxels[:, [0, 3, 4]]
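
Points 2 and 3 above could be sketched roughly as follows. This is only an illustration under stated assumptions, not HTMD code: the helper names residues_missing_backbone and select_channels are hypothetical, the atom records are plain (resid, resname, atomname) tuples rather than a Molecule object, and the voxel array is synthetic, assumed to have one row per voxel and one column per channel.

```python
import numpy as np

# Backbone atom names that a complete protein residue is expected to carry.
BACKBONE = {"N", "CA", "C", "O"}


def residues_missing_backbone(atoms):
    """Hypothetical helper: report residues that lack backbone atoms.

    `atoms` is an iterable of (resid, resname, atomname) tuples.
    Returns {(resid, resname): sorted list of missing backbone names}.
    """
    seen = {}
    for resid, resname, name in atoms:
        seen.setdefault((resid, resname), set()).add(name)
    return {
        key: sorted(BACKBONE - names)
        for key, names in seen.items()
        if not BACKBONE <= names
    }


def select_channels(voxels, keep):
    """Hypothetical helper: keep only the listed channel columns.

    `voxels` is assumed to have shape (n_voxels, n_channels).
    """
    voxels = np.asarray(voxels)
    return voxels[:, list(keep)]


# Toy structure: TYR 1 is complete, ASN 2 is missing C and O.
atoms = [
    (1, "TYR", "N"), (1, "TYR", "CA"), (1, "TYR", "C"), (1, "TYR", "O"),
    (2, "ASN", "N"), (2, "ASN", "CA"),
]
missing = residues_missing_backbone(atoms)

# Synthetic stand-in for a voxelization result with 8 channels.
voxels = np.arange(5 * 8, dtype=float).reshape(5, 8)
subset = select_channels(voxels, [0, 1, 2, 3, 4, 5])  # keep 6 of 8 channels
```

In a batch pipeline, structures for which the first helper returns a non-empty result could be logged and skipped instead of being fixed by hand. The second helper is the same indexing as voxels[:, [0, 3, 4]], just with six column indices.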

RaulFD-creator commented 2 years ago

Hi,

Perfect, will do.

Thank you very much.