m3g / packmol

Packmol - Initial configurations for molecular dynamics simulations
http://m3g.github.io/packmol
MIT License
222 stars 51 forks source link

Max number of atoms #13

Closed mikemhenry closed 6 years ago

mikemhenry commented 6 years ago

I'm having some issues with packmol when working with a large number of atoms (4 million).

The first few lines look okay

HETATM    1 A    RES A   1    -182.494-558.693-197.990  0.00  0.00          EP  
HETATM    2 A    RES A   2    -107.377-907.034 724.917  0.00  0.00          EP  
HETATM    3 A    RES A   3     371.371 -77.868 363.000  0.00  0.00          EP  

These are the last 4 lines of the PDB file:

HETATM*****  C   RES #   4     743.692 326.791  -7.938  0.00  0.00           C  
HETATM*****  C   RES #   4     743.613 325.393  -7.954  0.00  0.00           C  
HETATM*****  C   RES #   4     743.534 323.996  -7.971  0.00  0.00           C  
END

I think there might be some overflow issues, since if I open the pdb file with VMD and re-save it, the last 4 lines look like:

ATOM  f423e  C   RES #   4     743.692 326.791  -7.938  0.00  0.00           C
ATOM  f423f  C   RES #   4     743.613 325.393  -7.954  0.00  0.00           C
ATOM  f4240  C   RES #   4     743.534 323.996  -7.971  0.00  0.00           C
END 

So it looks like VMD uses a different scheme for large atom numbers. VMD is able to work with the pdb file that packmol creates, but other tools like MDTraj and ParmEd can't handle the packmol pdb files for large atom numbers.

I'm not an expert in fortran, but it seems like an overflow problem and limitation of the PDB specification.

Here I noticed is where the issue first starts

HETATM 9998 A    RES A9998    -797.121 686.914-714.515  0.00  0.00          EP      
HETATM 9999 A    RES A9999    -595.086-876.518-377.213  0.00  0.00          EP      
HETATM10000 A    RES B   1     390.189-493.168  -1.496  0.00  0.00          EP

and eventually

HETATM99998 A    RES K   8    -346.923  37.050 501.989  0.00  0.00          EP      
HETATM99999 A    RES K   9     722.664 659.707 938.134  0.00  0.00          EP      
HETATM***** A    RES K  10    -616.218-408.040 448.630  0.00  0.00          EP      
HETATM***** A    RES K  11     344.199 889.359 -18.729  0.00  0.00          EP      

I'm not sure the best way to fix this.

Thanks! Mike

mikemhenry commented 6 years ago

Just for completeness, here the the beginning of packmol output

################################################################################

 PACKMOL - Packing optimization for the automated generation of
 starting configurations for molecular dynamics simulations.

                                                              Version 18.103 

################################################################################

  Packmol must be run with: packmol < inputfile.inp 

  Userguide at: www.ime.unicamp.br/~martinez/packmol 

  Reading input file... (Control-C aborts)
  Seed for random number generator:        12345
  Output file: out.pdb
  Reading coordinate file: As.pdb
  Reading coordinate file: Bs.pdb
  Reading coordinate file: Cs.pdb
  Number of independent structures:            3
  The structures are: 
  Structure            1 :As.pdb(           1  atoms)
  Structure            2 :Bs.pdb(           1  atoms)
  Structure            3 :Cs.pdb(          10  atoms)
  Maximum number of GENCAN loops for all molecule packing:          600
  Total number of restrictions:            3
  Distance tolerance:    2.0000000000000000     
  Warning: Type of residue numbering not set for structure            1
  Residue numbering set for structure            1 :           0
  Swap chains of molecules of structure            1 : F
  Warning: Type of residue numbering not set for structure            2
  Residue numbering set for structure            2 :           0
  Swap chains of molecules of structure            2 : F
  Warning: Type of residue numbering not set for structure            3
  Residue numbering set for structure            3 :           0
  Swap chains of molecules of structure            3 : F
  Number of molecules of type            1 :       400000
  Warning: There will be more than 9999 molecules of type            1
           Residue numbering is reset after 9999. 
  Each set be will be assigned a different chain in the PDB output file. 
  Number of molecules of type            2 :       200000
  Warning: There will be more than 9999 molecules of type            2
           Residue numbering is reset after 9999. 
  Each set be will be assigned a different chain in the PDB output file. 
  Number of molecules of type            3 :        40000
  Warning: There will be more than 9999 molecules of type            3
           Residue numbering is reset after 9999. 
  Each set be will be assigned a different chain in the PDB output file. 
  Total number of atoms:      1000000
  Total number of molecules:       640000
  Number of fixed molecules:            0
  Number of free molecules:       640000
  Number of variables:      3840000
  Total number of fixed atoms:            0
  Maximum internal distance of type            1 :    0.0000000000000000     
  Maximum internal distance of type            2 :    0.0000000000000000     
  Maximum internal distance of type            3 :    12.599953174516164     
  All atoms must be within these coordinates: 
   x: [   -1000.0000000000000      ,    1000.0000000000000       ] 
   y: [   -1000.0000000000000      ,    1000.0000000000000       ] 
   z: [   -1000.0000000000000      ,    1000.0000000000000       ] 
  If the system is larger than this, increase the sidemax parameter. 

################################################################################

I do notice the warnings about the repeat in residue number, but there does not seem to be a warning about the maximum number of atoms.

lmiq commented 6 years ago

Dear Mike,

Effectively, the problem is that the standard PDB format, to my knowledge, does not accept atom numbers greater than 9999. VMD and most packages ignore this column, so I never thought this could be an issue.

I couldn't find any indication of an alternative, widespread and useful, way to number the atoms. You mention that VMD uses those strings. Do the other packages you mention, which fail to use pdb files generated by Packmol, work with the VMD numbering style?

It would be easy to implement any of these alternatives, I just would need to know if they are generally accepted and where can I find the exact specifications.

Leandro.

lmiq commented 6 years ago

Dear Mike,

Actually that was quite straightforward. The numbers in VMD are printed in hexadecimal format if greater than 99999. I have updated Packmol now to do the same.

Leandro.

lmiq commented 6 years ago

Version 18.104 contains the fix. Please let me know if that works for you. All the best. Leandro.

mikemhenry commented 6 years ago

Thanks! I built version 18.104 and that fixed my original issue. I did notice a new one now, here are some examples:

ATOM 186A9 RES K 19 ********-428.237 87.4940.00 0.00 EP ATOM 186D6 RES K 64 -257.333 84.365********0.00 0.00 EP ATOM 186F9 RES K 99 ********-934.409********0.00 0.00 EP

I'm not sure now if we are running into the limits of the PDB file format, but it looks light now we have an overflow in the atom coordinates. I'm not sure how vmd and other packages handle this, but I'll look into it.

mikemhenry commented 6 years ago

I think at this point we are hitting a limit of the PDB file format. Thank you very much for your prompt fix on the residue numbers, I really appreciate it!

aieng-imran commented 3 years ago

Warning: Type of residue numbering not set for structure image I am not sure how to resolve this issue. I am only using 500 molecules altogether

lmiq commented 3 years ago

The warning can be ignored. The other is most likely an input file error. Please post the input file so I can check.