BradyAJohnston / MolecularNodes

Toolbox for molecular animations in Blender, powered by Geometry Nodes.
https://bradyajohnston.github.io/MolecularNodes/
GNU General Public License v3.0
909 stars 85 forks source link

TypeError: unsupported operand type(s) for *: 'NoneType' and 'float' #484

Open alphataubio opened 6 months ago

alphataubio commented 6 months ago

Describe the bug loading MD lammps cg spica trajectory gives error and does not load.

To Reproduce Topology: 4ypg-e.psf Trajectory: 4ypg-20fs.lammpsdump "Load"

Expected behavior load md lammps trajectory without error

Error Codes

Read prefs: "/Users/mitch/Library/Application Support/Blender/4.1/config/userpref.blend"
/Users/mitch/Library/Application Support/Blender/4.1/scripts/addons/molecularnodes
Template Installed () from '/Users/mitch/Library/Application Support/Blender/4.1/scripts/addons/molecularnodes/assets/template/Molecular Nodes.zip' into '/Users/mitch/Library/Application Support/Blender/4.1/scripts/startup/bl_app_templates_user/'
/Users/mitch/Library/Application Support/Blender/4.1/scripts/addons/molecularnodes
Template Installed () from '/Users/mitch/Library/Application Support/Blender/4.1/scripts/addons/molecularnodes/assets/template/Molecular Nodes.zip' into '/Users/mitch/Library/Application Support/Blender/4.1/scripts/startup/bl_app_templates_user/'
/Applications/Blender.app/Contents/Resources/4.1/python/lib/python3.11/site-packages/MDAnalysis/coordinates/LAMMPS.py:598: UserWarning: Reader has no dt information, set to 1.0 ps
  ts.data['time'] = step_num * ts.dt
Traceback (most recent call last):
  File "/Users/mitch/Library/Application Support/Blender/4.1/scripts/addons/molecularnodes/io/md.py", line 132, in execute
    mol, universe = load(
                    ^^^^^
  File "/Users/mitch/Library/Application Support/Blender/4.1/scripts/addons/molecularnodes/io/md.py", line 100, in load
    mol = session.show(atoms=universe,
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mitch/Library/Application Support/Blender/4.1/scripts/addons/molecularnodes/io/parse/mda.py", line 549, in show
    mol_object = self._process_atomgroup(
                 ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mitch/Library/Application Support/Blender/4.1/scripts/addons/molecularnodes/io/parse/mda.py", line 758, in _process_atomgroup
    for att_name, att in ag_blender._attributes_2_blender.items():
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mitch/Library/Application Support/Blender/4.1/scripts/addons/molecularnodes/io/parse/mda.py", line 299, in _attributes_2_blender
    "value": self.vdw_radii,
             ^^^^^^^^^^^^^^
  File "/Users/mitch/Library/Application Support/Blender/4.1/scripts/addons/molecularnodes/io/parse/mda.py", line 178, in vdw_radii
    return np.array(
           ^^^^^^^^^
TypeError: unsupported operand type(s) for *: 'NoneType' and 'float'

Desktop (please complete the following information):

Additional context .psf and .lammpsdump with 3 frames attached as zip 4pyg-mwe.zip

rbdavid commented 6 months ago

This is likely arising due to removal of coarse-grained particle names from the elements dictionary or an element symbol being present in the dictionary but not having a vdw_radii key in the element's subdictionary.

For quick reference, the code in question is: https://github.com/BradyAJohnston/MolecularNodes/blob/fba40bddbee05e2891ca92908df9f311085f42b8/molecularnodes/io/parse/mda.py#L175-L181

Do you happen to have any heavy metal atoms in this structure?

rbdavid commented 6 months ago

Doesn't look like it. But the atom names listed in the .psf file are certainly not standard atom names and aren't being handled well before the vdw_radii method is called. Specifically, the unexpected coarse-grained particles' atom names are fed into the MDAnalysis.topology.guessers.guess_atom_elements() to guess an element symbol:

https://github.com/BradyAJohnston/MolecularNodes/blob/fba40bddbee05e2891ca92908df9f311085f42b8/molecularnodes/io/parse/mda.py#L153-L166

Applying this code to the coarse-grained particles in your system returns a set of "elements symbols":

elements = [x if x in mn.data.elements.keys() else MDAnalysis.topology.guessers.guess_atom_element(x) for x in all_sel.atoms.names]
print(set(elements))
> {'A', 'AL', 'B', 'C', 'G', 'H', 'I', 'L', 'M', 'P', 'S', 'T'}

This is a perfect example of the the MDA guesser failing on us. Related to #452 and #468. Iodine doesn't have a vdw_radii defined in its subdictionary.

alphataubio commented 6 months ago

This is likely arising due to removal of coarse-grained particle names from the elements dictionary or an element symbol being present in the dictionary but not having a vdw_radii key in the element's subdictionary.

where is this elements dictionary ? in MDAnalysis in python somewhere ?? looks like i have to add the spica cg amino acids to it manually.

Do you happen to have any heavy metal atoms in this structure?

no it's a protein with 687 residues, spica coarse grains each amino acids into 1-5 beads:

image

PSF 

       2 !NTITLE
* created by setup_lammps
* dummy

   1579 !NATOM
       1 MET     1 MET  GBTL GBTL   0.111800       56.0385
       2 MET     1 MET  MET  MET    0.000000       75.1543
       3 ALA     1 ALA  ABBL ABBL   0.000000       71.0732
       4 GLU     1 GLU  GBML GBML   0.000000       56.0385
       5 GLU     1 GLU  GLU  GLU   -0.111800       72.0526
       6 GLU     1 GLU  GBML GBML   0.000000       56.0385
       7 GLU     1 GLU  GLU  GLU   -0.111800       72.0526
       8 LEU     1 LEU  GBML GBML   0.000000       56.0385
       9 LEU     1 LEU  LEU  LEU    0.000000       57.1151
      10 VAL     1 VAL  GBML GBML   0.000000       56.0385
      11 VAL     1 VAL  VAL  VAL    0.000000       43.0883
...
rbdavid commented 6 months ago

The elements dictionary is in https://github.com/BradyAJohnston/MolecularNodes/blob/main/molecularnodes/data.py but the codes that I linked to above are using MDAnalysis to fill in missing atom information when they're not included in the structure/topology files input to MolcularNodes.

I'm getting a fix pushed shortly.

alphataubio commented 6 months ago

Applying this code to the coarse-grained particles in your system returns a set of "elements symbols":

elements = [x if x in mn.data.elements.keys() else MDAnalysis.topology.guessers.guess_atom_element(x) for x in all_sel.atoms.names]
print(set(elements))
> {'A', 'AL', 'B', 'C', 'G', 'H', 'I', 'L', 'M', 'P', 'S', 'T'}

This is a perfect example of the the MDA guesser failing on us.

computer is only as dumb as what you tell it, i have to be explicit and provide the cg spica definitions somewhere.

Related to #452 and #468.

ok ill look at those other issues and try to contribute a fix.

Iodine doesn't have a vdw_radii defined in its subdictionary.

there's no iodine anywhere in this file, only the 20 standard amino acids. Since iodine has an atomic weight of ~127, then the code appears to be matching based on weight so a side chain with the backbone of MET say with two cg beads for a total weight of 56+75=131 is getting matched as iodine incorrectly

rbdavid commented 6 months ago

Yeah, the MDAnalysis guesser functions get pretty desperate to fill in missing information. The element guesser seems to be doing some strange things.

alphataubio commented 6 months ago

Yeah, the MDAnalysis guesser functions get pretty desperate to fill in missing information. The element guesser seems to be doing some strange things.

i have to merge the spica cg beads information into data.py, this way no guessing required.

see attached json: spica_top.json

alphataubio commented 6 months ago

@rbdavid i have vdw radii up to 99 from pubchem, and the rest are available in mathematica. ill fork this repo and add all the radii to data.py later tonight.

rbdavid commented 6 months ago

I only glanced at the elements data in Wolfram Mathematica; if I recall correctly, I was unable to find the sources for those values. MDAnalysis also includes a table of vdw radii values but its very limited. If you have the time, check the vdw radius values against the papers I link to in the #468. Filling out the elements dictionary would be greatly appreciated.

But in this case, the vdw_radii values listed in the elements dictionary will not be accurate for your system. In fact, the current implementation of the MDAnalysis.topology.guessers.guess_atom_element() is assigning incorrect element symbols to your coarse-grained particles. These incorrect element symbols are then used to grab vdw_radii values from the MolecularNodes elements dictionary. If the element symbol is not represented in the element dictionary, then a default value of 100 picometers is used for the particle's vdw_radii value. This may affect your Blender visualizations so be wary.

alphataubio commented 6 months ago

@rbdavid how do you run this python code:

elements = [x if x in mn.data.elements.keys() else MDAnalysis.topology.guessers.guess_atom_element(x) for x in all_sel.atoms.names]
print(set(elements))
> {'A', 'AL', 'B', 'C', 'G', 'H', 'I', 'L', 'M', 'P', 'S', 'T'}

i tried pasting it in the blender python window but i got:

mn.data.elements.keys() Traceback (most recent call last): File "", line 1, in NameError: name 'mn' is not defined

elements = [x if x in mn.data.elements.keys() else MDAnalysis.topology.guessers.guess_atom_element(x) for x in all_sel.atoms.names] Traceback (most recent call last): File "", line 1, in NameError: name 'all_sel' is not defined

print(set(elements)) Traceback (most recent call last): File "", line 1, in NameError: name 'elements' is not defined

rbdavid commented 6 months ago

TLDR: Coarse-grained particles should never be mapped to elements but they currently are when importing such a structure file into MolecularNodes. Basically, don't trust the assigned element, mass, nor vdw_radii values listed in the respective vertex attributes for a CG'd system. They're gonna be wrong. Your visualizations will likely be affected.

Oh shoot. The code snippet that you are trying to use is a part of a small ipython instance that I ran to debug what was happening for this error. It highlights a deeper, fundamental bug in how MN (but really MDAnalysis) fills in missing information about atoms' element labels. Its not something I was doing within blender or the MolecularNodes python API. I'll try to recreate it here in full:

import MDAnalysis
import molecularnodes as mn
import pprint

u = MDAnalysis.Universe('4pyg-e.psf','4pyg-20fs-3frames.lammpsdump')
all_sel = u.select_atoms('all')

print(sorted(set(all_sel.atoms.names)))
> ['ABB', 'ABBL', 'ABBS', 'AR1', 'AR2', 'ASN', 'ASP', 'CYS', 'GBB', 'GBBL', 'GBBS', 'GBM', 'GBML', 'GBMS', 'GBTL', 'GLN', 'GLU', 'HI1', 'HI2', 'HI3', 'ILE', 'LEU', 'LY1', 'LY2', 'MET', 'PH1', 'PH2', 'PH3', 'PH4', 'PRO', 'SER', 'THR', 'TR1', 'TR2', 'TR3', 'TY1', 'TY2', 'TY3', 'TY4', 'VAL']

elements = [x if x in mn.data.elements.keys() else MDAnalysis.topology.guessers.guess_atom_element(x) for x in all_sel.atoms.names]
print(sorted(set(elements)))
> ['A', 'AL', 'B', 'C', 'G', 'H', 'I', 'L', 'M', 'P', 'S', 'T']

pprint.pp(list(zip(all_sel.atoms.names, elements)))

The mn.data.elements dictionary only contains information about elements so none of the coarse-grained particles in your structure file will be parsed by that. Those particles names are therefore handed over to the MDAnalysis guesser function ((source code) [https://docs.mdanalysis.org/stable/_modules/MDAnalysis/topology/guessers.html#guess_atom_element]) that does some pretty loose guessing, as you can see. For example, the VAL particles in your structure are mapped to an AL element symbol. and your gut reaction might be "HUH?!?!" That makes no sense when we know the context of the coarse grain parameters/particles.

But, MDAnalysis nor MN can know that context easily; there are literally too many all-atom and coarse grained force fields out there to have a guesser that covers all that ground without ambiguity/conflicts in name spaces. Like, the atom name CD maps to a delta carbon atom name in most all-atom force fields but also gets mapped to a particle name in the Martini coarse grain force field. Its a mess! and a nontrivial amount of work to sort everything out.

Once PR #485 is merged with main, the fatal error you are seeing that stems from all this weirdness should be squashed. You'll be able to load your structure in just fine. But! and this is a big but, none of your elemental data values will be accurate. They shouldn't even be considered. The assigned element, vdw_radii, nor mass values for vertices will be wrong. If you have the correct data for the CG-equivalent mass and radius values, I highly suggest you correct those attributes so that, when they are used to visualize your system, you are looking at the correct visualization. Right now, your VAL particles are likely getting the mass and radius values from aluminum, while others are getting hydrogen's or boron's.

rbdavid commented 6 months ago

If anyone knows a gpt-like or natural language processing tool that can guess force-field context (all-atom versus coarse-grained, or GROMOS versus AMBER force fields) from a structure file, that might be a great first step to solving this problem.

alphataubio commented 6 months ago

['ABB', 'ABBL', 'ABBS', 'AR1', 'AR2', 'ASN', 'ASP', 'CYS', 'GBB', 'GBBL', 'GBBS', 'GBM', 'GBML', 'GBMS', 'GBTL', 'GLN', 'GLU', 'HI1', 'HI2', 'HI3', 'ILE', 'LEU', 'LY1', 'LY2', 'MET', 'PH1', 'PH2', 'PH3', 'PH4', 'PRO', 'SER', 'THR', 'TR1', 'TR2', 'TR3', 'TY1', 'TY2', 'TY3', 'TY4', 'VAL']

where should i add these in data.py ? in elements or coarse_grain_particles ??

when i grep the whole MN repo, "coarse_grain_particles" only shows up in data.py. is it even used somewhere ?

rbdavid commented 6 months ago

That coarse_grain_particles dictionary is a stash of entries to a previous version of the elements dictionary that did contain Martini coarse grain information. It is not used anywhere at the moment.

You can place the SPICA coarse-grain particle information into that dictionary since, in my opinion, it should not go into the elements dictionary. Still, we then need to do some reworking in the mda.py (see below) and molecule.py code to somehow flip a switch to say "use coarse-grained dictionary instead of element dictionary" while assigning particle attribute values. This gets a bit complicated because I think its possible to mix coarse-grained particles into all-atom systems. Its just a mess.

https://github.com/BradyAJohnston/MolecularNodes/blob/fba40bddbee05e2891ca92908df9f311085f42b8/molecularnodes/io/parse/mda.py#L153-L166

rbdavid commented 6 months ago

Alternatively, you posted a json file earlier that contains all relevant information about the SPICA CG force field. It should be feasible/possible to analyze that json file and fill in incorrectly assigned attribute values within the Blender instance so that you're working with the expected set of values.

I'm not familiar enough with the SPICA force field to say what those values are or what not. I can help with parsing the json and sending data to attributes but not so much with considering the values associated with that model.

alphataubio commented 6 months ago

im working on adding all the spica beads into elements dictionary starting with atomic number 1000, ill make a draft PR when data.py is ready

alphataubio commented 6 months ago

added vdv_radii up to 99 and spica cg particles to data.py see PR 486

SOURCES: