salilab / imp

The Integrative Modeling Platform
https://integrativemodeling.org
GNU General Public License v3.0

Running Multifit without providing enough information to fill the volume #803

Open jbluttrell opened 10 years ago

jbluttrell commented 10 years ago

Using the techniques described in the MultiFit tutorial (https://integrativemodeling.org/nightly/doc/tutorial/multifit_3sfd.html), I was wondering whether it is possible to run MultiFit without enough PDB files to make a complete representation of the volume. For example, what if, in the tutorial example, I left out two of the PDB files and only had 3sfdA and 3sfdB? After making the appropriate changes to subunits.txt and the other places, I got some errors right at the end, just before the models were going to be output. I know there are ways to insert filler shapes into the volume to create a barrier of sorts, or otherwise keep possible conformations out of a certain region, but should that be necessary just to run MultiFit?

Here are my input and output, using the files from the tutorial example.

multifit.py refine_fft 3sfd.asmb.input 3sfd.asmb.input.refined 3sfd.asmb.proteomics 3sfd.indexes.mapping.input 3sfd.asmb.combinations 0

number of found assignments:0
Number of found assignments :0
after align
begin Model 0::do_destroy:
WARNING Object "DataObject 0" was never used. See the IMP::Object documentation for an explanation.
WARNING Object "DataObject 1" was never used. See the IMP::Object documentation for an explanation.
end Model 0::do_destroy
WARNING Object "ProteomicsEMAlignmentAtomic0" was never used. See the IMP::Object documentation for an explanation.

multifit.py models -m 5 3sfd.asmb.input 3sfd.asmb.proteomics 3sfd.indexes.mapping.input 3sfd.asmb.combinations model

Traceback (most recent call last):
  File "/usr/local/bin/multifit.py", line 13, in <module>
    main()
  File "/usr/local/bin/multifit.py", line 10, in main
    c.main()
  File "/usr/local/lib/IMP-python/IMP/kernel/__init__.py", line 7041, in main
    self.do_command(command)
  File "/usr/local/lib/IMP-python/IMP/kernel/__init__.py", line 7093, in do_command
    mod.main()
  File "/usr/local/lib/IMP-python/IMP/multifit/models.py", line 48, in main
    run(args[0], args[1], args[2], args[3], args[4], options.max)
  File "/usr/local/lib/IMP-python/IMP/multifit/models.py", line 28, in run
    combs = IMP.multifit.read_paths(combs_fn)
  File "/usr/local/lib/IMP-python/IMP/multifit/__init__.py", line 1900, in read_paths
    return _IMP_multifit.read_paths(*args)
RuntimeError: bad lexical cast: source type value could not be interpreted as target
umc-251243:3sfdTest luttrell$


Is there an option to enable doing something like this, or is it possible that I have just missed a step? Thanks.

benmwebb commented 10 years ago

Of course the last step won't work, since the refine_fft returned zero assignments. You might be able to make that work by tweaking some of the MultiFit parameters. It was certainly not designed to handle missing regions, but to the best of my knowledge there's no penalty for "unassigned" volume, so it should work.
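A failure like the one above can be caught with a small pre-flight check before the `models` step is run. The following is only a minimal sketch and is not part of MultiFit. It assumes (this is not confirmed anywhere in this thread) that a combinations file holds one combination per line, written as whitespace-separated integer indexes. The function name `count_parsable_combinations` is hypothetical.

```python
def count_parsable_combinations(lines):
    """Count lines made up only of whitespace-separated integers.

    ASSUMPTION: one combination per line, written as integer indexes.
    This format is assumed for illustration, not taken from the MultiFit docs.
    """
    count = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # blank line, nothing to parse
        try:
            for field in fields:
                int(field)
        except ValueError:
            continue  # at least one token is not an integer
        count += 1
    return count


# In-memory stand-ins for file contents (made-up values, not real output).
good = ["0 1 2\n", "1 0 2\n"]
empty = []
print(count_parsable_combinations(good))
print(count_parsable_combinations(empty))
```

With a real run, the open file object for the combinations file can be passed in place of the lists, and the `models` step skipped when the count is zero.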

jbluttrell commented 10 years ago

I am actually looking into this right now. Out of curiosity, I started testing scenarios in which a region of residues has been deleted from a structure in a PDB file. I took an arbitrary PDB file, such as 3sfd, and used Chimera's molmap command to generate a map of the structure. Then I deleted residues from the original PDB file until MultiFit reached zero assignments for the generated map. What I may have found is that there is always (in the cases I have seen) a single cutoff residue: MultiFit can find assignments when residues are deleted up to that point, but not when residues beyond it are deleted as well. Furthermore, this cutoff seems to fall at a similar place in different structures (I have seen four cases of this). Does this sound interesting or useless? Thanks.
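The residue-deletion experiment described above could be scripted along the following lines. This is a rough sketch under stated assumptions, not the procedure actually used in this thread. It relies only on the fixed-column layout of PDB ATOM/HETATM records (chain ID in column 22, residue number in columns 23-26). The names `drop_residue_range` and `make_ca_record` are hypothetical, and the demo records are synthetic rather than taken from 3sfd.

```python
def drop_residue_range(pdb_lines, chain_id, first, last):
    """Return PDB lines with residues first..last (inclusive) of one chain removed."""
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM  ", "HETATM")) and len(line) >= 26:
            try:
                res_seq = int(line[22:26])  # columns 23-26: residue number
            except ValueError:
                res_seq = None  # malformed field: keep the line untouched
            in_chain = line[21] == chain_id  # column 22: chain ID
            if in_chain and res_seq is not None and first <= res_seq <= last:
                continue  # drop this atom
        kept.append(line)
    return kept


def make_ca_record(serial, chain_id, res_seq):
    """Build a truncated, synthetic CA record for the demo (no coordinates)."""
    return "ATOM  %5d  CA  ALA %s%4d" % (serial, chain_id, res_seq)


# Synthetic chain A with residues 1-5, followed by an END record.
demo = [make_ca_record(i, "A", i) for i in range(1, 6)] + ["END"]
for line in drop_residue_range(demo, "A", 3, 5):
    print(line)
```

Calling the function repeatedly with a growing range, writing each result to a new file, and rerunning MultiFit on it would give one way to locate the residue at which the assignments drop to zero.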