volkamerlab / opencadd

A Python library for structural cheminformatics
https://opencadd.readthedocs.io
MIT License

Why might the aligner be failing on these PDB files? #155

Open noahharrison64 opened 1 year ago

noahharrison64 commented 1 year ago

Hi, I've been having issues using the MDAnalysisAligner on a certain subset of PDB files, found here. These files have been renumbered so that residue numbering is consistent, and protonated with Protoss through the protein.plus web service.

The code cell I'm running to attempt alignment is as follows:

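from opencadd.structure.core import Structure
from opencadd.structure.superposition.api import align, METHODS

# Renumbered, Protoss-protonated input files
protonated_pdbs = ['4e5w_renum_protoss.pdb', '5wo4_renum_protoss.pdb']
structures = [Structure.from_string(f) for f in protonated_pdbs]
# One MDAnalysis selection string per structure
user_select = ["backbone and name CA", "backbone and name CA"]
out = align(structures, user_select, method=METHODS["mda"])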

This throws the following error:

    142     coverage = len(ref_atoms)
    143 else:
--> 144     raise ValueError(
    145         "The number of atoms to match has to be the same for both structures."
    146     )
    148 # Compute initial RMSD (no preprocessing)
    149 initial_rmsd = rms.rmsd(ref_atoms.positions, mobile_atoms.positions)

ValueError: The number of atoms to match has to be the same for both structures.

What's strange is that I've had no problem running this aligner on other non-matching structures, so I'm not sure why it raises this error for these files in particular. Any suggestions would be welcome! Thanks, Noah
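Since the error is about the number of atoms to match, one way to narrow this down is to count the CA atoms in each file and look for residues that contribute more than one CA or that appear in only one structure. The sketch below is a plain-Python check that reads the fixed-column `ATOM` records directly. It is not how opencadd or MDAnalysis selects atoms, and the helper names (`ca_atoms`, `summarize`, `residues_only_in_first`) are hypothetical. Alternate locations and missing residues are only guesses at a cause here, not something confirmed for these files.

```python
from collections import Counter


def ca_atoms(pdb_text):
    """Return (chain, resseq, icode, altloc, resname) for each CA atom in ATOM records."""
    atoms = []
    for raw in pdb_text.splitlines():
        if not raw.startswith("ATOM"):
            continue  # HETATM records (ligands, calcium ions, ...) are skipped
        line = raw.ljust(80)  # pad short lines so fixed-column slicing is safe
        if line[12:16].strip() != "CA":
            continue
        atoms.append(
            (line[21], int(line[22:26]), line[26], line[16], line[17:20].strip())
        )
    return atoms


def summarize(pdb_text):
    """Count CA atoms and list residues that carry more than one CA atom."""
    atoms = ca_atoms(pdb_text)
    per_residue = Counter((chain, resseq, icode) for chain, resseq, icode, _, _ in atoms)
    duplicated = sorted(key for key, count in per_residue.items() if count > 1)
    return {"n_ca": len(atoms), "n_residues": len(per_residue), "duplicated": duplicated}


def residues_only_in_first(pdb_text_a, pdb_text_b):
    """Residue keys (chain, resseq, icode) with a CA atom in A but not in B."""
    keys_a = {atom[:3] for atom in ca_atoms(pdb_text_a)}
    keys_b = {atom[:3] for atom in ca_atoms(pdb_text_b)}
    return sorted(keys_a - keys_b)


# Usage with the files from this issue (assumes they are in the working directory):
#   texts = [open(p).read() for p in ("4e5w_renum_protoss.pdb", "5wo4_renum_protoss.pdb")]
#   print([summarize(t) for t in texts])
#   print(residues_only_in_first(texts[0], texts[1]))
```

If `n_ca` and `n_residues` differ within a file, or the two files list different residues, that would at least show where the selections stop lining up before the aligner is called.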