Dapid / tmscoring

Python implementation of the TMscore program
BSD 3-Clause "New" or "Revised" License

ValueError: operands could not be broadcast together with shapes (4,427) (4,432) - Different sequence length? #4

Closed gavieira closed 3 years ago

gavieira commented 3 years ago

Hi,

I've been trying to obtain RMSD values for two sequences, but I keep getting this error:

ValueError: operands could not be broadcast together with shapes (4,427) (4,432)

The two PDB files have different sequence lengths. I assume that is why I'm getting this error, even though the TMscore web server still manages to compute RMSD and TM-scores for these sequences (available in pdbs.zip).
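The error message itself follows from NumPy's broadcasting rule: shapes are compared axis by axis from the right, and two sizes are compatible only if they are equal or one of them is 1. The sketch below reimplements that rule in plain Python to show why (4,427) and (4,432) cannot be combined. The helper name `broadcastable` is hypothetical and is not part of tmscoring or NumPy; reading the second axis as the number of residues is an assumption based on the differing sequence lengths.

```python
from itertools import zip_longest


def broadcastable(shape_a, shape_b):
    """Return True if two shapes are compatible under NumPy's broadcasting rule."""
    # Compare sizes from the trailing axis backwards; a missing leading
    # axis is treated as size 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            return False
    return True


# Shapes taken from the error message (second axis assumed to be residues).
print(broadcastable((4, 427), (4, 432)))  # False: 427 != 432 and neither is 1
print(broadcastable((4, 427), (4, 427)))  # True: identical shapes
print(broadcastable((4, 427), (4, 1)))    # True: a size-1 axis can be stretched
```

So any element-wise operation on the two coordinate arrays fails unless both structures contribute the same number of columns.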

Any tips on how to run this module on sequences of differing lengths?

Thank you very much for the help and the program!

Dapid commented 3 years ago

By default, we match residues by index, which requires both structures to use the same numbering. You can instead run it in "alignment" mode, where we do a sequence alignment:

alignment = tmscoring.TMscoring('2pa6_eno_arch.pdb', 'BAA81473.1.B99990001.pdb', mode='align')

That works, and gives a TM score of 0.647 and RMSD of 3.076.

Why is that mode not documented? I wonder the same thing; I must have missed it.

It would be reasonable to make indexing mode work for sequences of different sizes, keeping only the residues that match. That should be easy enough to add.
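A minimal sketch of what "keeping only the ones that match" could look like, assuming each structure has already been reduced to a mapping from residue number to CA coordinates. The function `common_residues` and the toy data are hypothetical illustrations, not the library's actual implementation.

```python
def common_residues(coords_1, coords_2):
    """Keep only residues whose number appears in both structures.

    coords_1, coords_2: dicts mapping residue number -> (x, y, z).
    Returns two equally long coordinate lists in matching residue order.
    """
    shared = sorted(set(coords_1) & set(coords_2))
    return [coords_1[k] for k in shared], [coords_2[k] for k in shared]


# Toy data: structure 2 lacks residue 1 and has an extra residue 4.
s1 = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0)}
s2 = {2: (3.9, 0.1, 0.0), 3: (7.5, 0.0, 0.2), 4: (11.4, 0.0, 0.0)}

kept_1, kept_2 = common_residues(s1, s2)
print(len(kept_1), len(kept_2))  # 2 2
```

After this filtering both coordinate sets have the same length, so they can be stacked into arrays of identical shape. This only makes sense when the two files share a numbering scheme; otherwise the sequence-alignment mode is the safer choice.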

Geraldene commented 3 years ago

When running in 'alignment' mode I am getting the following error:

    for i, r in enumerate(structure1.get_residues())
  File "<__array_function__ internals>", line 6, in hstack
  File "C:\Users\geral\AppData\Local\Programs\Python\Python37\lib\site-packages\numpy\core\shape_base.py", line 345, in hstack
    return _nx.concatenate(arrs, 1)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: need at least one array to concatenate

Has this been encountered before? What can I do to resolve this issue? Thanks in advance!
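The final message means that hstack received an empty sequence, so no arrays were collected while looping over the residues. The thread does not establish why. One possibility, stated here purely as an assumption, is that the chain being read contains no residues with a CA atom, for example because the file uses a different chain ID than expected. A quick way to check is to count CA atoms per chain directly from the PDB text. The helper `count_ca_per_chain` below is a hypothetical diagnostic and not part of tmscoring; it relies only on the fixed-width PDB columns (atom name in columns 13-16, chain ID in column 22).

```python
from collections import Counter


def count_ca_per_chain(pdb_lines):
    """Count C-alpha atoms per chain ID in an iterable of PDB-format lines."""
    counts = Counter()
    for line in pdb_lines:
        # Only ATOM records long enough to carry a chain ID are considered.
        if line.startswith("ATOM") and len(line) > 21:
            if line[12:16].strip() == "CA":
                counts[line[21]] += 1
    return counts


# Toy input: every residue belongs to chain B, none to chain A.
toy_lines = [
    "ATOM      1  N   MET B   1      27.340  24.430   2.614",
    "ATOM      2  CA  MET B   1      26.266  25.413   2.842",
    "ATOM      3  CA  GLY B   2      24.100  26.000   3.100",
]

counts = count_ca_per_chain(toy_lines)
print(dict(counts))  # {'B': 2}
print(counts["A"])   # 0
```

On a real file, pass the open file object, e.g. `count_ca_per_chain(open('model.pdb'))` with your own path. A count of zero for the chain you expect would be consistent with the empty-concatenation error, though other causes are possible.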