tristanic / pae_to_domains

Graph-based community clustering approach to extract protein domains from a predicted aligned error matrix
MIT License

Add support for the new PAE JSON format #1

Closed Augustin-Zidek closed 2 years ago

Augustin-Zidek commented 2 years ago
tristanic commented 2 years ago

Thanks!

Will merge this... but can you tell me a little more about this?

> and the predicted aligned error is rounded to integers

As written, that's a bit worrisome for two reasons. First is the 5-fold loss of precision (previously I believe it was reported in steps of 0.2 Å, now just 1 Å steps). Second, PAE values less than 0.5 would presumably round down to zero, causing potential divide-by-zero issues for any code that wants to use 1/PAE. You could get the same level of compression without loss of precision or introduction of zeros by storing the data as int(round(pae*10))... is that a possibility?
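
To make the concern concrete, here is a minimal sketch comparing plain integer rounding with the scaled-integer storage suggested above. It assumes NumPy is available; the helper names `encode_scaled` and `decode_scaled` and the sample PAE values are hypothetical, chosen only for illustration, and are not part of this repository or of any published PAE format.

```python
import numpy as np

def encode_scaled(pae, scale=10):
    """Hypothetical helper: store PAE as integers in units of 1/scale Angstrom."""
    return np.rint(np.asarray(pae, dtype=float) * scale).astype(np.int32)

def decode_scaled(encoded, scale=10):
    """Hypothetical helper: recover floating-point PAE from the scaled integers."""
    return np.asarray(encoded, dtype=float) / scale

# Illustrative values on a 0.2 Angstrom grid (not real prediction output).
pae = np.array([0.2, 0.4, 1.6, 12.8, 31.6])

# Plain rounding to whole Angstroms: the two smallest values collapse to zero.
rounded = np.rint(pae).astype(np.int32)

# Scaled storage: still integers, but the 0.2 Angstrom steps survive.
scaled = encode_scaled(pae)
restored = decode_scaled(scaled)

print("rounded :", rounded.tolist())
print("scaled  :", scaled.tolist())
print("restored:", restored.tolist())

# Code that weights by 1/PAE has to guard against the zeros introduced by rounding.
safe_inverse = np.divide(1.0, rounded, out=np.zeros(rounded.shape), where=rounded > 0)
print("1/PAE with zero guard:", safe_inverse.tolist())
```

With these sample inputs the rounded array contains zeros while the scaled array does not. Note that a scale of 10 only pushes the problem down: a value below 0.05 would still be stored as 0, so a consumer that divides by PAE may want a guard either way.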