psipred / Merizo

Fast and accurate protein domain segmentation using Invariant Point Attention
GNU General Public License v3.0

Merizo's Methodology for Non-Domain Region Detection #7

Open TerkaSlan opened 5 days ago

TerkaSlan commented 5 days ago

Hi, thank you for publishing and open-sourcing your tool. I find it very valuable in my own research.

I have a question regarding the "non-domain regions (NDR)" computation which you detail in the "Methods" section of the paper, in which you state: "Residues were dichotomised into either NDR (class 0) or non-NDR (class 1) categories based on two criteria: (1) the residue plDDT is less than 60, and (2) the standard deviation of PAE values for the residue is less than 0.4".

I tried to implement these two rules, but I am confused about the second one. PAE values typically range from 0 to ~30, so a standard deviation below 0.4 for any row or column of the matrix seems very low. In fact, it would not filter out, for example, the separate long helix (residues ~580 to ~650) of C9JQI7, which you include as an example in Fig. 3d. This makes me wonder whether you are using a normalized version of PAE, or whether I am missing something.
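To make the question concrete, here is a minimal sketch of the two criteria read literally from the quoted sentence. It is not Merizo's implementation. The function name `ndr_mask`, the choice to take the standard deviation over rows, the requirement that both criteria hold together, and the optional `normalize_pae` rescaling by the matrix maximum are all assumptions made for illustration only.

```python
import numpy as np

def ndr_mask(plddt, pae, plddt_cutoff=60.0, pae_std_cutoff=0.4, normalize_pae=False):
    """Return a boolean array that is True where a residue would be flagged as NDR.

    Assumptions (not confirmed by the paper or the Merizo code):
      * both criteria must hold at once (logical AND),
      * the PAE standard deviation is taken over each row,
      * normalize_pae is a hypothetical rescaling of PAE to [0, 1].
    """
    plddt = np.asarray(plddt, dtype=float)
    pae = np.asarray(pae, dtype=float)
    if pae.shape != (plddt.size, plddt.size):
        raise ValueError("pae must be an L x L matrix matching the length of plddt")

    if normalize_pae:
        # Hypothetical normalisation: divide by the largest PAE value.
        max_pae = pae.max()
        if max_pae > 0:
            pae = pae / max_pae

    # Per-residue spread of PAE values (one value per row).
    pae_std = pae.std(axis=1)

    return (plddt < plddt_cutoff) & (pae_std < pae_std_cutoff)


# Toy three-residue example with made-up values.
toy_plddt = [90.0, 40.0, 50.0]
toy_pae = [
    [0.0, 10.0, 20.0],
    [5.0, 5.0, 5.0],
    [5.0, 10.0, 30.0],
]

raw_mask = ndr_mask(toy_plddt, toy_pae)
scaled_mask = ndr_mask(toy_plddt, toy_pae, normalize_pae=True)
print(raw_mask, scaled_mask)
```

With raw PAE in ångströms, the third toy residue has low pLDDT but a row standard deviation of roughly 10.8, so the 0.4 cutoff does not flag it. After the hypothetical rescaling to [0, 1] it falls under the cutoff, which is the kind of difference the question above is about.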

Thanks in advance. Best, Terezia