mittinatten / freesasa

C-library for calculating Solvent Accessible Surface Areas
http://freesasa.github.io/
MIT License
105 stars 37 forks source link

Wrong SASA after setting different atom radius #41

Closed fabiotrovato closed 5 years ago

fabiotrovato commented 5 years ago

Hi,

I was testing the attached code, which takes in input a pdb file (one Calpha atom and nothing else) and sets its radius to 3.8. I get the wrong sasa value, corresponding to a smaller Calpha radius. Here it is the output from the script.

output: CA radius 3.8 probe radius 1.4 1 CA 135.194041618

I followed the instructions on https://freesasa.github.io/python/intro.html#basic-calculations. Can you please explain why such a simple case is not working as I would expect (the sasa should be 339.8)? I have attached everything so you can try to reproduce what I see.

Thank you, Fabio

freesasa_test.py.gz test.pdb.gz

fabiotrovato commented 5 years ago

A workaround that I have found is to pass a list of radii to freesasa.Structure. In this case I recover the correct sasa. Please see below:

cg_struc = freesasa.Structure("test.pdb") rad_list = [3.8 for i in range(cg_struc.nAtoms())] cg_struc.setRadii(rad_list) cg_result = freesasa.calc(cg_struc) print cg_result.atomArea(idx)

output: 339.794661412

fabiotrovato commented 5 years ago

Along the same lines of the atomic radii ...

I have issues also with freesasa.calcCoord(). In this case I am running a script on a one-residue pdb structure and ~ 350 residue pdb structure (they contain all atoms, not only Calphas). Since I want to use only Calphas, I input Calphas coordinates and radii as per freesasa.calcCoord() specs. What I observe is that for the 350 residue protein the sasa in underestimated. For the one-residue case the sasa is correct. How is this possible if I am using exactly the same code on both structures?

To confirm that the calculation on the 350 residue protein is weird, I have modified the pdb file of the 350 residue protein in order to store ONLY Calphas. Then I used the code I've posted in the second message. The sasa of each Calpha now looks better, as it is larger.

mittinatten commented 5 years ago

Hi Fabio, I won't be able to test anything myself until earliest tomorrow. Have you verified that your classifier's radius function is actually called?

Simon

fabiotrovato commented 5 years ago

Hi Simon,

The relevant part of my code is this (I had already defined cg_classifier similarly to the example in the manual instructions): structure = freesasa.Structure("test.pdb",cg_classifier) result = freesasa.calc(structure) If I print cg_classifier.radius('*',atomName='CA') I get the radius I want, so that seems correct. I thought that freesasa.Structure("test.pdb",cg_classifier) was sufficient to give the right results. If not, what should I do?

mittinatten commented 5 years ago

Ok, I found the problem. There is a mistake in the documentation, one line is missing in the example , that is necessary to make it use your custom classifier for assigning radii: You have to add the line purePython = true to your classifier, i.e.

class CG_Calpha_Classifier(Classifier):
    purePython = True
    def classify(self, residueName, atomName):
        ...

It's mentioned in the docs for the Classifier class itself, http://freesasa.github.io/python/classes.html#classifier, but I will update the documentation to reflect this.

fabiotrovato commented 5 years ago

That is correct. I was going through the Classifier class and saw this. Thanks, now it works. There are a few minor things you might want to consider in the example at https://freesasa.github.io/python/intro.html. (1) Indentation of the last return of DerivedClassifier. (2) Argument of DerivedClassifier should be freesasa.Classifier or one should add "from freesasa import Classifier" at the beginning.

Cheers, Fabio

mittinatten commented 5 years ago

Thanks! I have updated the docs with your suggestions and the purePython correction, should show up shortly.

fabiotrovato commented 5 years ago

No problem. I would appreciate if you could advice also on freesasa.calcCoord() problem (third post). Given a pdb structure, I would like to compute the SASA of my Calpha-only representation (which I obtain with mdtraj) without the need of (i) writing the Calphas to disk and (ii) then loading the file into a Structure class.

Thank you.

mittinatten commented 5 years ago

If I understand you correctly you are performing calculations on a full PDB file but you only want to use the C-alphas. If you're file contains other atoms these will be included even if you set their radius to 0. The probe radius will be added, and therefore their SASA will be non-zero. So you would have to construct a structure that has only CA as you describe.

A solution could be initiate a structure from the original PDB, filter the structure by looping over the atoms and only add the coordinates and radii of C-alphas to new arrays which are then passed to freesasa.calcCoord().

fabiotrovato commented 5 years ago

Yes, that is what I was trying to say. Using mdtraj I read in the original PDB, then extract only the Calpha coordinates in an array (coord) and format them as required by calcCoord(). Calling freesasa.calcCoord(coord, rad_list), where rad_list is a list of radii that I define, does not work, I get the wrong sasa. I do not know what I am doing wrong (or misunderstanding)

rad_list = [3.8 for i in range(nca)] #nca is the number of Calphas coord = ca_trace.xyz[0] #from mdtraj. coord is a numpy array of shape (N,3) coord = coord.flatten() #required by calcCoord() result = freesasa.calcCoord(coord,rad_list) for idx in range(nca): sasa = result.atomArea(idx) #this sasa has the wrong value

mittinatten commented 5 years ago

Have you checked that flatten flattens the right way? I.e. do you get x, y, z, x, y, z, ... or x, x, x, ..., y, y, y, ...? On Thu, 22 Nov 2018 at 19:48, fabiotrovato notifications@github.com wrote:

Yes, that is what I was trying to say. Using mdtraj I read in the original PDB, then extract only the Calpha coordinates in an array (coord) and format them as required by calcCoord(). Calling freesasa.calcCoord(coord, rad_list), where rad_list is a list of radii that I define, does not work, I get the wrong sasa. I do not know what I am doing wrong (or misunderstanding)

rad_list = [3.8 for i in range(nca)] #nca is the number of Calphas coord = ca_trace.xyz[0] #from mdtraj. coord is a numpy array of shape (N,3) coord = coord.flatten() #required by calcCoord() result = freesasa.calcCoord(coord,rad_list) for idx in range(nca): sasa = result.atomArea(idx) #this sasa has the wrong value

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/mittinatten/freesasa/issues/41#issuecomment-441103116, or mute the thread https://github.com/notifications/unsubscribe-auth/ACteHBERhyfY3W11tCPZjYtAXoNX7nFDks5uxvGLgaJpZM4YvGRQ .

fabiotrovato commented 5 years ago

Yes, I did check this at the very beginning. Here it is again the output of a quick test:

a = np.array([[0,1,2],[3,4,5]]) #this mock arrau has the same mdtraj format i.e. [[x0,y0,z0],[x1,y1,z1]] a array([[0, 1, 2], [3, 4, 5]]) a.shape (2, 3) a.flatten() array([0, 1, 2, 3, 4, 5])

fabiotrovato commented 5 years ago

Let me give you some additional clues.

1) If I run my code on a two-residue pdb structure (two residues put very far apart), and feed the flattened array into calcCoord(), I get exactly the right sasa for each Calpha, i.e. 339.8. 2) If I use the same exact code on a true protein (1ndy) I get what I think are smaller sasa. How do I know that they are smaller? I know because I tried the "safe approach" i.e. creating a structure object from a modified pdb, the one with only Calphas, and the sasa values are 2 to 3 times larger, compatible with the fact that the radii that I define seem to be ignored. (Btw, I would like to avoid the "safe approach" because I feel it's slower than passing directly an array of coordinates to calcCoord()).

I do not understand if this has to do with the size of the coordinate arrays or else. Let me know if you want me to send the script and files to reproduce 1 and 2.

Best, Fabio

fabiotrovato commented 5 years ago

Hi Simon,

so I found the problem. I was using mdtraj to read in and manipulate coordinates, but I did not convert from nm (mdtraj uses nm, in fact) to Angstrom. This explains the discrepancies between the sasa values when feeding to calcCoord() the coordinates obtained from mdtraj and the sasa values when using the "safe approach" of loading directly the pdb in a freesasa-structure object. So, calcCoord() is perfectly fine, I was mistaken by a conversion factor.

Sorry about that, Fabio

mittinatten commented 5 years ago

Ok, great to hear! Been away from computer. I'll close the issue then, just reopen if anything else comes up! And thanks for the input on documentation.

maryamrhm907 commented 1 year ago

hi, I have got this error while using prodigy. _Running Prodigy for structure gsk3b2_complexfit24 ERROR: [!] Error when running freesasa: [!] Error: Radius array is <= 0 for the residue: HIS ,atom: O1P

I have O2P and O3P atoms too how can I fix this problem?

mittinatten commented 1 year ago

Hi, that error message does not come from FreeSASA itself, and I have no knowledge of Prodigy, so not sure I can help you. HIS does not have any atoms labelled O1P, so my guess is there is probably something non-standard in your structure. (Also, please open a new issue when it's not directly related to the issue at hand, so the other people in this thread don't get notifications).