mittinatten / freesasa

C-library for calculating Solvent Accessible Surface Areas
http://freesasa.github.io/
MIT License

Proper check for PDB format #6

Closed JoaoRodrigues closed 9 years ago

JoaoRodrigues commented 9 years ago

Another little issue: when reading a PDB file that has trimmed lines (shorter than 80 characters), which is quite common with PDBs processed by other programs, freesasa fails to read any atom info and instead assigns the class 'unknown' to the atoms for the calculations. This makes things extremely slow and isn't easy to figure out at first. I'd suggest adding a small check to the PDB reading routine that tests not whether any atoms were read (that is already there) but whether any were read properly. Otherwise, just set the minimum line length so that occupancy is the last required field.

mittinatten commented 9 years ago

Wasn't aware of this. Do you have any example files?

mittinatten commented 9 years ago

I think it is fixed now, committed a patch to the dev branch.

JoaoRodrigues commented 9 years ago

Thanks! I can give you a test file for unit testing if necessary.

On Wed, Aug 12, 2015 at 11:37 AM, Simon Mitternacht notifications@github.com wrote:

I think it is fixed now, committed a patch to the dev branch.


mittinatten commented 9 years ago

Yes that would be good, thanks!

JoaoRodrigues commented 9 years ago

Here.

Running it gives this output:

## freesasa 0.1.1 ##
name: 3BZD.pdb
algorithm: Shrake & Rupley
probe-radius: 1.400000 A
n_thread: 1
n_testpoint: 100
time_elapsed: 41.485172 s
n_atoms: 2754

Total:   69762.71 A2
Polar:       0.00 A2
Apolar:  69762.71 A2

Fixing the lines and re-running:

## freesasa 0.1.1 ##
name: 3BZD_fixed.pdb
algorithm: Shrake & Rupley
probe-radius: 1.400000 A
n_thread: 1
n_testpoint: 100
time_elapsed: 0.090440 s
n_atoms: 2754

Total:   16064.83 A2
Polar:    8599.18 A2
Apolar:   7465.65 A2

mittinatten commented 9 years ago

With my patch the calculation runs normally and gives the same result for the trimmed file you provided and for one that I "fixed" by adding whitespace at the end of each line. But my results are, in both cases,

Total:   15955.55 A2
Polar:    6543.36 A2
Apolar:   9412.19 A2

Any chance a line was deleted or duplicated or something in your 3BZD_fixed.pdb? Or have you been customizing atomic radii maybe?

JoaoRodrigues commented 9 years ago

The PDB file is just padded so that each line has 80 characters.
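For reference, the padding described above can be sketched in C roughly as follows. This is only an illustration, not the script that produced 3BZD_fixed.pdb: the function name `pad_pdb_line` and the macro `PDB_LINE_WIDTH` are hypothetical and are not part of freesasa.

```c
#include <stddef.h>
#include <string.h>

#define PDB_LINE_WIDTH 80 /* full width of a PDB record (hypothetical macro name) */

/* Hypothetical helper, not part of freesasa.
 * Copies `line` into `out` and pads it with spaces up to PDB_LINE_WIDTH
 * columns. `out` must have room for PDB_LINE_WIDTH + 1 bytes.
 * A trailing "\n" or "\r\n" is dropped before padding, and input longer
 * than PDB_LINE_WIDTH is truncated.
 * Returns the number of spaces that were appended. */
size_t pad_pdb_line(const char *line, char *out)
{
    size_t len = strcspn(line, "\r\n"); /* length without the line ending */

    if (len > PDB_LINE_WIDTH) {
        len = PDB_LINE_WIDTH;
    }
    memcpy(out, line, len);
    memset(out + len, ' ', PDB_LINE_WIDTH - len);
    out[PDB_LINE_WIDTH] = '\0';
    return PDB_LINE_WIDTH - len;
}
```

Applying this to every line of a trimmed file and writing `out` followed by a newline would give a file in which each record is exactly 80 characters wide.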

I made a fresh clone of your master branch and ran it on the same 2 structures. I get different (although not 0) values for the classes:

joao@jr-mbp:src$ ./freesasa ~/Downloads/3BZD.pdb
## freesasa 0.1.1 ##
name: /Users/joao/Downloads/3BZD.pdb
algorithm: Shrake & Rupley
probe-radius: 1.400000 A
n_thread: 1
n_testpoint: 100
time_elapsed: 36.049243 s
n_atoms: 2754

Total:   74753.07 A2
Polar:     963.12 A2
Apolar:  73789.95 A2
joao@jr-mbp:src$ ./freesasa ~/Downloads/3BZD_fixed.pdb
## freesasa 0.1.1 ##
name: …ao/Downloads/3BZD_fixed.pdb
algorithm: Shrake & Rupley
probe-radius: 1.400000 A
n_thread: 1
n_testpoint: 100
time_elapsed: 0.090429 s
n_atoms: 2754

Total:   15955.55 A2
Polar:    6543.36 A2
Apolar:   9412.19 A2

mittinatten commented 9 years ago

Ok, but then we get consistent results. I added your file to the unit tests. Thanks for pointing this out!

mittinatten commented 9 years ago

Will add it to the master branch within a few days. In the meantime you can change the number 78 to 54 on line 66 in pdb.c if you don't want to use the dev branch.
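To illustrate why those two numbers matter: in the fixed-column PDB format, an ATOM record stores x, y and z in columns 31-38, 39-46 and 47-54, and the element symbol in columns 77-78. Assuming the number in pdb.c is a minimum line length (the thread does not show the code itself), 78 would demand the element columns while 54 only demands the coordinates. The sketch below shows a length check of that kind; `pdb_read_coord`, `pdb_field` and `PDB_MIN_ATOM_LEN` are hypothetical names, not the freesasa implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed minimum length: the z coordinate ends at column 54. */
#define PDB_MIN_ATOM_LEN 54

/* Converts columns [first, last] (1-based, inclusive) of `line` to a double.
 * The caller must make sure the line is at least `last` characters long. */
static double pdb_field(const char *line, int first, int last)
{
    char buf[16];
    int n = last - first + 1;

    memcpy(buf, line + first - 1, (size_t)n);
    buf[n] = '\0';
    return atof(buf);
}

/* Hypothetical reader, not the freesasa API.
 * Returns 1 and fills xyz if `line` is an ATOM/HETATM record that is long
 * enough to contain all three coordinates; returns 0 otherwise. */
int pdb_read_coord(const char *line, double xyz[3])
{
    size_t len = strcspn(line, "\r\n"); /* ignore the line ending */

    if (strncmp(line, "ATOM  ", 6) != 0 && strncmp(line, "HETATM", 6) != 0) {
        return 0;
    }
    if (len < PDB_MIN_ATOM_LEN) {
        return 0; /* coordinates are incomplete, reject the record */
    }
    xyz[0] = pdb_field(line, 31, 38);
    xyz[1] = pdb_field(line, 39, 46);
    xyz[2] = pdb_field(line, 47, 54);
    return 1;
}
```

With a check like this, a record trimmed right after the coordinates is still accepted, while one cut off inside the coordinate columns is rejected instead of being read with garbage values.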

JoaoRodrigues commented 9 years ago

Thanks!

mittinatten commented 9 years ago

Decided to add it as a minor release 0.1.2 in master.