PDB-REDO / dssp

Application to assign secondary structure to proteins
BSD 2-Clause "Simplified" License

Question about cif output. #69

Closed by shuuul 1 year ago

shuuul commented 1 year ago

I am using the CIF output of DSSP. The PDB entry is 7UCK: https://www.rcsb.org/structure/7UCK

When I read the CIF file with Biopython or BioStructures.jl, both report the same list of chain IDs: 1,2,5,7,8,9,A,B,C,D,E,F,G,H,I,J,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,r,AA,Aa,BB,Bb,CC,Cc,DD,Dd,EE,Ee,FF,GG,Gg,HH,II,JJ,KK,LL,NN,OO,PP,QQ,RR,SS,TT,UU,VV,WW,XX,YY,ZZ.

However, in the dssp_output, the _dssp_struct_summary.label_asym_id values are G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, AA, BA, CA, DA, EA, FA, GA, HA, IA, JA, KA, LA, MA, NA, OA, PA, QA, RA, TA, UA, VA, WA, XA, YA, ZA, AB, BB, CB, DB, EB, FB, GB, HB, IB, JB, KB, LB, MB, NB, OB, PB, QB, RB, SB, TB, UB, VB, WB, XB, YB, ZB, AC.

How can I map the DSSP results onto the structure in Biopython?

drlemmus commented 1 year ago

Biopython seems to be using auth_asym_id, whereas DSSP uses label_asym_id. Perhaps you can make Biopython use the "label" system instead.

shuuul commented 1 year ago

Thanks for your reply! I will write a dict to translate the labels.
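For anyone landing here later, a minimal sketch of such a translation dict follows. It assumes the mmCIF file's _atom_site category carries both label_asym_id and auth_asym_id on every atom row, so the two columns can be paired up. The helper name build_label_to_auth, the toy columns, and the file name in the comment are hypothetical and not part of DSSP or Biopython. The toy values are illustrative only and are not the real 7UCK mapping.

```python
def build_label_to_auth(label_ids, auth_ids):
    """Pair two parallel _atom_site columns into {label_asym_id: auth_asym_id}.

    Several label_asym_id values may share one auth_asym_id (for example a
    polymer and its ligands), so the dict goes from label to auth.
    """
    if len(label_ids) != len(auth_ids):
        raise ValueError("columns must have the same length")
    mapping = {}
    for label, auth in zip(label_ids, auth_ids):
        # setdefault stores the first auth id seen for this label and returns it
        previous = mapping.setdefault(label, auth)
        if previous != auth:
            raise ValueError(
                f"label_asym_id {label!r} maps to both {previous!r} and {auth!r}"
            )
    return mapping


# Toy columns (made up for illustration, one entry per atom row).
toy_label = ["A", "A", "B", "C"]
toy_auth = ["1", "1", "2", "1"]
toy_mapping = build_label_to_auth(toy_label, toy_auth)
print(toy_mapping)

# With Biopython installed, the real columns could be read like this
# (the file name is an assumption):
#   from Bio.PDB.MMCIF2Dict import MMCIF2Dict
#   cif = MMCIF2Dict("7uck.cif")
#   mapping = build_label_to_auth(cif["_atom_site.label_asym_id"],
#                                 cif["_atom_site.auth_asym_id"])
```

Looking up a value from _dssp_struct_summary.label_asym_id in the resulting dict should then give the chain ID that Biopython reports, provided Biopython is indeed using auth_asym_id as suggested above.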