Closed diamondspark closed 2 years ago
Hi @diamondspark, this is somewhat intentional. The graph attribute (g.graph["dssp_df"]) is mostly intended to be used for traceability/record keeping. The actual features should be available as node attributes:
for n, d in g.nodes(data=True):
    print(d.keys())
    break
If you require a dataframe, there is a function for retrieving node features as a dataframe.
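For illustration, here is a minimal sketch of how node attributes can be gathered into a dataframe using only networkx and pandas. This is not the library function mentioned above; the helper name node_features_to_dataframe, the node IDs, and the attribute names are assumptions made up for the example.

```python
import networkx as nx
import pandas as pd


def node_features_to_dataframe(g: nx.Graph) -> pd.DataFrame:
    """Collect per-node attribute dicts into a DataFrame indexed by node ID.

    Hypothetical helper for illustration; not part of any library API.
    """
    # g.nodes(data=True) yields (node_id, attribute_dict) pairs.
    return pd.DataFrame.from_dict(dict(g.nodes(data=True)), orient="index")


# Toy graph standing in for a protein graph; IDs and attributes are made up.
g = nx.Graph()
g.add_node("A:MET:1", ss="H", asa=120.0)
g.add_node("A:LYS:2", ss="E", asa=85.5)

df = node_features_to_dataframe(g)
print(df)
```

Each row corresponds to one node and each column to one node attribute, so the frame always has exactly as many rows as the graph has nodes.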
Do you have very strong feelings about this? I wonder what the best way to allow control of this is. Perhaps the DSSPConfig object could have a parameter controlling whether or not to filter the dataframe.
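As a rough sketch of what such a switch could look like: the class DSSPConfigSketch, its filter_to_graph_nodes field, and the select_dssp_rows helper below are all hypothetical and do not reflect the real DSSPConfig. The sketch also assumes the DSSP dataframe is indexed by the same node IDs as the graph.

```python
from dataclasses import dataclass

import networkx as nx
import pandas as pd


@dataclass
class DSSPConfigSketch:
    """Hypothetical stand-in for a config object; not the real DSSPConfig."""

    # Assumed flag: keep only DSSP rows whose residue is a node in the graph.
    filter_to_graph_nodes: bool = True


def select_dssp_rows(
    g: nx.Graph, dssp_df: pd.DataFrame, config: DSSPConfigSketch
) -> pd.DataFrame:
    """Return the DSSP dataframe, optionally restricted to the graph's nodes."""
    if not config.filter_to_graph_nodes:
        return dssp_df
    # Assumes dssp_df is indexed by node ID.
    return dssp_df.loc[dssp_df.index.isin(list(g.nodes))]


# Toy data: DSSP reports three residues, but the graph kept only two of them.
g = nx.Graph()
g.add_nodes_from(["A:MET:1", "A:LYS:2"])
dssp_df = pd.DataFrame(
    {"ss": ["H", "E", "-"]},
    index=["A:MET:1", "A:LYS:2", "A:HOH:3"],
)

filtered = select_dssp_rows(g, dssp_df, DSSPConfigSketch(filter_to_graph_nodes=True))
unfiltered = select_dssp_rows(g, dssp_df, DSSPConfigSketch(filter_to_graph_nodes=False))
```

With the flag on, the stored dataframe lines up with the graph's residues; with it off, the full DSSP output is kept for record keeping.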
With respect to applying DSSP to the unprocessed PDB: I think this is the correct thing to do. I don’t think DSSP will run correctly on, for example, a CA-only structure.
Hey @diamondspark any comments on this? If not I will close.
Hi @a-r-j I haven't had time to get back to this but I think g.nodes(data=True) should solve my purpose. Thank you once again.
In the method Protein/features/nodes/dssp/add_dssp_df(), Biopython's DSSP calculation is invoked on the downloaded, unprocessed PDB file. The resulting DSSP dataframe sometimes has a different number of residues than the protein graph generated as in #98. I believe the same preprocessing steps that are performed in Protein/graph.py are also needed in Protein/features/nodes/dssp/add_dssp_df().
E.g. PDB: 1utm, 2qrh
Thank you!