MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org

Read secondary structures #3790

Open richardjgowers opened 2 years ago

richardjgowers commented 2 years ago

Is your feature request related to a problem?

PDB files have quite a few fields, why not read this data?

Describe the solution you'd like

Atom attributes for secondary structure.

Describe alternatives you've considered

Additional context

IAlibay commented 2 years ago

Reading HELIX, etc. and dumping them into a residue-wide attribute isn't too bad imho. To speed this discussion up, can we come up with quick solutions for what we do about:

  1. When they are partially missing
  2. When we write but the coordinates have been modified (i.e. should we be able to write out the secondary structure if it's different from what it's meant to be?)

Those are the only two things I can think of that could be stalling points.

orbeckst commented 2 years ago

  1. set anything missing to None (or empty string "") so that it evaluates to False if necessary
  2. do nothing when anything changed: this is information from the original file; anything else would be in conjunction with, say, our own DSSP analysis
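
A minimal sketch of what point 1 could look like, in plain Python rather than the MDAnalysis API. The function names `read_secondary_structure` and `per_residue` and the one-letter labels "H" and "E" are hypothetical, chosen only for illustration. The column positions follow the HELIX and SHEET records of the PDB file format (v3.3). The sketch ignores insertion codes and assumes residue numbers are contiguous within each record.

```python
from typing import Dict, Iterable, List, Tuple

ResidueKey = Tuple[str, int]  # (chainID, resSeq)


def read_secondary_structure(lines: Iterable[str]) -> Dict[ResidueKey, str]:
    """Map (chainID, resSeq) to "H" (HELIX) or "E" (SHEET).

    Hypothetical helper, not part of MDAnalysis.
    """
    labels: Dict[ResidueKey, str] = {}
    for line in lines:
        record = line[:6]
        if record == "HELIX ":
            # HELIX: initChainID col 20, initSeqNum cols 22-25, endSeqNum cols 34-37
            chain, start, end = line[19], int(line[21:25]), int(line[33:37])
            code = "H"
        elif record == "SHEET ":
            # SHEET: initChainID col 22, initSeqNum cols 23-26, endSeqNum cols 34-37
            chain, start, end = line[21], int(line[22:26]), int(line[33:37])
            code = "E"
        else:
            continue
        # Assumption: residue numbering is contiguous between start and end.
        for resid in range(start, end + 1):
            labels[(chain, resid)] = code
    return labels


def per_residue(
    labels: Dict[ResidueKey, str],
    residues: Iterable[ResidueKey],
    missing: str = "",
) -> List[str]:
    """One value per residue; residues without a record get `missing`."""
    return [labels.get(key, missing) for key in residues]
```

With `missing=""`, a residue that no HELIX or SHEET record covers yields a falsy value, so `if label:` distinguishes annotated from unannotated residues. Nothing here is recomputed from coordinates; the values only reflect what the original file said.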

richardjgowers commented 2 years ago

I think not writing secondary structures back out probably solves a lot of headaches. It’s easier to consume a format than produce it.

IAlibay commented 2 years ago

I think not writing secondary structures back out probably solves a lot of headaches. It’s easier to consume a format than produce it.

Yeah but I need it written out for a certain project we may be working on 🙃

orbeckst commented 2 years ago

It’s easier to consume a format than produce it.

I don't think that this is true in general... PDB "format", I am looking at you ;-)

I hadn't considered writing, just what the Topology attr should contain. However, writing to PDB shouldn't be too hard (famous last words...).
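
For a sense of what writing would involve, here is a minimal sketch of formatting a single HELIX record, assuming the fixed-column layout of the PDB file format (v3.3). The function `format_helix_record` and its parameters are hypothetical and not part of MDAnalysis. It assumes a helix that stays on one chain with contiguous residue numbering, and it leaves the insertion code columns blank.

```python
def format_helix_record(
    serial: int,
    helix_id: str,
    init_resname: str,
    chain: str,
    init_resid: int,
    end_resname: str,
    end_resid: int,
    helix_class: int = 1,
    comment: str = "",
) -> str:
    """Return one 76-character HELIX record (hypothetical helper)."""
    # Assumption: contiguous numbering, so length is just the residue span.
    length = end_resid - init_resid + 1
    return (
        f"HELIX  {serial:>3d} {helix_id:>3s} "              # cols 1-15
        f"{init_resname:>3s} {chain:1s} {init_resid:>4d}  "  # cols 16-27
        f"{end_resname:>3s} {chain:1s} {end_resid:>4d} "     # cols 28-38
        f"{helix_class:>2d}{comment:<30.30s} {length:>5d}"   # cols 39-76
    )
```

The formatting is the easy part. The open question from earlier in the thread remains: whether the record should be written at all once the coordinates no longer match the annotation read from the original file.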