douweschulte / pdbtbx

A library to open/edit/save (crystallographic) Protein Data Bank (PDB) and mmCIF files in Rust.
https://crates.io/crates/pdbtbx
MIT License

lazystatic for storing Residue and Chain ids #87

DocKDE closed 2 years ago

DocKDE commented 2 years ago

Fixes #86
Fixes #81

I tried a different approach from the previous one. The idea stays the same: check membership in a set to decide whether the Residue or Chain Vec needs to be traversed at all. This time, however, I stored both sets in a global struct inside a lazy_static, wrapped in a Mutex for thread-safe access and mutability. I chose this because I think storing the information in one of the existing structs would have caused problems with shared mutability and access.

I profiled this with large.pdb and it drastically reduces the time taken by the add_atom methods.

Problems will arise whenever someone parses a PDB file, removes whole residues or chains, and then adds atoms again, because the sets would still contain the removed ids. As you said, the information needs to be kept in sync to prevent this, but I'm not sure yet how to do that without massive overhead.
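The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this branch: it uses the standard library's `OnceLock` in place of the `lazy_static` crate so that it has no dependencies, and `SeenIds`, `seen_ids`, `register_chain`, `register_residue` and `forget_chain` are hypothetical names that do not exist in pdbtbx.

```rust
use std::collections::HashSet;
use std::sync::{Mutex, OnceLock};

/// Hypothetical registry of the chain and residue ids seen so far.
#[derive(Default)]
struct SeenIds {
    chains: HashSet<String>,
    /// Residues keyed by (chain id, serial number); this key is an assumption.
    residues: HashSet<(String, isize)>,
}

/// Process-wide registry, created lazily on first access and guarded by a Mutex.
fn seen_ids() -> &'static Mutex<SeenIds> {
    static SEEN: OnceLock<Mutex<SeenIds>> = OnceLock::new();
    SEEN.get_or_init(|| Mutex::new(SeenIds::default()))
}

/// Returns true if the chain id was not known yet, so the caller can push a
/// new chain without traversing the existing Vec of chains.
fn register_chain(id: &str) -> bool {
    seen_ids().lock().unwrap().chains.insert(id.to_string())
}

/// Same idea for residues: true means "new, no traversal needed".
fn register_residue(chain: &str, serial: isize) -> bool {
    seen_ids()
        .lock()
        .unwrap()
        .residues
        .insert((chain.to_string(), serial))
}

/// Has to be called whenever a chain is removed from the structure,
/// otherwise the registry goes stale and later lookups are wrong.
fn forget_chain(id: &str) {
    let mut seen = seen_ids().lock().unwrap();
    seen.chains.remove(id);
    seen.residues.retain(|(c, _)| c.as_str() != id);
}
```

In this sketch, an `add_atom`-style method would call `register_chain` first and only search the existing chains when it returns `false`. The `forget_chain` function shows the kind of bookkeeping that removal would need in order to keep the sets in sync with the structure.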

DocKDE commented 2 years ago

I pushed a change that should fix the breakage caused by the sets and the structs getting out of sync, at least in the scenario I described earlier. I don't really understand why the tests and examples broke, though.

DocKDE commented 2 years ago

I still don't understand what's going on with the breakage. The sphere example fails because no Atoms could be parsed from the 1ubq.pdb file, but when I try the very same code in a separate crate and use this branch as a dependency, it works. Also, when I run the whole test suite, six tests fail, but when I run those tests on their own by filtering out the rest, they pass. Do you have any clue what's going on there?
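One possible explanation for the "fails together, passes alone" symptom, offered as an assumption and not a confirmed diagnosis: `cargo test` runs all tests of a test binary in the same process, so a process-wide static is shared between every test that parses a file. The sketch below shows how a second parse can then lose data. `seen_chains` and `parse_chain_ids` are hypothetical stand-ins, not pdbtbx functions, and `OnceLock` again replaces `lazy_static` to keep the example dependency-free.

```rust
use std::collections::HashSet;
use std::sync::{Mutex, OnceLock};

/// Process-wide set of chain ids, shared by every caller in the process.
fn seen_chains() -> &'static Mutex<HashSet<String>> {
    static SEEN: OnceLock<Mutex<HashSet<String>>> = OnceLock::new();
    SEEN.get_or_init(|| Mutex::new(HashSet::new()))
}

/// Hypothetical stand-in for a parser. Every call builds a fresh structure
/// but consults the shared set. Returns the number of chains it created.
fn parse_chain_ids(ids: &[&str]) -> usize {
    let mut chains: Vec<String> = Vec::new();
    for id in ids {
        if seen_chains().lock().unwrap().insert(id.to_string()) {
            // Id not seen before in this process: create the chain.
            chains.push(id.to_string());
        }
        // Otherwise the chain is assumed to exist already in `chains`,
        // which is false when an earlier parse registered the id.
    }
    chains.len()
}
```

Calling `parse_chain_ids(&["X", "Y"])` twice in one process creates two chains the first time and none the second time, because the ids are still registered from the first call. Whether this is what happens on this branch would need to be checked, for example by running the tests with `cargo test -- --test-threads=1` and seeing whether the order of the tests changes which ones fail.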