douweschulte / pdbtbx

A library to open/edit/save (crystallographic) Protein Data Bank (PDB) and mmCIF files in Rust.
https://crates.io/crates/pdbtbx
MIT License

Refactored to use HashMap while parsing #81 #88

Closed: douweschulte closed this 2 years ago

douweschulte commented 2 years ago

@DocKDE Here I refactored the PDB parsing to use HashMaps internally, which are converted into the appropriate collections after parsing. The major problem is that HashMaps do not retain insertion order, so I had to add a PDB::full_sort call to the parsing routine to keep all tests passing. This could potentially be solved by using IndexMap instead. I am very interested in your view on this approach.
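A minimal sketch of the idea described above, using only the standard library. The `Atom` and `Residue` types and the `group_atoms` function are hypothetical stand-ins, not pdbtbx's actual types or parsing code. Records are collected into a `HashMap` keyed by residue number, then converted into a `Vec` and sorted explicitly, because `HashMap` iteration order is unspecified. The explicit sort plays the role that PDB::full_sort plays in the PR.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified atom record; not pdbtbx's actual `Atom` type.
#[derive(Debug, Clone, PartialEq)]
pub struct Atom {
    pub serial: usize,
    pub name: String,
}

/// Hypothetical residue: a sequence number plus the atoms belonging to it.
#[derive(Debug, Clone, PartialEq)]
pub struct Residue {
    pub number: isize,
    pub atoms: Vec<Atom>,
}

/// Group `(residue number, atom)` pairs with a HashMap for fast lookup while
/// parsing, then turn the map into an ordered Vec once parsing is done.
pub fn group_atoms(records: Vec<(isize, Atom)>) -> Vec<Residue> {
    let mut by_residue: HashMap<isize, Vec<Atom>> = HashMap::new();
    for (number, atom) in records {
        // Average O(1) lookup instead of scanning a Vec for the right residue.
        by_residue.entry(number).or_default().push(atom);
    }

    let mut residues: Vec<Residue> = by_residue
        .into_iter()
        .map(|(number, atoms)| Residue { number, atoms })
        .collect();

    // HashMap iteration order is unspecified, so restore a deterministic
    // order explicitly (the stand-in for the sorting step in the PR).
    residues.sort_by_key(|residue| residue.number);
    for residue in &mut residues {
        residue.atoms.sort_by_key(|atom| atom.serial);
    }
    residues
}
```

An insertion-ordered map such as the one provided by the `indexmap` crate would keep residues in the order they were first seen, which is why it is suggested as a way to avoid the extra sort. Whether that matches the ordering the tests expect is not something this sketch can confirm.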

DocKDE commented 2 years ago

Thanks! I'll wait for the tests to run and then merge.

douweschulte commented 2 years ago

Fine with me. It was nice to see how the code evolved over the past few days and across the different PRs. And I do think we have a very nice solution with huge performance potential.

DocKDE commented 2 years ago

> Fine with me. It was nice to see how the code evolved over the past few days and across the different PRs. And I do think we have a very nice solution with huge performance potential.

Agreed. I'm still wondering how to get parsing of that 1HTQ monster to acceptable timings but... eh. Also, it won't parse properly anyway because the validation throws an error :D