douweschulte / pdbtbx

A library to open/edit/save (crystallographic) Protein Data Bank (PDB) and mmCIF files in Rust.
https://crates.io/crates/pdbtbx
MIT License

Panic while parsing #96

Closed: bddap closed this issue 2 years ago

bddap commented 2 years ago

While parsing the PDB file from https://www.rcsb.org/structure/2BTV, I get a panic:

thread 'main' panicked at 'The given remark-type-number is not valid, see wwPDB v3.30 for valid remark-type-numbers', ~/.cargo/registry/src/github.com-1ecc6299db9ec823/pdbtbx-0.9.2/src/structs/pdb.rs:114:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

To reproduce:

pdbtbx::open_pdb_raw(
    std::io::BufReader::new(std::io::Cursor::new(include_bytes!("2btv.pdb"))),
    pdbtbx::Context::None,
    pdbtbx::StrictnessLevel::Loose,
);

bddap commented 2 years ago

https://github.com/douweschulte/pdbtbx/blob/c3dedf0a41f9d1d42492abd6d275c58870bd4a1d/src/structs/pdb.rs#L114

douweschulte commented 2 years ago

Thanks for raising the issue. I somehow missed 400 as a valid remark type number. While fixing that, I also improved the error message: in the future, errors like this will look different and will be printed together with all other errors:

StrictWarning: Remark type number invalid
    ╷
446 │ REMARK 400 MOLECULE 2 HAS SEGID 2201, VP7Q MOLECULE 3 HAS SEGID 2301,
    ·        ───
    ╵
The remark-type-number is not valid, see wwPDB v3.30 for all valid numbers.

I also verified that there are no other issues with parsing this file, so you will be able to use it for your work.
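
For readers following along, the general idea behind the fix (report an unrecognised remark-type-number as a value the caller can inspect, rather than panicking) can be sketched roughly as below. This is not the pdbtbx implementation: `check_remark_line`, `RemarkIssue`, and `KNOWN_REMARK_TYPES` are hypothetical names, and the list of numbers is a small illustrative subset, not the full wwPDB v3.30 list. The sketch assumes the number sits in columns 8 to 10 of the line, as in the `REMARK 400` line shown above.

```rust
/// Illustrative subset only (an assumption for this sketch); the complete
/// list of valid remark-type-numbers is defined by the wwPDB format spec.
const KNOWN_REMARK_TYPES: &[usize] = &[2, 3, 350, 400, 465];

/// Hypothetical error type: describes why a line was rejected.
#[derive(Debug, PartialEq)]
enum RemarkIssue {
    /// The line does not start with the `REMARK` record name.
    NotARemark,
    /// Columns 8-10 did not contain a number; carries the raw field.
    UnparsableNumber(String),
    /// The number parsed, but is not in the known list.
    UnknownNumber(usize),
}

/// Hypothetical helper: returns the remark-type-number of a REMARK line,
/// or a `RemarkIssue` the caller can collect and report. It never panics.
fn check_remark_line(line: &str) -> Result<usize, RemarkIssue> {
    if !line.starts_with("REMARK") {
        return Err(RemarkIssue::NotARemark);
    }
    // Columns 8-10 (1-based) are byte offsets 7..10; `get` returns None
    // on short lines instead of panicking like direct slicing would.
    let field = line.get(7..10).unwrap_or("").trim();
    let number: usize = field
        .parse()
        .map_err(|_| RemarkIssue::UnparsableNumber(field.to_string()))?;
    if KNOWN_REMARK_TYPES.contains(&number) {
        Ok(number)
    } else {
        Err(RemarkIssue::UnknownNumber(number))
    }
}
```

A caller would typically push each `Err` into a list of warnings and continue with the next line, which matches the behaviour described above where the warning is printed together with all other errors.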

bddap commented 2 years ago

Awesome! The change both removes the panic and fixes parsing of that specific file. Thanks for getting to this so quickly.