Closed bicycle1885 closed 4 years ago
@jgreener64, any thoughts? Is PDBx/mmCIF popular enough to support in BioJulia? I have no idea.
You're right that the PDB standard format has switched to mmCIF. In my experience PDB is still the preferred format for people in the field, though that could be due to slow uptake of the new format.
Writing an mmCIF parser for BioJulia has been on my wish list for a while but realistically I won't have time in the near future - obviously I would be keen to talk design and review code if someone else wanted to do it.
It's also worth mentioning MMTF at this point, a new binary format supported by the PDB. I started a Julia encoder/decoder for it but didn't get round to finishing it.
Thank you, @jgreener64. I didn't know about MMTF. It seems promising since text-based file formats are, yes, slow.
Anyway, I will take a further look at mmCIF when I have time.
For the record, an mmCIF reader/writer is now implemented in https://github.com/BioJulia/BioStructures.jl.
mmCIF is implemented in BioStructures.jl, MMTF is in the works and PDBML is a "someday" feature. Any discussion on this can be continued at BioStructures.jl.
Thanks @jgreener64!
Since the Protein Data Bank (PDB) has switched its standard file format from PDB to mmCIF, it is desirable to support PDBx/mmCIF (and PDBML/XML). I couldn't find a formal description of the format, but it seems to be simple judging from some examples. If it is flat (it seems to be so), we can use Automa.jl to generate a parser for it. PDBML/XML is XML, so I think it's easier to support using EzXML.jl.
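To make the "flat" observation concrete, here is a rough sketch of how the simple parts of an mmCIF file could be read. It is written in Python purely for illustration and is not how Automa.jl or BioStructures.jl does it. The sample data and the `parse_flat_mmcif` name are made up for this example. It only handles single-line `_tag value` items and `loop_` tables. Real mmCIF files also contain multi-line values delimited by `;`, which this sketch ignores.

```python
import shlex

# Hypothetical, hand-written sample; not taken from a real PDB entry.
SAMPLE = """\
data_1ABC
_entry.id 1ABC
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.Cartn_x
ATOM 1 N 11.104
ATOM 2 CA 12.560
#
"""


def parse_flat_mmcif(text):
    """Parse single-line items and loop_ tables into a dict.

    Assumption: no multi-line ';' text fields and no unbalanced quotes.
    Single items map tag -> str; looped items map tag -> list of str.
    """
    result = {}
    loop_tags = None  # None outside a loop_, a list of tags while inside one
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            # Blank lines and '#' separators end the current loop_.
            loop_tags = None
        elif line.startswith("data_"):
            result["data"] = line[len("data_"):]
            loop_tags = None
        elif line == "loop_":
            loop_tags = []
        elif line.startswith("_"):
            tokens = shlex.split(line)  # shlex handles quoted values
            if loop_tags is not None and len(tokens) == 1:
                # A bare tag inside loop_ declares one column.
                loop_tags.append(tokens[0])
                result[tokens[0]] = []
            elif len(tokens) == 2:
                # A plain "_tag value" pair.
                result[tokens[0]] = tokens[1]
                loop_tags = None
        elif loop_tags:
            # A data row: one value per declared column.
            for tag, value in zip(loop_tags, shlex.split(line)):
                result[tag].append(value)
    return result
```

Calling `parse_flat_mmcif(SAMPLE)` gives a dictionary keyed by tag name, with one list per looped column. A generated state-machine parser would be the better choice for speed and for the full syntax, which is the point of the Automa.jl suggestion.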