BioJulia / BioStructures.jl

A Julia package to read, write and manipulate macromolecular structures

Add ProteinStructure constructors from MMCIFDict and MMTFDict #26

Closed marcom closed 4 years ago

marcom commented 4 years ago

This adds two new constructors:

ProteinStructure(::MMCIFDict; kwargs...)
ProteinStructure(::MMTFDict; kwargs...)

These constructors avoid parsing an mmCIF or MMTF file twice when one wants both the dictionary and the ProteinStructure with the coordinates.

This gives a performance benefit (roughly 35% less wall time in the benchmark below) and reduces disk traffic, which helps when file reads are slow, e.g. on a networked file system.

I'm not sure whether adding another constructor is the right way to do this, but this way I didn't have to introduce new function names.

Example

# Example for the mmCIF case
# before:
mmcif_dict = MMCIFDict("path/to/file.cif")
struc = read("path/to/file.cif", MMCIF)

# after:
mmcif_dict = MMCIFDict("path/to/file.cif")
struc = ProteinStructure(mmcif_dict)
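
The MMTF case should look analogous. Below is a minimal sketch, assuming the `ProteinStructure(::MMTFDict)` constructor added in this PR and the existing `MMTFDict(filepath)` reader; the helper name `load_dict_and_structure` is hypothetical and only there for illustration, and the path in the usage comment is a placeholder.

```julia
using BioStructures

# Hypothetical helper (not part of BioStructures): parse an MMTF file once
# and return both the dictionary and the structure built from it.
function load_dict_and_structure(path::AbstractString)
    mmtf_dict = MMTFDict(path)            # single pass over the file
    struc = ProteinStructure(mmtf_dict)   # constructor added in this PR
    return mmtf_dict, struc
end

# Usage (placeholder path):
# mmtf_dict, struc = load_dict_and_structure("path/to/file.mmtf")
```

As in the mmCIF example, the file is read once, and the structure is built from the dictionary already in memory.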

Benchmark on 4v4g (a rather extreme example)

julia> @btime (cif = MMCIFDict("4v4g.cif.gz"; gzip=true); struc = read("4v4g.cif.gz", MMCIF; gzip=true));
  8.893 s (49723548 allocations: 3.70 GiB)

julia> @btime (cif = MMCIFDict("4v4g.cif.gz"; gzip=true); struc = ProteinStructure(cif));
  5.762 s (28749625 allocations: 2.28 GiB)

codecov[bot] commented 4 years ago

Codecov Report

Merging #26 into master will increase coverage by 0.01%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #26      +/-   ##
==========================================
+ Coverage   94.60%   94.61%   +0.01%     
==========================================
  Files           6        6              
  Lines        1520     1524       +4     
==========================================
+ Hits         1438     1442       +4     
  Misses         82       82              

| Impacted Files | Coverage Δ |
|---|---|
| src/mmcif.jl | 98.82% <100.00%> (+<0.01%) :arrow_up: |
| src/mmtf.jl | 99.25% <100.00%> (+0.01%) :arrow_up: |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 996de9a...d11aa37.

jgreener64 commented 4 years ago

Looks good, thanks.