Hello again!
I recently ran into very slow write speeds with a 1.8 GB 2D-stacked tomography MRC file.
I found that the array-writing part is the bottleneck, so I resolved it by using vectorized operations.
For the benchmark, I put @time on every line of the write method and compared the timings against my vectorized code.
function Base.write(io::IO, d::MRCData; compress = :none)
    @time newio = compressstream(io, compress)
    @time h = header(d)
    @time sz = write(newio, h)
    @time sz += write(newio, extendedheader(d))
    @time T = datatype(h)
    @time data = parent(d)
    @time fswap = bswapfromh(h.machst)
    # @time write(newio,(fswap.(T.(data)))) ### this is my vectorized code that will replace the loop below
    @time begin
        # original element-by-element write: one write call per array element
        for i in eachindex(data)
            @inbounds sz += write(newio, fswap(T(data[i])))
        end
    end
    @time close(newio)
    return sz
end
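To illustrate why swapping the loop for the broadcast version should not change the output, here is a minimal sketch that uses only Base Julia. Everything in it is a stand-in and not part of MRC.jl: `write_elementwise` and `write_vectorized` are hypothetical helper names, a random `Int16` array plays the role of `parent(d)`, `Int16` plays the role of `datatype(h)`, `bswap` plays the role of `bswapfromh(h.machst)`, and an `IOBuffer` replaces the file stream.

```julia
# Stand-ins for the values used inside Base.write (assumptions, not MRC.jl API)
data  = rand(Int16, 64, 64, 4)   # plays the role of parent(d)
T     = Int16                    # plays the role of datatype(h)
fswap = bswap                    # plays the role of bswapfromh(h.machst)

# Hypothetical helper: one write call per element, as in the original loop
function write_elementwise(io::IO, data, T, fswap)
    sz = 0
    for i in eachindex(data)
        @inbounds sz += write(io, fswap(T(data[i])))
    end
    return sz
end

# Hypothetical helper: convert and byte-swap the whole array, then write it once
function write_vectorized(io::IO, data, T, fswap)
    return write(io, fswap.(T.(data)))
end

io_loop = IOBuffer()
io_vec  = IOBuffer()
n_loop  = write_elementwise(io_loop, data, T, fswap)
n_vec   = write_vectorized(io_vec, data, T, fswap)

# Collect the raw bytes each approach produced
bytes_loop = take!(io_loop)
bytes_vec  = take!(io_vec)

println("same byte count: ", n_loop == n_vec)
println("same bytes:      ", bytes_loop == bytes_vec)
```

One trade-off to keep in mind: the broadcast `fswap.(T.(data))` builds a temporary array the same size as the data before writing, so the vectorized version trades extra memory for fewer write calls.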
This is the test code:
using MRC
file_name = "HeLa.mrc"
for _ in 1:3
    # read and write
    orig_file = read(file_name, MRCData)
    write("copy_$(file_name)", orig_file)
    println("---- done writing ----")
    # compare orig and copied
    copied_file = read("copy_$(file_name)", MRCData)
    println("writing correctness: $(copied_file == orig_file) \n")
end
Here is the benchmark for the 1.8 GB 2D-stacked tomography MRC file. I list the last of the three repeated test runs:
I have also tested with http://ftp.rcsb.org/pub/emdb/structures/EMD-5778/map/emd_5778.map.gz: