dnanexus-rnd / GLnexus

Scalable gVCF merging and joint variant calling for population sequencing projects
Apache License 2.0

tech debt: unaligned capnproto messages from rocksdb #221

Open mlin opened 4 years ago

mlin commented 4 years ago

BCFKeyValueData reads capnp messages directly out of RocksDB table blocks, which pack values tightly and so don't guarantee word alignment. Unaligned reads have been disabled in capnp 0.8.0 (https://github.com/capnproto/capnproto/pull/977), for what sound like good reasons. Reinvestigate how much to worry about this, based on the links below and the concerns described in the capnp diff. If necessary, memcpy each RocksDB value into an aligned buffer before opening the capnp reader on it. That would be a bummer, but probably not a large cost compared to the binary search and parsing we then undertake within it.

https://github.com/dnanexus-rnd/GLnexus/blob/4d057dcf24b68b33de7a9759ae65ca2b144a3d47/src/BCFKeyValueData.cc#L459-L474
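For reference, a rough sketch of what the memcpy fallback could look like, using only the standard library. The helper names `is_word_aligned` and `copy_to_aligned_words` are made up for illustration and are not existing GLnexus or capnp functions; the sketch also assumes the capnp word size is 8 bytes. The actual fix would presumably wrap the copied buffer as an array of `capnp::word` before handing it to the reader.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of GLnexus or capnp): true if p sits on an
// 8-byte boundary, which is the word size this sketch assumes.
inline bool is_word_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % sizeof(std::uint64_t) == 0;
}

// Hypothetical helper: copy `size` bytes (e.g. the data()/size() of a RocksDB
// value) into a freshly allocated buffer of 64-bit words. The vector's storage
// is aligned for uint64_t, and any tail bytes of the last word are zero-filled.
inline std::vector<std::uint64_t> copy_to_aligned_words(const char* data,
                                                        std::size_t size) {
    assert(data != nullptr || size == 0);
    std::vector<std::uint64_t> words((size + 7) / 8, 0);
    if (size > 0) {
        std::memcpy(words.data(), data, size);
    }
    return words;
}
```

One option would be to check `is_word_aligned` first and copy only when the value happens to be misaligned, so that already-aligned values still avoid the extra memcpy.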