bcgsc / btl_bloomfilter

The BTL C/C++ Common bloom filters for bioinformatics projects, as well as any APIs created for other programming languages.
GNU General Public License v3.0
18 stars 4 forks source link

The magic string used in the code cause issues in gcc8 #7

Closed JustinChu closed 5 years ago

JustinChu commented 5 years ago

The issues are mostly caused by the lack of null termination of when we copy them into the struct that is serialized. Overall it seemed somewhat ugly for the magic string to be used in the first place. What should be done? Thoughts @benvvalk?

benvvalk commented 5 years ago

@JustinChu

Nothing wrong with having a magic string, but you could remove it if you wanted. As far as I understand, it's mostly so that the O/S can auto-detect the file type and open it with an appropriate application (e.g. double-clicking on a file in a file browser).

What is more valuable, in my opinion, is having an integer field for the format version (which we currently don't have). That way we could avoid confusion when our users open old Bloom filter files and nothing works.

It seems simplest to have the format version as a field separate from the magic string. Also, I liked your suggestion of bumping the format version number each time we update the version of ntHash in this repo (btl_bloomfilter). We could do something fancier later if we ever need to use alternative hash functions besides ntHash.

JustinChu commented 5 years ago

The issue was resolved by changing strncpy to memset.