bcgsc / btl_bloomfilter

The BTL C/C++ Common bloom filters for bioinformatics projects, as well as any APIs created for other programming languages.
GNU General Public License v3.0
18 stars 4 forks source link

Add some sort of version check between hash function version changes #6

Closed JustinChu closed 5 years ago

JustinChu commented 5 years ago

@lcoombe actually had another version error thing occur today with my release of BBT 2.2.0 (which uses the new ntHash). She used an old filter that caused an unexpected results.

@benvvalk @TheSeaOfRhun what are your thoughts today about this? Is there anything that I'm overlooking. Also how would this be implimented? Would the we just add an integer as the last field of the header?

benvvalk commented 5 years ago

Yep, it's that simple... just a single integer field in the file header.

For an example, search for "version" in this source file (Konnector Bloom filter IO code):

https://github.com/bcgsc/abyss/blob/d4b4b5d3091d90a4967180d987bd7168dbf04585/Bloom/Bloom.h

The version check is implemented in the readHeader method:

https://github.com/bcgsc/abyss/blob/d4b4b5d3091d90a4967180d987bd7168dbf04585/Bloom/Bloom.h#L150-L164

JustinChu commented 5 years ago

Seening as it was present in a prototype version it makes me wonder why we didn't add it in the first place. I'll update BloomFilter.hpp and MIBloomFilter.hpp to have these starting at version 1.

@TheSeaOfRhun Can you add something similar to the counting bloom filter afterwards?

benvvalk commented 5 years ago

Thanks, @JustinChu!

JustinChu commented 5 years ago

@jowong4 Does the new counting bloom filter have a version number in the header? If so we can close this ticket.

jwcodee commented 5 years ago

No. Not at the moment. I will add add a version number variable to it soon, after I finish bug fixing the read and write filter functions

jwcodee commented 5 years ago

15

The above pull request includes version number variable in the header which is currently set to 2.

jwcodee commented 5 years ago

version is check now part of the magic header string