bcgsc / btl_bloomfilter

The BTL C/C++ Common bloom filters for bioinformatics projects, as well as any APIs created for other programming languages.
GNU General Public License v3.0
18 stars 4 forks source link

Bit Vector Implementation #24

Closed jwcodee closed 5 years ago

jwcodee commented 5 years ago

This is not done yet. Right now it is only a 2 bit implementation. I would like some review on my code, and if you have suggestions for what other unit tests to add Things to do:

vlad0x00 commented 5 years ago

For unit tests, I could recommend adding one that tests for no presence. Since you might get false positives, ideally, this test would check a number of non-present elements and find no more than ~FPR hits (the limit should be somewhat larger than FPR for rare occurrences, but not too much).

jwcodee commented 5 years ago

For unit tests, I could recommend adding one that tests for no presence. Since you might get false positives, ideally, this test would check a number of non-present elements and find no more than ~FPR hits (the limit should be somewhat larger than FPR for rare occurrences, but not too much).

This is just the unit test for the bit vector itself not the Counting Bloom Filter using the BitVector class so I don't think there is any false positives involved.

vlad0x00 commented 5 years ago

For unit tests, I could recommend adding one that tests for no presence. Since you might get false positives, ideally, this test would check a number of non-present elements and find no more than ~FPR hits (the limit should be somewhat larger than FPR for rare occurrences, but not too much).

This is just the unit test for the bit vector itself not the Counting Bloom Filter using the BitVector class so I don't think there is any false positives involved.

Ah, I see. In that case, adding a no-presence test would be useful. From what I see, there are only tests that verify the presence of added elements.

jwcodee commented 5 years ago

For unit tests, I could recommend adding one that tests for no presence. Since you might get false positives, ideally, this test would check a number of non-present elements and find no more than ~FPR hits (the limit should be somewhat larger than FPR for rare occurrences, but not too much).

This is just the unit test for the bit vector itself not the Counting Bloom Filter using the BitVector class so I don't think there is any false positives involved.

Ah, I see. In that case, adding a no-presence test would be useful. From what I see, there are only tests that verify the presence of added elements.

You mean something like iterate through the vector and make sure bits that I didn't set are all 0s?

vlad0x00 commented 5 years ago

For unit tests, I could recommend adding one that tests for no presence. Since you might get false positives, ideally, this test would check a number of non-present elements and find no more than ~FPR hits (the limit should be somewhat larger than FPR for rare occurrences, but not too much).

This is just the unit test for the bit vector itself not the Counting Bloom Filter using the BitVector class so I don't think there is any false positives involved.

Ah, I see. In that case, adding a no-presence test would be useful. From what I see, there are only tests that verify the presence of added elements.

You mean something like iterate through the vector and make sure bits that I didn't set are all 0s?

Yep.

sjackman commented 5 years ago

Drop T altogether and just use uint64_t

It makes sense to drop it as a template parameter, but it should be kept as an internal private typedef.

jwcodee commented 5 years ago

Drop T altogether and just use uint64_t

It makes sense to drop it as a template parameter, but it should be kept as an internal private typedef.

You mean typedef uint64_t T?

sjackman commented 5 years ago

What's the status of this PR? Any work left to do, or is it good to go?

jwcodee commented 5 years ago

What's the status of this PR? Any work left to do, or is it good to go?

It's done in that I think I have everything I need to implement the CountingBloomFilter with the BitVector but it's not done in that I don't have all the std::vector API set up. I can always add that later though.

sjackman commented 5 years ago

I don't have all the std::vector API set up

You need only implement as much of the std::vector API as is necessary for ABySS. No need to implement the entire API, and please don't.