shawnbrown / datatest

Tools for test driven data-wrangling and data validation.

Optimize assertSubjectUnique() method. #13

Closed shawnbrown closed 5 years ago

shawnbrown commented 8 years ago

The assertDataUnique() method operates in-memory without optimizations (see issue #9).

As mentioned earlier (see comment), explore using a Bloom filter so that uniqueness tests can handle data sets that are larger than RAM.
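To make the idea concrete, here is a rough sketch of how a Bloom filter could support a uniqueness check. It is not datatest code: `BloomFilter`, `find_duplicates`, and the `make_iter` argument are hypothetical names made up for illustration. A Bloom filter never misses a value it has seen but can report false positives, so the sketch assumes the data can be read twice: the first pass collects "possibly repeated" candidates, and the second pass counts only those candidates exactly. It also assumes each value has a stable `repr()` to hash.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size probabilistic set: no false negatives, some false positives."""

    def __init__(self, expected_items, error_rate=0.01):
        # Standard sizing formulas for the bit count (m) and hash count (k).
        m = -expected_items * math.log(error_rate) / (math.log(2) ** 2)
        self.num_bits = max(8, math.ceil(m))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, value):
        # Derive k bit positions from two halves of one digest (double hashing).
        digest = hashlib.sha256(repr(value).encode('utf-8')).digest()
        h1 = int.from_bytes(digest[:8], 'big')
        h2 = int.from_bytes(digest[8:16], 'big') | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, value):
        """Add value; return True if it was possibly present already."""
        seen = True
        for pos in self._positions(value):
            byte, mask = pos >> 3, 1 << (pos & 7)
            if not self.bits[byte] & mask:
                seen = False
                self.bits[byte] |= mask
        return seen


def find_duplicates(make_iter, expected_items, error_rate=0.01):
    """Return the set of values that occur more than once.

    make_iter is a zero-argument callable that returns a fresh iterator
    over the data each time (hypothetical; two passes are required).
    """
    bloom = BloomFilter(expected_items, error_rate)

    # Pass 1: keep only values the filter claims to have seen before.
    candidates = set()
    for value in make_iter():
        if bloom.add(value):
            candidates.add(value)

    # Pass 2: count candidates exactly to discard false positives.
    counts = dict.fromkeys(candidates, 0)
    for value in make_iter():
        if value in counts:
            counts[value] += 1
    return {value for value, n in counts.items() if n > 1}
```

For example, `find_duplicates(lambda: iter(['a', 'b', 'c', 'a', 'd', 'b']), expected_items=6)` returns the set containing `'a'` and `'b'`. Memory use is bounded by the filter size plus the candidate set, so this only helps when duplicates (and false positives) are a small fraction of the data.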

shawnbrown commented 5 years ago

This has been resolved with the addition of the RequiredUnique class: f622b0e9