usnistgov / SP800-90B_EntropyAssessment

The SP800-90B_EntropyAssessment C++ package implements the min-entropy assessment methods included in Special Publication 800-90B.

Large dataset causing core dump error on Tuple Estimates #228

Closed: epelofske-LANL closed 9 months ago

epelofske-LANL commented 9 months ago

When running the non-IID test on a very large binary data file (2.4 GB), the tests run until the Tuple Estimates, where the program aborts with a core dump:

Running Tuple Estimates...
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

joshuaehill commented 9 months ago

The error that you are seeing suggests that you are running out of memory. The Tuple Estimates can require a substantial amount of memory (on the order of 25 bytes per symbol in the data set). Remember that the bitstring evaluation expands each input sample into its individual bits, so it works on a sequence with up to eight times as many symbols as your input data set. For a data set of this size, the worst case (8-bit-wide data) therefore needs more than 481 GB of RAM.

If this is indeed what is going on, the fix is "don't do that".
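
The estimate in the comment above can be reproduced with a short calculation. The following is a minimal sketch, not code from the package: the helper name `estimate_tuple_memory_bytes` and the constant `kBytesPerSymbol` are hypothetical, and it assumes that the "on the order of 25 times the number of symbols" rule of thumb is measured in bytes and that the 2.4 GB file holds one sample per byte.

```cpp
#include <cstdint>

// Assumption taken from the comment above: the Tuple Estimates need
// roughly 25 bytes of working memory per symbol (a rule of thumb only).
constexpr std::uint64_t kBytesPerSymbol = 25;

// Hypothetical helper: rough memory estimate for the bitstring evaluation.
// Each input sample is expanded into bits_per_sample one-bit symbols,
// and each of those symbols is charged kBytesPerSymbol bytes.
constexpr std::uint64_t estimate_tuple_memory_bytes(std::uint64_t input_samples,
                                                    unsigned bits_per_sample) {
    return input_samples * bits_per_sample * kBytesPerSymbol;
}

// 2.4e9 one-byte samples, 8 bits each: 2.4e9 * 8 * 25 = 4.8e11 bytes.
static_assert(estimate_tuple_memory_bytes(2400000000ULL, 8) == 480000000000ULL,
              "worst-case estimate for a 2.4 GB, 8-bit-wide data set");
```

For 2.4 billion one-byte samples this comes to about 480 GB, in line with the figure quoted above. The real requirement depends on the data and on the implementation, so treat the result as an order-of-magnitude check before starting a run rather than an exact bound.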

epelofske-LANL commented 9 months ago

Ah, makes sense. I was monitoring RAM consumption and it appeared that it had not hit saturation, so I thought this could be an unintended issue.