usnistgov / SP800-90B_EntropyAssessment

The SP800-90B_EntropyAssessment C++ package implements the min-entropy assessment methods included in Special Publication 800-90B.

Why does the restart_main have different results in different systems #212

Closed MenolGone closed 1 year ago

MenolGone commented 1 year ago

I ran the code on Windows 10 and CentOS 7.6, testing the same file (1 bit per sample) several times, and I get different X_cutoff values on average: the X_cutoff on Windows is 631 and on CentOS it is 608, which is confusing. The only difference between them is that I changed the way the seed() function generates random numbers in utils.h. The H_I is 0.89267708323411438.

joshuaehill commented 1 year ago

A few notes that may be useful:

  1. One does expect some variation in the X_cutoff, because it is the result of a stochastic process. It is intended to be tested in a way that makes the X_cutoff value reasonably stable, but if the actual result is far into the tail, it may require more simulation rounds to get a good estimate of the correct value. The sort of variation that you are reporting is well outside what I would expect, so I'd guess that something else is going on.
  2. If you are working with a 1 million sample set, then 0.89 is a rather high assessment (the median assessment for pseudorandom data is on the order of 0.85 with that sample size), so you might expect a substantial range in X_cutoff, just because of how the statistics work. If this is the case, you might consider using a different dataset from the same noise source to see what happens.
  3. I don't test on the Windows platform, and that platform isn't an officially supported platform. So far as I know, there are only anecdotal reports of the tool working on that platform, and I'm not aware of anyone who has seriously vetted this tool on the Windows platform. This may sound like an empty statement, but it's important to realize that the codebase for this tool commonly presumes that a "long int" is a 64-bit value. Windows is an LLP64 environment (rather than an LP64 environment), so any use of "long" in the code resolves to a 32-bit type rather than a 64-bit type. This can have surprising results, and may well yield situations where the code produces incorrect results in Windows. In order to make this tool work in Windows, most uses of the long int types should be replaced with int64_t (or uint64_t for unsigned long int types). Without doing that, I'm not certain that the tool will work correctly in that environment.
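
To make the LLP64 point in item 3 concrete, here is a minimal sketch of how the same arithmetic can give different answers depending on the width of "long". It is not taken from the tool's codebase: the names `llp64_ulong`, `lp64_ulong`, `cell_count`, and `demo_wraparound` are hypothetical, and the fixed-width aliases only stand in for how `unsigned long` behaves on each platform, so that the effect can be reproduced on any system. Unsigned types are used because unsigned overflow wraps in a well-defined way, whereas signed overflow is undefined behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Assumption for illustration: 32-bit stand-in for `unsigned long`
// as it behaves on LLP64 (64-bit Windows).
using llp64_ulong = std::uint32_t;
// Assumption for illustration: 64-bit stand-in for `unsigned long`
// as it behaves on LP64 (64-bit Linux).
using lp64_ulong = std::uint64_t;

// Hypothetical helper: number of cells in a rows x cols matrix,
// computed entirely in the integer type T.
template <typename T>
T cell_count(T rows, T cols) {
    return rows * cols;
}

// Shows the same product computed at both widths.
void demo_wraparound() {
    // 100000 * 100000 = 10^10, which does not fit in 32 bits.
    const lp64_ulong wide = cell_count<lp64_ulong>(100000, 100000);
    const llp64_ulong narrow = cell_count<llp64_ulong>(100000, 100000);

    // The 64-bit result is exact; the 32-bit result is 10^10 mod 2^32.
    assert(wide == 10000000000ULL);
    assert(narrow == 1410065408U);

    std::printf("sizeof(long) on this platform: %zu bytes\n", sizeof(long));
    std::printf("64-bit result: %llu\n", static_cast<unsigned long long>(wide));
    std::printf("32-bit result: %lu\n", static_cast<unsigned long>(narrow));
}
```

Calling `demo_wraparound()` from a `main` function prints the native size of `long` along with both results. Replacing `long` with `int64_t` (or `unsigned long` with `uint64_t`) pins the width, so the arithmetic behaves the same on both platforms.
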
MenolGone commented 1 year ago

Thanks a lot