How much data should be fed into entropy assessment tools for an accurate min-entropy estimate?

This tool is an implementation of the estimators in NIST SP 800-90B. Its output is difficult to interpret without first reading that document to understand what the tool is intended to accomplish.

SP 800-90B (in Section 3.1.1) specifies that the sample size ($L$) should be at least 1 million samples. It also requires that this data be the "raw" output of the noise source (in AIS-31 terms, roughly somewhere in the range between "das random numbers" and "raw random numbers").

Most of the estimators include some sort of confidence interval calculation whose width is roughly proportional to $1/\sqrt{L}$. Because the estimators report the conservative end of that interval, larger samples (barring some observed defect) are likely to yield estimates that are both numerically larger and more stable across independent tests.
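To make the $1/\sqrt{L}$ effect concrete, here is a minimal sketch of the most common value estimate as described in SP 800-90B Section 6.3.1, applied to increasingly long runs of uniform bytes. The function name `mcv_min_entropy` and the use of Python's `random` module as a stand-in data source are illustrative assumptions, not part of this tool.

```python
import math
import random
from collections import Counter

Z_99 = 2.576  # normal quantile used for the 99% confidence bound


def mcv_min_entropy(samples):
    """Most common value estimate: min-entropy per sample, in bits."""
    L = len(samples)
    # Observed frequency of the most common symbol.
    p_hat = Counter(samples).most_common(1)[0][1] / L
    # Upper confidence bound on that probability; the added term
    # shrinks roughly like 1/sqrt(L).
    p_u = min(1.0, p_hat + Z_99 * math.sqrt(p_hat * (1.0 - p_hat) / (L - 1)))
    return -math.log2(p_u)


if __name__ == "__main__":
    rng = random.Random(12345)  # stand-in data only, NOT a real noise source
    for L in (10_000, 100_000, 1_000_000):
        data = bytes(rng.getrandbits(8) for _ in range(L))
        print(f"L={L:>9}: {mcv_min_entropy(data):.4f} bits per 8-bit sample")
```

With ideal 8-bit data the reported value should creep toward 8 bits as $L$ grows, since the penalty added to the observed frequency gets smaller.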

This tool cannot (indeed, it is not theoretically possible for a tool to) reliably estimate the min-entropy for all noise sources. For example, imagine statistically assessing almost any reasonable PRNG: its output looks statistically ideal, yet it contains no more entropy than its seed.
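A rough illustration of the PRNG point: the sketch below draws bytes from Python's Mersenne Twister, seeded from a hypothetical 16-bit seed space, and runs a simplified most common value estimate over the output. The names `SEED_BITS`, `prng_output`, and `mcv_min_entropy` are made up for this example, and the estimator is a single simplified one rather than the full SP 800-90B suite.

```python
import math
import random
from collections import Counter

SEED_BITS = 16  # hypothetical: an attacker only has to try 2**16 seeds


def mcv_min_entropy(samples):
    """Simplified most common value estimate, in bits per sample."""
    L = len(samples)
    p_hat = Counter(samples).most_common(1)[0][1] / L
    p_u = min(1.0, p_hat + 2.576 * math.sqrt(p_hat * (1.0 - p_hat) / (L - 1)))
    return -math.log2(p_u)


def prng_output(seed, n_bytes):
    """Deterministic byte stream from Python's Mersenne Twister."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n_bytes))


if __name__ == "__main__":
    L = 1_000_000
    data = prng_output(seed=0x1234, n_bytes=L)
    per_sample = mcv_min_entropy(data)
    print(f"statistical estimate: {per_sample:.3f} bits/byte, "
          f"about {per_sample * L:.0f} bits for the whole sample")
    print(f"actual upper bound:   {SEED_BITS} bits in total (the seed)")
```

The statistical estimate credits the stream with millions of bits, while anyone who knows the design can reproduce it by guessing the seed.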

In SP 800-90B, any estimate for min-entropy must be based on an understanding of the system producing the numbers (i.e., black-box entropy estimation isn't, in general, possible). This design-based assessment is integrated as the $H_{\text{submitter}}$ value.
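As a sketch of how that integration works, SP 800-90B Section 3.1.3 (worth checking against the text) takes the initial entropy estimate as the minimum of the statistical results and the submitter's design-based claim. The function and parameter names below are illustrative assumptions, not identifiers from this tool.

```python
def initial_entropy_estimate(h_original, h_submitter,
                             h_bitstring=None, bits_per_sample=1):
    """Combine statistical and design-based estimates (bits per sample).

    h_original  : estimate from the estimators run on the raw samples
    h_submitter : design-based estimate supplied by the submitter
    h_bitstring : optional per-bit estimate from the bitstring form of
                  the data, used for non-binary sources
    """
    candidates = [h_original, h_submitter]
    if h_bitstring is not None:
        # Scale the per-bit estimate up to a per-sample figure.
        candidates.append(bits_per_sample * h_bitstring)
    # The most conservative value wins.
    return min(candidates)


if __name__ == "__main__":
    # A statistically ideal-looking source is still capped by the design claim.
    print(initial_entropy_estimate(h_original=7.6, h_submitter=4.0))
```

The statistical estimators can therefore only lower the claimed entropy, never raise it above what the design analysis supports.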

usnistgov / SP800-90B_EntropyAssessment
