MEDIUM_LINE_BYTES is currently hardcoded in const.h, with a value of 8.
The hashmap and chunks are then sized so that, if the actual average line length is MEDIUM_LINE_BYTES, the hashmap is filled to the ratio defined by HMAP_LOAD_FACTOR (currently set to 0.5, for 50% hmap filling).
Since the real average line length depends on the wordlist, we could read a few pages of the file (e.g. at the start, middle, and end of the file) and get a better estimate of MEDIUM_LINE_BYTES from there.
This would greatly improve performance on wordlists with many very long lines (for example, a list of md5 hashes).
If lines are 32 bytes long, the hmap is only 12.5% filled (50%/2/2, because lines are 4 times longer than assumed), and many more chunks are needed.
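
A minimal sketch of the sampling idea, to make the proposal concrete. It is not existing project code: `SAMPLE_PAGE_BYTES`, `sample_page()`, `estimate_line_bytes()` and `expected_fill()` are hypothetical names, and it assumes lines end with `\n` and that the estimate counts the newline byte. It reads one page at the start, middle, and end of the file and divides the bytes read by the newlines seen.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sample size; not a constant from const.h. */
#define SAMPLE_PAGE_BYTES 4096

/* Read one page at `offset` and accumulate byte and newline counts. */
static int sample_page(FILE *fp, long offset, size_t *bytes, size_t *lines)
{
    char buf[SAMPLE_PAGE_BYTES];
    size_t n, i;

    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    n = fread(buf, 1, sizeof(buf), fp);
    for (i = 0; i < n; i++)
        if (buf[i] == '\n')
            (*lines)++;
    *bytes += n;
    return 0;
}

/*
 * Estimate the average line size (newline included) from three pages:
 * start, middle and end of the file. Returns `fallback` (e.g. the
 * current hardcoded value) if a seek fails or no newline is found.
 */
double estimate_line_bytes(FILE *fp, long file_size, double fallback)
{
    long offsets[3];
    size_t bytes = 0, lines = 0;
    int i;

    assert(file_size >= 0);
    offsets[0] = 0;
    offsets[1] = file_size / 2;
    offsets[2] = file_size > SAMPLE_PAGE_BYTES
        ? file_size - SAMPLE_PAGE_BYTES : 0;

    for (i = 0; i < 3; i++)
        if (sample_page(fp, offsets[i], &bytes, &lines) != 0)
            return fallback;
    if (lines == 0)
        return fallback;
    return (double)bytes / (double)lines;
}

/* Fill ratio reached when lines are `actual` bytes but `assumed` was used. */
double expected_fill(double load_factor, double assumed, double actual)
{
    return load_factor * assumed / actual;
}
```

With the values from this issue, `expected_fill(0.5, 8.0, 32.0)` gives 0.125, the 12.5% filling described above. Three fixed pages keep the cost to a few reads, but the estimate can still be off on files whose line lengths vary a lot between regions, so sampling more pages may be worth considering.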