vt-middleware / passay

Password policy enforcement for Java.
http://www.passay.org

Add MemoryMappedFileWordList. #48

Closed: dfish3r closed this 7 years ago

dfish3r commented 7 years ago

Improve FileWordList by using BufferedReader in #readFile. Change the meaning of cachePercent to apply to file size rather than number of lines. (Its meaning was never well defined.) This allows the cache to be built inline while reading the file, removing the need to read the file twice. This will generally result in larger caches, but users can tune the cache size down if that is an issue.
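
For illustration only, here is a minimal sketch of how a cache could be built in a single pass once its size is derived from the file size. It is not the passay implementation: the class `InlineCacheSketch`, the method `buildCache`, the 8-byte cost per cache entry, and the assumption of UTF-8 text with `\n` line endings are all hypothetical choices made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class InlineCacheSketch {

  /** Assumed cost of one cache entry: a single long holding a byte offset. */
  static final int ENTRY_BYTES = Long.BYTES;

  /**
   * Reads the file once and returns the byte offsets of the lines selected for the cache.
   * The cache budget is cachePercent of the file size, so it is known before reading starts.
   */
  static List<Long> buildCache(Path file, int cachePercent) throws IOException {
    List<Long> offsets = new ArrayList<>();
    long fileSize = Files.size(file);
    long maxEntries = (fileSize * cachePercent / 100) / ENTRY_BYTES;
    if (maxEntries == 0) {
      return offsets; // a 0% cache (or a tiny file) stores nothing
    }
    // Spread the entries evenly across the file by byte position.
    long step = Math.max(1, fileSize / maxEntries);
    long position = 0;
    long nextThreshold = 0;
    try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
      String line;
      while ((line = reader.readLine()) != null) {
        if (position >= nextThreshold) {
          offsets.add(position);
          nextThreshold = position + step;
        }
        // Assumes a single '\n' terminator per line.
        position += line.getBytes(StandardCharsets.UTF_8).length + 1;
      }
    }
    return offsets;
  }
}
```

Because the budget depends only on the file size, no preliminary pass is needed to count lines before deciding which ones to cache.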

dfish3r commented 7 years ago

See #47

dfish3r commented 7 years ago

Some statistics for a 672MB file with 1G max heap Java process:

| WordList | Cache Percent | Init Time | Search Time | Heap Size |
| --- | --- | --- | --- | --- |
| FileWordList | 0% | 18s | 72s | 224MB |
| FileWordList | 1% | 19s | 2.9ms | 367MB |
| FileWordList | 5% | 20s | 1.3ms | 519MB |
| FileWordList | 10% | 21s | 0.9ms | 625MB |
| FileWordList | 15% | 34s | 1.1ms | 822MB |
| MemoryMappedFileWordList | 0% | 5s | 43s | 224MB |
| MemoryMappedFileWordList | 1% | 6s | 2.0ms | 367MB |
| MemoryMappedFileWordList | 5% | 7s | 0.7ms | 514MB |
| MemoryMappedFileWordList | 10% | 8s | 0.5ms | 619MB |
| MemoryMappedFileWordList | 15% | 21s | 0.4ms | 819MB |

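
For readers unfamiliar with memory mapping in Java, the following is a rough sketch of the general technique using `FileChannel.map`. It does not reproduce the MemoryMappedFileWordList code: the class `MappedWordsSketch` and its methods are hypothetical, and the sketch assumes UTF-8 text with `\n` line endings. A single mapping is limited to `Integer.MAX_VALUE` bytes, which a 672MB file fits within.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedWordsSketch {

  /** Maps the whole file read-only; the mapping stays valid after the channel is closed. */
  static MappedByteBuffer map(Path file) throws IOException {
    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
      return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
    }
  }

  /** Returns the word that starts at the given byte offset and ends before the next '\n'. */
  static String wordAt(MappedByteBuffer buffer, int offset) {
    int end = offset;
    while (end < buffer.limit() && buffer.get(end) != '\n') {
      end++;
    }
    byte[] bytes = new byte[end - offset];
    for (int i = 0; i < bytes.length; i++) {
      bytes[i] = buffer.get(offset + i); // absolute get, does not move the buffer position
    }
    return new String(bytes, StandardCharsets.UTF_8);
  }
}
```

With a mapped buffer, a word at a known offset is read directly from the mapped region rather than through explicit seek and read calls on the file.
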
dfish3r commented 7 years ago

@serac may have an outstanding issue with Unicode characters. A different PR can be used if any further changes are needed.