MichaelGrafnetter / DSInternals

Directory Services Internals (DSInternals) PowerShell Module and Framework
https://www.dsinternals.com
MIT License

Bloomfilter for Test-PasswordQuality #146

Open PatchRequest opened 1 year ago

PatchRequest commented 1 year ago

For: Test-PasswordQuality

Instead of passing a 30 GB file with all hashes, a Bloom filter could be created from it and used for the lookups. That would reduce the file size to around 3 GB and would be much faster and more efficient.

I could implement such a feature. Would you be interested?
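For readers unfamiliar with the data structure, here is a minimal in-memory sketch of a Bloom filter in Python. It is an illustration only, not the DSInternals implementation: the class name `BloomFilter`, the SHA-256 double-hashing scheme, and the constructor parameters are all assumptions made for this example.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter; names and hashing scheme are assumptions."""

    def __init__(self, expected_items: int, false_positive_rate: float):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(8, math.ceil(
            -expected_items * math.log(false_positive_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Double hashing: derive k bit positions from two halves of one SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd, so the stride is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        # False means "definitely not added"; True means "possibly added".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

An item that was added is always reported as present; an item that was not added is reported as present only with roughly the configured false positive probability.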

MichaelGrafnetter commented 1 year ago

Hi @PatchRequest, that sounds like a good idea! What would be the expected search time for 1 hash and for 10K hashes, compared to the current binary search approach? What would the false positive rate be, and should it be dealt with? What new parameter name of the Test-PasswordQuality cmdlet do you propose for this feature? -BlomFilterPath? And how would you like to name a cmdlet that would do the conversion? ConvertTo-BloomFilter?

aseigler commented 1 year ago

If time were an issue, I could see this being helpful as a sort of pre-filter. A Bloom filter would be much faster at returning "not in set", and if it returned "possibly in set", a follow-up lookup in the larger database would drop the FPR to zero. That's probably how I'd approach it in my use case. I'd be interested in testing how much faster I could run this scenario against my dataset.
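The pre-filter idea described above can be sketched as a two-stage lookup. This is a hedged illustration, not code from DSInternals: `is_known_hash` is a hypothetical helper, the sample hashes are made up, and `PrefixFilter` is a deliberately crude stand-in that only mimics the relevant Bloom filter property (false positives are possible, false negatives are not).

```python
import bisect


class PrefixFilter:
    """Stand-in for a Bloom filter: may wrongly say "maybe", never wrongly says "no"."""

    def __init__(self, hashes):
        self.prefixes = {h[:2] for h in hashes}

    def __contains__(self, item):
        return item[:2] in self.prefixes


def is_known_hash(candidate, prefilter, sorted_hashes):
    """Cheap negative check first, exact confirmation only on a "maybe"."""
    if candidate not in prefilter:
        return False  # no false negatives, so this answer is final
    # "Possibly in set": confirm with binary search over the full sorted list.
    i = bisect.bisect_left(sorted_hashes, candidate)
    return i < len(sorted_hashes) and sorted_hashes[i] == candidate


# Made-up 16-byte values standing in for NT hashes.
known = sorted(bytes.fromhex(h) for h in (
    "00112233445566778899aabbccddeeff",
    "a1b2c3d4e5f60718293a4b5c6d7e8f90",
    "ffeeddccbbaa99887766554433221100",
))
prefilter = PrefixFilter(known)
```

With this arrangement only the candidates that pass the pre-filter pay for the exact lookup, and the combined result has no false positives.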

PatchRequest commented 1 year ago

I would treat speed as a secondary argument. I think size is more interesting, because with a Bloom filter the "bad password list" can fit on any USB stick with a false positive rate of 0.001%.
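As a rough sanity check of the size claim, the standard Bloom filter sizing formulas can be evaluated directly. The figure of 850 million hashes is an assumption made for this example (the thread only gives file sizes), and `bloom_size` is a hypothetical helper name.

```python
import math


def bloom_size(num_items: int, false_positive_rate: float):
    """Return (bits, hash_functions) for an optimally sized Bloom filter."""
    # m = -n * ln(p) / (ln 2)^2
    bits = math.ceil(-num_items * math.log(false_positive_rate) / math.log(2) ** 2)
    # k = (m / n) * ln 2
    hashes = round(bits / num_items * math.log(2))
    return bits, hashes


# Assumed list size: about 850 million hashes; target rate 0.001% = 0.00001.
bits, hashes = bloom_size(850_000_000, 0.00001)
print(f"{bits / 8 / 1e9:.2f} GB with {hashes} hash functions")
```

Under these assumptions the filter needs about 24 bits per hash, which comes to roughly 2.5 GB and is in line with the "around 3 GB" estimate.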

The nice thing when benchmarking Bloom filters is that lookups scale with O(1) in the size of the list, while binary search is O(log N). Therefore, the bigger the password list is, the more efficient the Bloom filter becomes relative to binary search, which is a win-win situation.

I think a parameter called -BlomFilterPath is a good idea, and the cmdlet name for creating it sounds good too. The only thing I would add is to provide a ready-made Bloom filter for Have I Been Pwned via GitHub LFS, so the bad password check is just a git clone -> download 3 GB -> let's go.
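To make the scaling argument concrete, the sketch below counts worst-case probes per lookup for both approaches. The list sizes are assumptions chosen for illustration, and both function names are hypothetical. Strictly speaking, a Bloom filter lookup costs one probe per hash function, which depends on the target false positive rate but not on the list size.

```python
import math


def binary_search_probes(num_items: int) -> int:
    # Worst-case comparisons for binary search: floor(log2(n)) + 1.
    return num_items.bit_length()


def bloom_probes(false_positive_rate: float) -> int:
    # Optimal number of hash functions is about log2(1 / p), independent of n.
    return math.ceil(math.log2(1 / false_positive_rate))


for n in (1_000_000, 100_000_000, 850_000_000):
    print(n, binary_search_probes(n), bloom_probes(0.00001))
```

For these sizes binary search needs 20, 27, and 30 probes in the worst case, while the Bloom filter stays at 17 bit probes regardless of list size. How that translates into wall-clock time depends on whether the probes hit memory or disk, which this sketch does not model.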

MichaelGrafnetter commented 1 year ago

Sounds great. Regarding Git LFS, I am a newbie here. I am having issues with it and constantly getting "download quota exceeded" errors.

I used to store sample databases with Git LFS, which was not a good idea. I am considering doing a cleanup, uploading my test ntds.dit files (several GBs) to Azure Blob Storage, and integrating their download into the unit test runner.

PatchRequest commented 1 year ago

Hmm, an alternative could be to host it somewhere else where there is no quota :/

PatchRequest commented 1 year ago

But anyway, I will start developing the features.