JacksonVD / PwnedPasswordsDLL

Open source solution to check prospective AD passwords against previously breached passwords
https://jacksonvd.com/
82 stars 17 forks source link

Total passwords calculated incorrectly #4

Open Markusnl opened 5 years ago

Markusnl commented 5 years ago

I think the totalPasswords integer is calculated wrongly. Its currently calculated as

// Get total passwords by dividing by SHA length + 2 - 1 int totalPasswords = (bytes / 42) - 1;

Makes sense, hashes are formatted like this. 000000005AD76BD555C1D6D771DE417A4B87E4B4:4 (first hash in SHA1 V5 ordered by hash HIBP dataset)

So i'm guessing the "+2" are for the ':' and the x amount of times a password has been seen. Herein lies the problem however, many hashes also have a 'x' > 9.

For example 00000000DD7F2A1C68A35673713783CA390C9E93:630 (third hash in SHA1 V5 ordered by hash HIBP dataset)

Over the many millions of passwords, this adds up quite quickly. According to the PwnedPasswordsDLL, there are 578 million passwords in the dataset, whilst there are only 551 million.

Im not sure yet what problems this could cause, but unlikely to be desirable.

Markusnl commented 5 years ago

Alright, I've gotten some testing in.

The totalPasswords are miscalculated but this does not affect the functioning of the program. The additional passwords are duplicates from adjacent ones which is not an issue for the binary search algorithm. I don't see this being much worse than potentially adding an additional lookup here and there.

Passwords with x > 9 are handled by the getRecord function, which may just be a happy coincidence

First you set the stream to the required position inFile.seekg(((long long)(position) * 42), inFile.beg);

And then skip all characters till a newline is found inFile.ignore((std::numeric_limits<std::streamsize>::max)(), '\n');

The latter here readjusts the position of the stream to the start of the next line. Fixing any x > 9, but also disallowing ever getting the first line of a file. Although you've already covered that case.

     // For whatever reason, the first line is skipped. This should stop that.
    if (upper == -1)
    {
        file.close();
        std::ifstream file(files[ii].c_str(), std::ios::binary);
        char buffer[41] = "";

        // Read 40 chararacters into the buffer string
        file.getline(buffer, 41);

        std::string buff(buffer);

        if (buff.compare(searchValue) == 0)
        {
            passwordNotFound = FALSE;
        }
    }

All in all the DLL works as intended. I'm not sure if this issue warrants a fix, as from the looks of it, it will require extensive re-engineering. Any thoughts?