HaveIBeenPwned / PwnedPasswordsDownloader

A tool to download all Pwned Passwords hash ranges and save them offline so they can be used without a dependency on the k-anonymity API
BSD 3-Clause "New" or "Revised" License

Can I see somewhere, if there is a new version? #29

Open jobifis opened 1 year ago

jobifis commented 1 year ago

Hi, is there somewhere I can check whether a new version of the password database is available before I download 30 GB or more?

cnseubert commented 1 year ago

I would also like to be able to detect that new passwords were added before re-downloading and re-importing the entire 30 GB blob of data. I suppose it's likely that new data is added frequently. However, if we could access this corpus with something like rsync, which would pull down only the new data instead of the whole shebang, that would certainly save time.

cnseubert commented 1 year ago

You have closed this one without answering it. Issue #31 is also closed and unanswered, even though you've marked it as a duplicate of this one. So is there no clear answer to this issue?

FreifunkerEZ commented 1 year ago

This issue seems open to me.

jobifis commented 1 year ago

This issue also seems to be open to me.


stebet commented 1 year ago

We don't currently have incremental updates. Hash ranges do return ETags but we haven't added support for that to the downloader yet.

eizedev commented 1 year ago

We are in the middle of migrating our download tasks from the old single-file Cloudflare downloads (7z) to the API using this downloader. Thanks for that, great work!! As @FreifunkerEZ already mentioned in #31, we are currently using the Last-Modified timestamp from the response header when requesting a specific range.

We would also be very happy to receive incremental updates.

@stebet or @troyhunt My question is: if a hash is added anywhere, will the Last-Modified timestamp also be updated on all other ranges, or only on the range that changed?

For example with powershell:

# specific range
$Url = 'https://api.pwnedpasswords.com/range/9A674'
# example date 30 days ago
$DBLastModifiedDate = (Get-Date).AddDays(-30)
$Request = Invoke-WebRequest $Url
# get last-modified timestamp from response header as powershell datetime
$RemoteLastModifiedDate = [datetime][string]$Request.Headers.'Last-Modified'

# if last-modified timestamp is newer than the specified date start a new download
if ($RemoteLastModifiedDate -gt $DBLastModifiedDate) {
    # ... start a new download here
}

Thanks for your help

stebet commented 1 year ago

It should only update on the affected range. We also have ETags that you can use to detect changes.
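Since the timestamp only moves on the affected range, the comparison has to be made per range. A minimal Python sketch of that check, assuming the header is a standard HTTP date string; `range_is_stale` and its parameters are hypothetical names for illustration, not part of the downloader:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def range_is_stale(last_modified_header: str, local_copy_time: datetime) -> bool:
    """Return True if the server's copy of one range is newer than the local copy.

    last_modified_header: raw Last-Modified value for that range,
        e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    local_copy_time: timezone-aware time at which the local copy was saved.
    """
    remote_time = parsedate_to_datetime(last_modified_header)
    return remote_time > local_copy_time
```

A stored timestamp would be needed for every range, because a fresh timestamp on one range says nothing about the others.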

eizedev commented 1 year ago

@stebet Thanks! So I could also use the HTTP ETag to detect changes. Works like a charm: HTTP status code 304 is returned if the ETag has not changed.
But, just to be sure, this etag is for current range only, right? So it is currently not possible to detect changes over all ranges/hashes?

Then currently for my purpose the only way would be to trigger the update on a regular basis, e.g. every 30 days because I have no way to detect general changes in the database, right? (of course, this is not an issue with the downloader)
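The ETag check described here is an ordinary HTTP conditional request. A rough Python sketch using only the standard library follows; the function name `fetch_if_changed` and the injectable `opener` parameter are assumptions made for illustration, and the sketch relies on `urllib` raising `HTTPError` for a 304 response:

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple

RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


def fetch_if_changed(
    prefix: str,
    known_etag: Optional[str],
    opener=urllib.request.urlopen,  # injectable so the logic can be tested offline
) -> Tuple[Optional[str], Optional[bytes]]:
    """Request one hash range, sending the stored ETag if there is one.

    Returns (etag, body). body is None when the server answers 304,
    meaning the local copy of this range is still current.
    """
    request = urllib.request.Request(RANGE_URL.format(prefix=prefix))
    if known_etag:
        request.add_header("If-None-Match", known_etag)
    try:
        with opener(request) as response:
            return response.headers.get("ETag"), response.read()
    except urllib.error.HTTPError as error:
        if error.code == 304:  # Not Modified: keep the existing file
            return known_etag, None
        raise
```

Calling `fetch_if_changed("9A674", stored_etag)` would then return a body only when that one range has changed.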


stebet commented 1 year ago

> But, just to be sure, this etag is for current range only, right?

Correct
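Because the ETag covers a single range, an incremental update has to keep one ETag per prefix and ask about every range in turn. The Python sketch below illustrates that bookkeeping under stated assumptions: prefixes are the five-hex-character values 00000 to FFFFF, and `fetch` is a hypothetical callable that performs a conditional request and returns `(etag, body)`, with `body` set to `None` when the range is unchanged. None of these names come from the downloader.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical signature: (prefix, stored_etag) -> (etag, body or None)
Fetch = Callable[[str, Optional[str]], Tuple[Optional[str], Optional[bytes]]]


def all_prefixes() -> Iterator[str]:
    """Yield every 5-character hex prefix, 00000 through FFFFF."""
    for value in range(16 ** 5):
        yield f"{value:05X}"


def sync_ranges(
    prefixes: Iterable[str],
    manifest: Dict[str, str],
    fetch: Fetch,
    out_dir: Path,
) -> int:
    """Re-download only ranges whose ETag differs from the manifest.

    manifest maps prefix -> last seen ETag and is updated in place.
    Returns the number of range files that were rewritten.
    """
    changed = 0
    for prefix in prefixes:
        etag, body = fetch(prefix, manifest.get(prefix))
        if body is None:
            continue  # unchanged on the server, keep the local file
        (out_dir / f"{prefix}.txt").write_bytes(body)
        if etag:
            manifest[prefix] = etag
        changed += 1
    return changed
```

This still costs one request per range (16^5 = 1,048,576 in total), but unchanged ranges return no body, so far less data is transferred than in a full download.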

flexxxxer commented 1 year ago

Same problem for me. Blindly downloading 30 GB without knowing if updates are available is unacceptable for me.

ezekielnewren commented 7 months ago

I think a new api endpoint is needed. e.g.

curl https://api.pwnedpasswords.com/count
00000:3000:51324
00001:5214:95743
00002:2045:13254
...
FFFFE:9534:62335
FFFFF:2945:98564

Where the first number is how many hashes contain that prefix and the second number is the sum of the occurrences of each hash.
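To illustrate how a client might use such a listing, here is a small Python sketch. The `/count` endpoint is only a proposal in this thread and does not exist, so the `prefix:hash_count:occurrence_sum` line format is assumed from the example above, and `parse_counts` and `changed_prefixes` are hypothetical names:

```python
from typing import Dict, List, Tuple

Counts = Dict[str, Tuple[int, int]]


def parse_counts(listing: str) -> Counts:
    """Parse lines of the assumed form PREFIX:HASH_COUNT:OCCURRENCE_SUM."""
    counts: Counts = {}
    for line in listing.splitlines():
        line = line.strip()
        if not line:
            continue
        prefix, hash_count, occurrence_sum = line.split(":")
        counts[prefix.upper()] = (int(hash_count), int(occurrence_sum))
    return counts


def changed_prefixes(local: Counts, remote: Counts) -> List[str]:
    """Return prefixes whose counts differ from, or are missing in, the local copy."""
    return sorted(p for p, stats in remote.items() if local.get(p) != stats)
```

With a listing like this, a single request would be enough to decide which ranges need to be downloaded again.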

oyeaussie commented 1 month ago

Hello All,

I have created a tool that does the same job using PHP: https://github.com/oyeaussie/PHPPwnedPasswordsDownloader

I hope someone finds it useful.

Thanks.

troyhunt commented 1 month ago

> I have created a tool that does the same job using PHP: https://github.com/oyeaussie/PHPPwnedPasswordsDownloader

Nice one, gave you a shout-out here: https://twitter.com/troyhunt/status/1803413986785870323

oyeaussie commented 3 weeks ago

I have updated my downloader tool with a lot of options. You can now download, update, sort, cache, and index hash files with the tool. I have also added a CLI password lookup tool, which you can integrate into your PHP code. See the wiki page for more information: https://github.com/oyeaussie/PHPPwnedPasswordsDownloader/wiki/1.-Description

Also, regarding this issue, read this wiki page: https://github.com/oyeaussie/PHPPwnedPasswordsDownloader/wiki/9.-Update

I hope the tool helps someone. Cheers!