Closed andrewlock closed 5 years ago
Hey Andrew,
Working on the database as that's what we're using, can't use files as we're writing a distributed application. The only issue I have is do we mae ef core a part ofthis or split the lib in to 3 projects (contracts, core and ef)? Regarding the reading of files I have an idea so will try to spike something tomorrow. Do you want me to fork, or are you happy to add me as a collaborator?
@SeanFarrow Just added you as a collaborator :)
I'd be interested to hear the exact use cases you have for this. In my mind I was thinking of the following:
IHaveIBeenPwnedService
implementation.From my point of view, I'm most interested in the last point. I would stand up a simple ASP.NET Core web app that mirrors the semantics of Troy's public API, but is just reading from a DB etc. This would work best from the distributed app point of view, and means the Validator service doesn't need knowledge of any particular API etc.
I think point 4. is prob also what you're after, though you're maybe describing 3.?
In terms of architecture, I think if you want to have EF specific code in there, it will have to be a separate package. A bit annoying as we end up with many different projects, but prob the only way to avoid escalating dependencies.
I'll try and put together the example mirror API in the next couple of days. I'll prob make it a simple middleware so that you can easily just drop it into any app, wire up an IHaveIBeenPwnedService
and have everything just work.
Also, RE your comment about the strategy pattern - that's pretty much just being achieved using DI atm. Just register the IHaveIBeenPwnedService
implementation (strategy) that you want, and it'll get injected into the Validator.
Currently looking at point 3, although 4 could be interesting! I’ve got several api’s so it’s another microservice to deploy, plus given we’re using identity we already have a database.
From: Andrew Lock [mailto:notifications@github.com] Sent: Monday, September 18, 2017 14:40 To: andrewlock/HaveIBeenPwnedValidator HaveIBeenPwnedValidator@noreply.github.com Cc: Sean Farrow sean.farrow@seanfarrow.co.uk; Mention mention@noreply.github.com Subject: Re: [andrewlock/HaveIBeenPwnedValidator] Loading from files is far too slow (#1)
@SeanFarrowhttps://github.com/seanfarrow Just added you as a collaborator :)
I'd be interested to hear the exact use cases you have for this. In my mind I was thinking of the following:
From my point of view, I'm most interested in the last point. I would stand up a simple ASP.NET Core web app that mirrors the semantics of Troy's public API, but is just reading from a DB etc. This would work best from the distributed app point of view, and means the Validator service doesn't need knowledge of any particular API etc.
I think point 4. is prob also what you're after, though you're maybe describing 3.?
In terms of architecture, I think if you want to have EF specific code in there, it will have to be a separate package. A bit annoying as we end up with many different projects, but prob the only way to avoid escalating dependencies.
I'll try and put together the example mirror API in the next couple of days. I'll prob make it a simple middleware so that you can easily just drop it into any app, wire up an IHaveIBeenPwnedService and have everything just work.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/andrewlock/HaveIBeenPwnedValidator/issues/1#issuecomment-330224279, or mute the threadhttps://github.com/notifications/unsubscribe-auth/ABY1ftSsY0VnV_i1TRn7Ym0qc-_O9wsGks5sjnKogaJpZM4PaT1e.
Also, could you not use File.ReadLines and concat all lines in all files using linq, surely this would be usable?
The problem is that the files are 12GB - you can't really read that all that into memory for a web app (I physically can't on my dev machine, only 16GB memory!). Having said that, the file approach clearly doesn't work with files that size either!
Ok, should we can that and just have an api with a database behind it? Either that or have a service as we have now with a strategy (which OOTB is a database). That way if people really want to implement the file based approach, with all its caveats, they can.
From: Andrew Lock [mailto:notifications@github.com] Sent: Monday, September 18, 2017 18:59 To: andrewlock/HaveIBeenPwnedValidator HaveIBeenPwnedValidator@noreply.github.com Cc: Sean Farrow sean.farrow@seanfarrow.co.uk; Mention mention@noreply.github.com Subject: Re: [andrewlock/HaveIBeenPwnedValidator] Loading from files is far too slow (#1)
The problem is that the files are 12GB - you can't really read that all that into memory for a web app (I physically can't on my dev machine, only 16GB memory!). Having said that, the file approach clearly doesn't work with files that size either!
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/andrewlock/HaveIBeenPwnedValidator/issues/1#issuecomment-330306140, or mute the threadhttps://github.com/notifications/unsubscribe-auth/ABY1flsH78kfgvfdUnq9NOyeGDYetP6pks5sjq90gaJpZM4PaT1e.
Yeah, I think having the Validator use the API as the default option is best for the basic package. A separate package that implements an IHaveBeenPwnedApiService
with a database on top of that covers use case 3, and then a separate package for creating your own HaveIBeenPwned API (that can be backed by a DB too).
Ok, I’ll start work on the database. I’ll let you start work on the api.
From: Andrew Lock [mailto:notifications@github.com] Sent: Tuesday, September 19, 2017 19:30 To: andrewlock/HaveIBeenPwnedValidator HaveIBeenPwnedValidator@noreply.github.com Cc: Sean Farrow sean.farrow@seanfarrow.co.uk; Mention mention@noreply.github.com Subject: Re: [andrewlock/HaveIBeenPwnedValidator] Loading from files is far too slow (#1)
Yeah, I think having the Validator use the API as the default option is best for the basic package. A separate package that implements an IHaveBeenPwnedApiService with a database on top of that covers use case 3, and then a separate package for creating your own HaveIBeenPwned API (that can be backed by a DB too).
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/andrewlock/HaveIBeenPwnedValidator/issues/1#issuecomment-330629736, or mute the threadhttps://github.com/notifications/unsubscribe-auth/ABY1flfSPe2j9NCyQVn7uYBjoRv23LPtks5skAgXgaJpZM4PaT1e.
@SeanFarrow I've created a PR #4 to add an API server that mirrors the HaveIBeenPwnedAPI. I had to rearrange things a bit. Did you want to take a look before I merge?
Hi,
Yes, please, give me a couple of days, just reformatting a machine!
From: Andrew Lock [mailto:notifications@github.com] Sent: Sunday, October 01, 2017 19:38 To: andrewlock/HaveIBeenPwnedValidator HaveIBeenPwnedValidator@noreply.github.com Cc: Sean Farrow sean.farrow@seanfarrow.co.uk; Mention mention@noreply.github.com Subject: Re: [andrewlock/HaveIBeenPwnedValidator] Loading from files is far too slow (#1)
@SeanFarrowhttps://github.com/seanfarrow I've created a PR #4https://github.com/andrewlock/HaveIBeenPwnedValidator/pull/4 to add an API server that mirrors the HaveIBeenPwnedAPI. I had to rearrange things a bit. Did you want to take a look before I merge?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/andrewlock/HaveIBeenPwnedValidator/issues/1#issuecomment-333396973, or mute the threadhttps://github.com/notifications/unsubscribe-auth/ABY1fg854lNpC9Xb3Qmi94zq1fpsi15bks5sn9vwgaJpZM4PaT1e.
Hi @SeanFarrow, it's been a long while since we looked at this!
Not sure if you've kept up to speed with Troy's changes and the anonymous endpoint? If not, it means passwords are never sent to the API server, there's no rate limit, and the speed is incredibly quick (<20ms).
For me, that completely removes the need for the file/database APIs, and I'm thinking that will be true for most people. Frankly, I can't see the speed of a file-based service ever being sufficient, and the complexity of a database-based service being worth the (questionable) anonymity, performance or reliability gains.
I'm wondering how you feel about it? The project's been languishing here for a while, and updating to just wrap the API would definitely simplify things, and remove some of the issues. I'd like to push up a release to NuGet this weekend if possible
Hi,
I definitely agree. Also, given that 2.1 is out and the benefits that brings with polly and retries, it makes sense to just create an API client to wrap the api. I’ve been meaning to do this for a while, but am currently with a client and heading to Norway for NDC, so time is limited. I did look at writing some code to splitthe the file as troy does, but it took an age to run. Hope that clarifies things.
From: Andrew Lock notifications@github.com Sent: 03 June 2018 06:47 To: andrewlock/HaveIBeenPwnedValidator HaveIBeenPwnedValidator@noreply.github.com Cc: Sean Farrow sean.farrow@seanfarrow.co.uk; Mention mention@noreply.github.com Subject: Re: [andrewlock/HaveIBeenPwnedValidator] Loading from files is far too slow (#1)
Hi @SeanFarrowhttps://github.com/SeanFarrow, it's been a long while since we looked at this!
Not sure if you've kept up to speed with Troy's changes and the anonymous endpoint? If not, it means passwords are never sent to the API server, there's no rate limit, and the speed is incredibly quick (<20ms).
For me, that completely removes the need for the file/database APIs, and I'm thinking that will be true for most people. Frankly, I can't see the speed of a file-based service ever being sufficient, and the complexity of a database-based service being worth the (questionable) anonymity, performance or reliability gains.
I'm wondering how you feel about it? The project's been languishing here for a while, and updating to just wrap the API would definitely simplify things, and remove some of the issues. I'd like to push up a release to NuGet this weekend if possible
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/andrewlock/HaveIBeenPwnedValidator/issues/1#issuecomment-394138910, or mute the threadhttps://github.com/notifications/unsubscribe-auth/ABY1flznd49LOMXfmkwqeZGPCc1Bi5Thks5t43htgaJpZM4PaT1e.
Cool, that was exactly my thoughts too 👍 I'll put something together in the next couple of days. Hope you enjoy NDC, I'm sure it'll be excellent!
Put together a typed client for the PwnedPaswords api, and stripped out all the file/db side: #6
Ok, I’ll review tonight, we can then release tomorrow, hope this is OK?
From: Andrew Lock notifications@github.com Sent: 04 June 2018 15:14 To: andrewlock/HaveIBeenPwnedValidator HaveIBeenPwnedValidator@noreply.github.com Cc: Sean Farrow sean.farrow@seanfarrow.co.uk; Mention mention@noreply.github.com Subject: Re: [andrewlock/HaveIBeenPwnedValidator] Loading from files is far too slow (#1)
Put together a typed client for the PwnedPaswords api, and stripped out all the file/db side: #6https://github.com/andrewlock/HaveIBeenPwnedValidator/pull/6
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/andrewlock/HaveIBeenPwnedValidator/issues/1#issuecomment-394369294, or mute the threadhttps://github.com/notifications/unsubscribe-auth/ABY1frHHc1u92Zkx75ZRoo5MCm1tHbUfks5t5UCXgaJpZM4PaT1e.
Yeah sure, no rush 🙂
@AndrewLock, All looks good, so feel free to merge/release. I can't seem to find where this is used in the sample app though. There are a few spelling mistakes in comments, but they can be fixed after release--yes, I'm being picky.
It would be good to show the use of Polly in the sample app, again this can be done after the release.
Given that we query the API, should we close this?
Yeah, agreed, I'll close 👍
If you have the full 12GB file, then searching through the file takes far too long to be acceptable in a web environment.
Could look at parallelism, but I suspect that won't help the problem much. May have to enforce loading into memory (big memory issue there) or add a DB service (I like this idea as an option anyway).