Partial results using haveibeenpwned search command

BSnow52 commented 3 years ago

Hi hRun,

I deployed the add-on in my Splunk environment, but I'm having some issues getting it to work as expected. When I use the search command and pass email addresses as results, I only get output for a few of them. This only happens with "mode=mail". With "mode=domain" I receive breach results for all of the results. Reviewing the search log, I see that with mode=domain the input and output counts match and don't seem to have much of a limit; I've been able to get results for 1,000 unique email addresses (though I know this applies to the domain rather than the actual email address). Running the exact same search and only changing to "mode=mail", I see that the input count is a fraction of the actual results and the output count seems to max out at 3.

I'm hoping you can tell me whether this is by design or whether I'm just using the add-on incorrectly. The idea was to run this command to check a large number of email addresses for breach information at one time. The search syntax I'm using is similar to this:

I'm attaching a screenshot of what the job inspection looks like on an unsuccessful run. The number of events or emails that this should have returned values for is 1,181.

Thanks in advance for any help provided!

hRun commented 3 years ago

Hi BSnow52,

Thank you for the feedback on your experience with the add-on so far. It's very much appreciated, as I was hoping to receive more feedback like this than I actually have after the add-on's last update.

You're absolutely right about your usage and the expected outcome. Assuming there are no issues in HIBP's source data, this is certainly not expected behaviour and I'll look into it. You'll have to bear with me for a couple of days, though, until I find the time to investigate thoroughly. Please ping me again if you haven't heard from me or received a fix by the end of next week.

Best Regards, hRun

hRun commented 3 years ago

Well, good news after all. It just so happened that I had some time earlier than expected.

Commit 68cc9ee fixes all issues I was able to reproduce through thorough testing. The issues you were experiencing were most likely due either to characters in your mail addresses that needed URL encoding before being submitted to the API, or to the fact that you perform direct HTTP requests to the API. My testing environment is behind a proxy, which causes the script to perform HTTP requests differently, so it's possible I missed some potential sources of errors when I first published the add-on. These should be gone now.
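For readers following along, the URL-encoding problem mentioned above can be illustrated with a minimal sketch. This is not the code from commit 68cc9ee; the helper name `build_lookup_url` is hypothetical, and the base URL is assumed to be HIBP's v3 breached-account endpoint, which may differ from the API version the add-on targeted at the time.

```python
from urllib.parse import quote

# Assumption: HIBP v3 breached-account endpoint; the add-on may use another version.
API_BASE = "https://haveibeenpwned.com/api/v3/breachedaccount/"


def build_lookup_url(address):
    """Return a lookup URL with the mail address percent-encoded.

    safe="" makes quote() encode every reserved character, including
    '@', '+' and '/', so the address cannot alter the URL path.
    """
    return API_BASE + quote(address.strip(), safe="")


if __name__ == "__main__":
    # An address containing '+' is a typical case that breaks when sent unencoded.
    print(build_lookup_url("john+test@example.com"))
```

Without the encoding step, an address containing characters such as `+` or `/` may be interpreted differently by the server, which would be consistent with only some inputs producing results.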

Would you do me the favor of testing with the newest version of the script from GitHub on your end, so I can close the issue and submit an updated version of the add-on to Splunkbase?

BSnow52 commented 3 years ago

Hi hRun,

Thank you for getting to this so quickly. I've been doing some testing today and have had much better results. The only issue I'm running into at this point is the following message:

"TypeError at "$SPLUNK_HOME/etc/apps/SA-haveibeenpwned/bin/haveibeenpwned.py", line 248 : strptime() argument 1 must be string, not None"

I've tested on a few different data sets and noticed that it seems to happen at different times, leading me to believe this is some sort of formatting issue with the data being returned, but I haven't been able to find a specific cause that trips this error message. I'll continue investigating from my side to see if I can get something more solid for you. Thanks again!

Best regards, BSnow52

hRun commented 3 years ago

Well, that is great news. Thank you for the much appreciated feedback.

I'm pretty sure I know where that last issue stems from. That'll be an easy and quick fix, which I'll probably implement tomorrow. I'll close this issue right after that.
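The thread does not say what the actual fix was. As a hedged illustration only: a `strptime() argument 1 must be string, not None` error typically means a date field was missing or null in a record before it was parsed. A defensive parser might look like the sketch below; the function name `parse_breach_date` and the `%Y-%m-%d` format are assumptions, not taken from haveibeenpwned.py.

```python
from datetime import datetime


def parse_breach_date(value, fmt="%Y-%m-%d"):
    """Parse a date string, returning None instead of raising.

    Hypothetical helper: guards against None, empty, non-string,
    or malformed values before calling datetime.strptime().
    """
    if not isinstance(value, str) or not value:
        return None
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        # The value is a string but does not match the expected format.
        return None


if __name__ == "__main__":
    print(parse_breach_date("2013-10-04"))
    print(parse_breach_date(None))
```

Returning `None` lets the caller decide whether to skip the field or emit an empty value, rather than aborting the whole search on one incomplete record.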

hRun / SA-haveibeenpwned

Partial results using haveibeenpwned search command #1