icanhasfay / PyPwned

A Python client for the HaveIBeenPwned REST API
MIT License

getAllBreachesForAccount() fails with JSONDecodeError when the email begins with certain special characters #7

Closed RevolutionTech closed 7 years ago

RevolutionTech commented 7 years ago

The request made to haveibeenpwned directly concatenates the provided email to the rest of the URL: https://github.com/icanhasfay/PyPwned/blob/26295d2273262b040cbaffc4fe021aef2878be33/pypwned/__init__.py#L28

Usually this is fine, but an email that starts with a special character such as # breaks the request, most likely because everything after # is treated as a URL fragment and never sent to the server. In the case of #, a "Not Found" HTML page is returned (but with a 200 status code), so pypwned calls .json() on that page and raises a JSONDecodeError.
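To illustrate the leading # problem, here is a minimal sketch using only the standard library. The base URL and the email address are assumptions made up for illustration; the real URL is on the line linked above.

```python
from urllib.parse import urlsplit

# Assumed endpoint, for illustration only; not copied from the PyPwned source.
BASE_URL = "https://haveibeenpwned.com/api/v2/breachedaccount/"

email = "#user@example.com"  # hypothetical address
parts = urlsplit(BASE_URL + email)

# Everything after '#' is parsed as the URL fragment, which HTTP clients
# do not send to the server, so the request path carries no email at all.
print(parts.path)      # /api/v2/breachedaccount/
print(parts.fragment)  # user@example.com
```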

I believe you could fix this either by rejecting emails that start with an invalid character (such as #) in the regex in getAllBreachesForAccount(), or by passing the provided email through urllib.quote_plus (urllib.parse.quote_plus in Python 3) before building the URL.
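A rough sketch of the second option, assuming Python 3. The helper name `build_breach_url` and the base URL are hypothetical and not part of PyPwned. It uses `urllib.parse.quote` with `safe=""` rather than `quote_plus`, because the email sits in the URL path rather than in a query string; either call percent-encodes the #.

```python
from urllib.parse import quote

# Assumed endpoint, for illustration only; not copied from the PyPwned source.
BASE_URL = "https://haveibeenpwned.com/api/v2/breachedaccount/"


def build_breach_url(email):
    """Hypothetical helper: percent-encode the email before appending it.

    safe="" makes quote() escape every reserved character, including
    '#', '@' and '/', so the whole email stays inside one path segment.
    """
    return BASE_URL + quote(email, safe="")


print(build_breach_url("#user@example.com"))
# https://haveibeenpwned.com/api/v2/breachedaccount/%23user%40example.com
```

With the # encoded as %23, the email is sent as part of the path instead of being dropped as a fragment.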

icanhasfay commented 7 years ago

Hey @RevolutionTech, I ended up choosing a different email regex that seems to handle this case better, so that malformed emails are not sent to haveibeenpwned.com. Let me know if this doesn't work for any reason.