HaveIBeenPwned / EmailAddressExtractor

A project to rapidly extract all email addresses from any files in a given path
BSD 3-Clause "New" or "Revised" License

A bit of cleanup here and there #70

Closed (GStefanowich closed 8 months ago)

GStefanowich commented 8 months ago

This may fix #69 (Nice)

Added a few catches so that failing to read the TLD cache is no longer a show stopper. The cache only exists to avoid spamming the IANA website with requests. Now, if a JsonException causes the read to fail, it simply makes a request to IANA on every run (and logs a message).

Exceptions can happen from malformed JSON, or simply if your file system is set to read-only mode.
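A minimal sketch of that fallback, not the code from this PR: `TryReadTldCache` and the cache file name are hypothetical, and the cache is assumed to be a plain JSON array of strings. A missing, unreadable, or malformed cache is logged and reported as a miss, so the caller can fall back to requesting the list from IANA.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Hypothetical helper: returns true only when a non-empty TLD list was read from the cache.
// Any read or parse failure is logged and treated as a cache miss instead of crashing.
static bool TryReadTldCache(string path, out string[] tlds) {
    try {
        // Assumption: the cache file is a plain JSON array of strings.
        tlds = JsonSerializer.Deserialize<string[]>(File.ReadAllText(path)) ?? Array.Empty<string>();
        return tlds.Length > 0;
    } catch (Exception ex) when (ex is JsonException or IOException or UnauthorizedAccessException) {
        Console.Error.WriteLine($"TLD cache unusable ({ex.GetType().Name}); requesting the list from IANA this run.");
        tlds = Array.Empty<string>();
        return false;
    }
}

// Usage: a cache file that does not exist is handled the same way as a malformed one.
string missingPath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".json");
bool loaded = TryReadTldCache(missingPath, out string[] cachedTlds);
Console.WriteLine(loaded ? $"{cachedTlds.Length} TLDs from cache" : "cache miss, would call IANA");
```

The exception filter keeps the catch narrow: unrelated failures still surface, while the ones described above (bad JSON, file system trouble) only cost an extra request.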

It already silently swallows a failed call to IANA by checking the response status:

if (response.StatusCode is HttpStatusCode.OK) {
    ...
}

Previously, when this failed, it would store an empty array of "allowed" TLDs, which caused every domain passed to it to be rejected.

So I've added a check for an empty TLD list: in that case it allows all TLDs, valid or not, and emits a warning:

if (list.Count is 0) {
    this.EmptySetNotice.Dispose();
    return Result.CONTINUE;
}
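To show the effect in isolation, here is a small self-contained sketch. It is not the filter from this PR: `allowed`, `warned`, and `IsTldAllowed` are hypothetical names, and the warning is a plain console message rather than the `EmptySetNotice` used above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical allow-list. An empty set stands for "the IANA list could not be loaded".
var allowed = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
bool warned = false;

bool IsTldAllowed(string domain) {
    if (allowed.Count is 0) {
        // Nothing to validate against: warn once, then let every domain through.
        if (!warned) {
            warned = true;
            Console.Error.WriteLine("No TLD list available; TLD validation is skipped.");
        }
        return true;
    }

    int dot = domain.LastIndexOf('.');
    return dot >= 0 && allowed.Contains(domain[(dot + 1)..]);
}

Console.WriteLine(IsTldAllowed("example.notarealtld")); // True: empty list allows everything
allowed.Add("com");
Console.WriteLine(IsTldAllowed("example.com"));         // True
Console.WriteLine(IsTldAllowed("example.notarealtld")); // False: the list is enforced again
```

Letting everything through trades some false positives for not silently dropping every address when the list is missing.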

Despite it also being a common joke, I've added some loose JSON options so it accepts pretty much whatever BS is thrown at it:

this.Json = new JsonSerializerOptions {
    Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping,
    AllowTrailingCommas = true,
    ...
};
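As a rough illustration of what such options tolerate, the sketch below parses a hand-edited array with a comment and a trailing comma. Only `Encoder` and `AllowTrailingCommas` come from the snippet above; `ReadCommentHandling` is an assumption added for the example, since the remaining options are elided.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;

var lenient = new JsonSerializerOptions {
    // From the snippet above. The encoder only affects how strings are escaped when writing.
    Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping,
    AllowTrailingCommas = true,
    // Assumption, not confirmed by the PR: also skip comments when reading.
    ReadCommentHandling = JsonCommentHandling.Skip,
};

// A hand-edited cache with a comment and a trailing comma. Strict parsing rejects both.
string sloppy = "[ \"com\", /* hand-edited */ \"net\", ]";

string[] parsed = JsonSerializer.Deserialize<string[]>(sloppy, lenient) ?? Array.Empty<string>();
Console.WriteLine(string.Join(", ", parsed)); // com, net

bool strictFailed = false;
try {
    JsonSerializer.Deserialize<string[]>(sloppy); // default options
} catch (JsonException) {
    strictFailed = true;
}
Console.WriteLine($"strict parse failed: {strictFailed}"); // strict parse failed: True
```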