Enhance handling of invalid UTF-8 sequences in check_numeric_host function

bulwark-security / bulwark-community-ruleset

An open-source ruleset for Bulwark.

Apache License 2.0

Enhance handling of invalid UTF-8 sequences in check_numeric_host function #2

Open coderabbitai[bot] opened 6 months ago

coderabbitai[bot] commented 6 months ago

Following the discussion in PR https://github.com/bulwark-security/bulwark-community-ruleset/pull/1, there is consensus that check_numeric_host should handle invalid UTF-8 sequences gracefully. The '�' character (U+FFFD, the Unicode replacement character) substituted for invalid sequences should potentially be filtered out.

sporkmonger commented 6 months ago

As mentioned in #1, the from_utf8_lossy function should help filter out invalid Unicode sequences in a way that's difficult to use as a detection bypass.
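
For reference, here is a rough sketch of what the two comments describe, not the actual check_numeric_host code. It assumes the host arrives as raw bytes; `decode_host_lossy` is a hypothetical helper name. Rust's `String::from_utf8_lossy` replaces each invalid sequence with U+FFFD, and the helper then drops those replacement characters.

```rust
/// Hypothetical helper (not from the ruleset): decode host bytes lossily,
/// then filter out the U+FFFD characters that from_utf8_lossy substituted
/// for invalid UTF-8 sequences.
fn decode_host_lossy(raw: &[u8]) -> String {
    String::from_utf8_lossy(raw)
        .chars()
        .filter(|&c| c != char::REPLACEMENT_CHARACTER)
        .collect()
}

fn main() {
    // 0xFF can never appear in well-formed UTF-8.
    let raw = b"127.0\xFF.0.1";

    // Lossy decoding keeps a visible marker where the bad byte was.
    println!("{}", String::from_utf8_lossy(raw));

    // Filtering removes the marker entirely.
    println!("{}", decode_host_lossy(raw));
}
```

One caveat with this approach: the filter also removes a U+FFFD that was validly encoded in the input, since the two cases are indistinguishable after decoding. Whether dropping the marker or leaving it in place is safer against bypass attempts probably depends on how the rest of the function matches the host.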