Following the discussion in PR https://github.com/bulwark-security/bulwark-community-ruleset/pull/1, there is consensus on improving the function so that it handles invalid UTF-8 sequences gracefully. The '�' replacement character (U+FFFD) substituted for invalid sequences should potentially be filtered out.
As mentioned in #1, `from_utf8_lossy` should help handle invalid Unicode sequences, replacing each one with U+FFFD, in a way that's difficult to use as a detection bypass.
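
For illustration only, here is a minimal sketch of what lossy decoding followed by filtering could look like. The helper name `decode_lossy_filtered` is hypothetical and not taken from the ruleset; only `String::from_utf8_lossy` and `char::REPLACEMENT_CHARACTER` are standard library items.

```rust
/// Hypothetical helper (not from the ruleset): decode bytes lossily,
/// then drop every U+FFFD replacement character from the result.
fn decode_lossy_filtered(input: &[u8]) -> String {
    // from_utf8_lossy never fails; invalid sequences become U+FFFD.
    String::from_utf8_lossy(input)
        .chars()
        // Remove the replacement characters so they cannot split a keyword.
        .filter(|&c| c != char::REPLACEMENT_CHARACTER)
        .collect()
}
```

With this sketch, `decode_lossy_filtered(b"sel\xFFect")` yields `"select"`. One trade-off to keep in mind: a U+FFFD that was validly encoded in the original input is removed as well, since it can't be told apart from a substituted one after decoding.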