I might be misunderstanding the purpose of the gem, but the README states:
"Re-encode it as UTF-8, replacing invalid and undefined characters as U+FFFD."
In practice it doesn't just replace invalid characters; it replaces every non-ASCII UTF-8 character, valid or not. This is even more of a problem when sanitising null bytes, as a single null byte will wipe out all non-ASCII UTF-8 characters.
Expected: "Hello \xE0 World 😁" => "Hello � World 😁"
Actual: "Hello \xE0 World 😁" => "Hello � World ����"
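For reference, here is a minimal plain-Ruby sketch of the two behaviours, with no gem involved. Both helper names (`scrub_invalid`, `reencode_from_binary`) are hypothetical and exist only for illustration. The second helper is just one way to reproduce the output I'm seeing; I'm assuming, not claiming, that something similar happens inside the gem.

```ruby
# Hypothetical helpers for illustration only; neither is part of the gem.

REPLACEMENT = "\uFFFD"

# Expected behaviour: keep valid UTF-8 and replace only invalid byte sequences.
def scrub_invalid(str)
  str.dup.force_encoding(Encoding::UTF_8).scrub(REPLACEMENT)
end

# One possible way to get the reported output (an assumption, not the gem's code):
# when the bytes are treated as binary, every byte >= 0x80 is undefined in UTF-8,
# so each byte of a valid multibyte character is replaced individually.
def reencode_from_binary(str)
  str.b.encode(Encoding::UTF_8, invalid: :replace, undef: :replace, replace: REPLACEMENT)
end

# A lone 0xE0 byte (invalid) followed later by a valid 4-byte emoji (U+1F601).
input = "Hello \xE0 World \u{1F601}"

puts scrub_invalid(input)        # only the lone 0xE0 byte is replaced
puts reencode_from_binary(input) # the emoji's four bytes are replaced as well
```

The first call is what I would expect from the README's description. The second matches the "Actual" line above, where the four-byte emoji turns into four replacement characters.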