Hi! We've been using utf8-cleaner for a bit and it's made a big difference in preventing our bug tracking services from being flooded, so thank you for sharing.
Unfortunately, as soon as our older utf8 errors stopped rolling in, we started getting a lot of "string contains null byte" errors, and utf8-cleaner isn't treating the strings that cause them as invalid. Our app is running Rails 5.2, Ruby 2.5.1, and utf8-cleaner 0.2.5.
I created a branch that adds a check for the percent-encoded null character, %00, to utf8-cleaner, and I would love to submit a Pull Request if you all would be interested (PR available here). The change is rather basic: it adds one more regex check, NULL_CHARS = /(%00)/, right after valid_uri_encoded_utf8 checks for INVALID_PERCENT_ENCODING_REGEX.
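To make the idea concrete, here is a minimal standalone sketch of the kind of check I mean. It is not the gem's actual code: the NULL_CHARS pattern is the one quoted above, but the method name contains_encoded_null? is hypothetical and only for illustration.

```ruby
# Standalone illustration only; this is NOT utf8-cleaner's real implementation.

# Pattern from the proposed change: a percent-encoded null byte.
NULL_CHARS = /(%00)/

# Hypothetical helper: returns true when the still-encoded URI string
# contains a percent-encoded null byte, false otherwise.
def contains_encoded_null?(uri_string)
  uri_string.match?(NULL_CHARS)
end

# A string would be treated as invalid when the helper returns true.
puts contains_encoded_null?("/search?q=abc%00def")
puts contains_encoded_null?("/search?q=abc%20def")
```

The check runs against the string while it is still percent-encoded, so a doubly encoded sequence such as %2500 is not matched by this pattern.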
Reading the previous, still-open issue, I'd considered using a rescue_from as Leon suggested, but to his other point, I believe a fix for any null characters would be right in line with the main purpose of the gem; we're using utf8-cleaner to clean our incoming requests so we can at least handle/route them properly, even if they aren't properly formed or correct. That being said, I'm of course open to any feedback, suggestions, or constructive criticism.