whitequark / rack-utf8_sanitizer

Rack::UTF8Sanitizer is a Rack middleware which cleans up invalid UTF8 characters in request URI and headers.
MIT License

#<Encoding::CompatibilityError: incompatible character encodings: UTF-8 and ASCII-8BIT> #54

Closed by theghall 4 years ago

theghall commented 4 years ago

I get the following error: #<Encoding::CompatibilityError: incompatible character encodings: UTF-8 and ASCII-8BIT>. The input being sanitized contains the extended ASCII character 127 (â).

I am using rack v2.2.2 and rack-utf8_sanitizer v1.7.0. It does not happen with rack-utf8_sanitizer v1.3.2.

It happens here: utf8_sanitizer.rb:263:in `start_with?'
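For readers unfamiliar with this class of error, here is a minimal sketch of how `String#start_with?` can raise `Encoding::CompatibilityError`. It is not the middleware's code: the helper `prefix_check_error` and the constants are hypothetical names made up for illustration, and the assumption is that one string is UTF-8 while the other is tagged ASCII-8BIT and both contain non-ASCII bytes.

```ruby
# Hypothetical helper (not part of rack-utf8_sanitizer): returns the exception
# class raised by String#start_with?, or nil when the call succeeds.
def prefix_check_error(string, prefix)
  string.start_with?(prefix)
  nil
rescue Encoding::CompatibilityError => e
  e.class
end

# "\u00E2" is the character â, stored as UTF-8.
UTF8_INPUT = "\u00E2bc"
# The same two bytes, but tagged ASCII-8BIT (binary), as raw request data can be.
BINARY_PREFIX = "\u00E2".b

# Both strings contain non-ASCII bytes and carry different encodings,
# so Ruby refuses to compare them.
p Encoding.compatible?(UTF8_INPUT, BINARY_PREFIX)
p prefix_check_error(UTF8_INPUT, BINARY_PREFIX)

# Retagging the bytes as UTF-8 (no transcoding) makes the comparison legal.
RETAGGED_PREFIX = BINARY_PREFIX.dup.force_encoding(Encoding::UTF_8)
p UTF8_INPUT.start_with?(RETAGGED_PREFIX)
```

If either string were ASCII-only, Ruby would treat the pair as compatible and no error would be raised, which is why this kind of failure tends to show up only for inputs with characters such as â.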

theghall commented 4 years ago

It may be related to upgrading to Ruby 2.7.0, as it does not appear to happen with rack-utf8_sanitizer v1.7.0 running under Ruby 2.6.5.

bf4 commented 4 years ago

It could be your rack version. There was a regression in rack related to encodings, IIRC.

https://github.com/rack/rack/commit/8c62821f4a464858a6b6ca3c3966ec308d2bb53e

https://github.com/rack/rack/commit/8c62821f4a464858a6b6ca3c3966ec308d2bb53e#diff-4fc9bc1f7d91630f4f9f47fc6663f3f7R195

see https://github.com/rack/rack/pull/1486

theghall commented 4 years ago

> It could be your rack version. There was a regression in rack related to encodings, IIRC.
>
> rack/rack@8c62821
>
> rack/rack@8c62821#diff-4fc9bc1f7d91630f4f9f47fc6663f3f7R195
>
> see rack/rack#1486

Sounds right. Even if I specify the charset as iso-8859-1, it still does not work.
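As a point of comparison, this is roughly what honoring a declared charset looks like in plain Ruby. It is a sketch only, not what rack or rack-utf8_sanitizer does internally: `to_utf8` is a hypothetical helper, and the single byte 0xE2 is assumed to be the offending â sent as ISO-8859-1.

```ruby
# Hypothetical helper (not part of rack or rack-utf8_sanitizer): interprets
# raw bytes in the declared charset and transcodes them to UTF-8.
def to_utf8(raw_bytes, declared_charset)
  raw_bytes.dup.force_encoding(declared_charset).encode(Encoding::UTF_8)
end

# 0xE2 is â in ISO-8859-1; on its own it is not valid UTF-8.
LATIN1_BYTES = "\xE2".b
CONVERTED = to_utf8(LATIN1_BYTES, "ISO-8859-1")

p CONVERTED.encoding
p CONVERTED.valid_encoding?
p CONVERTED == "\u00E2"
```

The `dup` keeps the caller's string untouched, since `force_encoding` modifies its receiver in place.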

bf4 commented 4 years ago

@theghall do you really need this lib? If you just want something to safely handle string conversions I have a gem for that.

It's been years since I used this.

theghall commented 4 years ago

I upgraded to Ruby 2.7.1 and the issue appears to have been resolved.