singlebrook / utf8-cleaner

MIT License

Remove null characters from utf8-cleaner fields. #36

Closed: davidrouten closed this 6 years ago

davidrouten commented 6 years ago

Let's check for %00 null-byte characters; if any are found, consider the string invalid.

sbleon commented 6 years ago

David, thanks so much for the PR, and thanks for the positive feedback! I'm really happy that this gem has been useful for you.

I was trying to reproduce the 500 error that you saw. I've been unable to get a Rails app (with utf8-cleaner installed) to crash based on a %00 in the URL. I Googled the error message ("string contains null byte"), and I think it's probably coming from the ActiveRecord PostgreSQL adapter. So I think you're probably attempting to store the param containing the null bytes in your database. If you inspect the stack trace associated with your error, you should be able to confirm or refute this.

Unfortunately, I'm not sure that it's within the scope of utf8-cleaner's responsibility to protect the database from input that the database considers invalid. Ruby itself, and the params-parsing code in Rails, have no problem with null characters in strings:

$ ruby -e 'puts URI.decode("foo%00bar")'
foobar
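The output above looks like "foobar" only because the NUL character is not printable in most terminals; the byte is still in the string. A minimal sketch that makes it visible, assuming a current Ruby where `URI.decode` is no longer available (it was removed in Ruby 3.0) and substituting the standard library's `URI.decode_www_form_component`:

```ruby
require "uri"

# Decode a percent-encoded string; %00 becomes a literal NUL character.
# (URI.decode_www_form_component stands in for the older URI.decode.)
decoded = URI.decode_www_form_component("foo%00bar")

# `puts decoded` would look like "foobar" in most terminals, so inspect the
# string and test for the NUL character explicitly instead.
p decoded
puts decoded.length          # 7: "foo", one NUL, "bar"
puts decoded.include?("\0")  # true
```

Ruby carries the NUL along as an ordinary character, which is why nothing fails until the string reaches a layer that rejects it.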

I'm not sure how to solve your problem in another way. You could add validations to your ActiveRecord models that look for null characters in string attributes, and then return a 400 error for a request with null chars.
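As a rough illustration of that kind of check, here is a sketch in plain Ruby so it runs without Rails. The helper name `contains_null_byte?` and the sample `params` hash are hypothetical; neither comes from utf8-cleaner or Rails.

```ruby
# Hypothetical helper (not part of utf8-cleaner or Rails): returns true if any
# string nested inside a params-like Hash or Array contains a NUL character.
def contains_null_byte?(value)
  case value
  when String then value.include?("\0")
  when Hash   then value.any? { |k, v| contains_null_byte?(k) || contains_null_byte?(v) }
  when Array  then value.any? { |v| contains_null_byte?(v) }
  else false # numbers, symbols, nil, etc. are treated as safe
  end
end

# Sample input standing in for parsed request parameters.
params = { "user" => { "name" => "foo\0bar", "tags" => ["a", "b"] } }

# Choose the HTTP status the application would respond with.
status = contains_null_byte?(params) ? 400 : 200
puts status # 400
```

In a Rails app, a check like this could plausibly sit in a `before_action` that responds with `head :bad_request`, or behind a model validation; that wiring is left out here and would need adapting to the application.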

See https://github.com/rails/rails/issues/26891 for some discussion of this issue in the Rails project.

davidrouten commented 6 years ago

Thanks Leon for the detailed response. Let me dig around some more and see if I can shed any more light on this issue!

davidrouten commented 6 years ago

Hi Leon, I appreciate the time you took to look into this and I think I have to agree that removing null byte chars isn't necessarily utf8-cleaner's job, especially since Ruby/Rails can typically handle them fine. Closing PR--thanks for the other suggestions.