logstash-plugins / logstash-filter-mutate

Apache License 2.0
16 stars 75 forks source link

Fix for Issue 44 - Lowercase fails when text is already all lowercase #45

Closed guyboertje closed 9 years ago

guyboertje commented 9 years ago

This bug was introduced by PR #43 Existing test coverage insufficient (J)Ruby returns nil if the value is already in desired case for upcase! and downcase! This nil value was stored back in the Event (J)Ruby returns the updated original object when a modify happens

closes #44

purbon commented 9 years ago

@guyboertje reviewing

purbon commented 9 years ago

I wonder about:

Note: case replacement is effective only in ASCII region.

does this make sense to you @guyboertje ? is this something we should be aware of?

This is from docs in http://ruby-doc.org/core-2.2.0/String.html#method-i-downcase-21

For the rest code LGTM and test pass, so :shipit: :shipit: :shipit: :shipit:

guyboertje commented 9 years ago

I am aware of the Ascii limitation. Looked at alternatives, tricky. There is an issue about other Unicode encodings not supported.

colinsurprenant commented 9 years ago

all Event strings are UTF-8. what is the behaviour of lowercase and uppercase on non-ascii UTF-8 strings? do we have tests/specs around these cases to assess the current behaviour?

guyboertje commented 9 years ago

@purbon @colinsurprenant we need to implement something along the lines of: https://github.com/rails/rails/blob/master/activesupport/lib/active_support/multibyte/unicode.rb

guyboertje commented 9 years ago

@colinsurprenant - If I add specs to show that we don't support multibyte case mutations, will that suffice?

purbon commented 9 years ago

I think this should not be a blocker to actually merge this change, I would recommend to open another issue in this plugin to track the multibyte case as @guyboertje commented at https://github.com/logstash-plugins/logstash-filter-mutate/pull/45#issuecomment-134784046 plus adding specs to check for this behaviour.

colinsurprenant commented 9 years ago

what I want to achieve is to have clear expectations through specs and doc of the behaviour of applying upper/lowercase on UTF-8 non-ascii strings. is that going to generate an error? is that going to ignore the unicode characters? what?

guyboertje commented 9 years ago

@colinsurprenant - is the new test what you had in mind?

elasticsearch-bot commented 9 years ago

Merged sucessfully into master!

colinsurprenant commented 9 years ago

why u no squash?

ph commented 9 years ago

17nj2m26py4pdjpg

guyboertje commented 9 years ago

Because I just wanted to get it done (i'm away tomorrow and monday), sorry, and it gave PH the chance to add a sad picture to this mail.

On Thu, Aug 27, 2015 at 4:15 PM Pier-Hugues Pellerin < notifications@github.com> wrote:

[image: 17nj2m26py4pdjpg] https://cloud.githubusercontent.com/assets/640/9524438/f73fadb8-4cac-11e5-8780-b78df9a338e2.jpg

— Reply to this email directly or view it on GitHub https://github.com/logstash-plugins/logstash-filter-mutate/pull/45#issuecomment-135465545 .