jordansissel / ruby-grok

Pure-ruby implementation of grok.
Apache License 2.0
54 stars 25 forks source link

Dead code and incorrect coerce function when duplicate capture keys in patterns #29

Open micoq opened 7 years ago

micoq commented 7 years ago

Hi,

In the compile_capture_func, the if coerce statement is always true even if no coerce function is used because the case call always return a string (even an empty one):

coerce = case coerce
                   when "int"; ".to_i"
                   when "float"; ".to_f"
                   else; ""
                 end
[...]
if coerce
[...]

In addition, if you have multiple capture patterns with a same name and a coerce function, only the first will be used. e.g. FIRST %{NUMBER:myfield:int} SECOND %{NUMBER:myfield:int} EXPR a%{FIRST}|b%{SECOND}

A match against the string "a42" (or "b42") will generate a string instead of a integer because the first coerce function will be erased by the second in the indices.each loop at the first iteration:

indices.each do |index|
coerce = case coerce
                   when "int"; ".to_i"
[...]

Here is a patch to fix this issues:

private
  # compiles the captures lambda so runtime match can be optimized
  def compile_captures_func(re)
    re_match = ["lambda do |match, &block|"]
    re.named_captures.each do |name, indices|
      pattern, name, coerce = name.split(":")
      coerce = case coerce
                   when "int"; ".to_i"
                   when "float"; ".to_f"
                   else; nil
                 end
      name = pattern if name.nil?
      indices.each do |index|
        if coerce
          re_match << "  m = match[#{index}]"
          re_match << "  block.call(#{name.inspect}, (m ? m#{coerce} : m))"
        else
          re_match << "  block.call(#{name.inspect}, match[#{index}])"
        end
      end
    end
    re_match << "end"
    return eval(re_match.join("\n"))
  end # def compile_captures_func