treasure-data / chef-td-agent

Chef Cookbook for td-agent (Treasure Agent or Fluentd)
https://supermarket.chef.io/cookbooks/td-agent
Apache License 2.0
127 stars 121 forks source link

suppress update resource on why-run #126

Closed watanabe-kyohei closed 5 years ago

watanabe-kyohei commented 5 years ago

On why-run, these providers always detect updated resources. We regularly execute why-run to ensure idempotency, so want to avoid 'updated_by_last_action' when not neccessary to converge.

repeatedly commented 5 years ago

@yyuu No problem for this patch?

yyuu commented 5 years ago

For sure I do have problem merging PR that doesn't pass CI. Though it seems the failed CI jobs are failing in somewhere irrelevant to changes in this around td-agent's configurations 🤔 Anyway need someone's insight to promote this gets merged.

repeatedly commented 5 years ago

Hmm... tests failed with 3x-xxx series but the configuration from the log seems ok. I want to know tests are passed or not with master.

limitusus commented 5 years ago

First I could confirm it reproduces with master branch.

bundle exec kitchen verify 3x-chef12-centos-centos7

(snip)

       [2019-05-21T04:15:31+00:00] ERROR: service[td-agent] (td-agent::configure line 43) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
       ---- Begin output of /etc/init.d/td-agent restart || /etc/init.d/td-agent start ----
td-agent[FAILED]estarting td-agent: [FAILED]
td-agent[FAILED]td-agent: [FAILED]
       STDERR: /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/config/basic_parser.rb:92:in `parse_error!': expected end of line at test_in_tail_nginx.conf line 15,17 (Fluent::ConfigParseError)
        14:   pos_file /tmp/.access.log.pos
        15:   exclude_path ["/tmp/access.log.*.gz", "/tmp/access.log.*.bz2"]

            -----------------^
        16: </source>
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/config/v1_parser.rb:132:in `parse_element'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/config/v1_parser.rb:95:in `parse_element'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/config/v1_parser.rb:166:in `block in eval_include'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/config/v1_parser.rb:160:in `each'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/config/v1_parser.rb:160:in `eval_include'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/config/v1_parser.rb:145:in `parse_include'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/config/v1_parser.rb:104:in `parse_element'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/config/v1_parser.rb:43:in `parse!'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/config/v1_parser.rb:33:in `parse'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/config.rb:39:in `parse'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/supervisor.rb:259:in `load_config'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/supervisor.rb:630:in `block in supervise'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/serverengine-2.1.1/lib/serverengine/config_loader.rb:43:in `reload_config'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/serverengine-2.1.1/lib/serverengine/config_loader.rb:36:in `initialize'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/serverengine-2.1.1/lib/serverengine/daemon.rb:32:in `initialize'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/serverengine-2.1.1/lib/serverengine.rb:33:in `new'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/serverengine-2.1.1/lib/serverengine.rb:33:in `create'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/supervisor.rb:629:in `supervise'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/supervisor.rb:502:in `run_supervisor'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/command/fluentd.rb:310:in `<top (required)>'
        from /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
        from /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
        from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/bin/fluentd:8:in `<top (required)>'
        from /opt/td-agent/embedded/bin/fluentd:23:in `load'
        from /opt/td-agent/embedded/bin/fluentd:23:in `<top (required)>'
        from /usr/sbin/td-agent:7:in `load'
        from /usr/sbin/td-agent:7:in `<main>'

Then I took a little look with this patch

--- /tmp/v1_parser.rb   2019-05-21 04:21:39.134828129 +0000
+++ /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/config/v1_parser.rb    2019-05-21 04:22:09.846526032 +0000
@@ -127,7 +127,9 @@
                   end
                 end

+                p [k, @ss.pos, @ss.peek(50)]
                 v = parse_literal
+                p [k, @ss.pos, @ss.peek(50), v]
                 unless line_end
                   parse_error! "expected end of line"
                 end

and this is the execution result

as you can see, td-agent first successfully loads the configuration file and then, re-exec itself for daemonizing, ending up in the failure above. With the debugging logs I added, we can see most tokens are correctly taken and @ss.pos is increasing, except the failing line.

(first exec)

["exclude_path", 496, "[\"/tmp/access.log.*.gz\", \"/tmp/access.log.*.bz2\"]\n"]
["exclude_path", 545, "\n</source>\n", "[\"/tmp/access.log.*.gz\",\"/tmp/access.log.*.bz2\"]"]

(second exec)

["exclude_path", 496, "[\"/tmp/access.log.*.gz\", \"/tmp/access.log.*.bz2\"]\n"]
["exclude_path", 498, "/tmp/access.log.*.gz\", \"/tmp/access.log.*.bz2\"]\n</", "[]"]

Somehow the string [" (2 bytes) are taken by the parser. I'm not a professional in Ruby parser, but do you have any idea of what causes such parsing?

yyuu commented 5 years ago

It seems the regexp specified as format is making the config parser confusing.

https://github.com/treasure-data/chef-td-agent/blob/7320ded3b486d8f8305888ca762b9b64812c2bad/test/fixtures/smoke/recipes/default.rb#L74

To be honest, so far I have no idea what changes in the upstream project made the parser behavior different, but apparently the cause of CI failure of this project could be some bad test code. I gonna (try to)? make some short term remediation in #128

limitusus commented 5 years ago

@yyuu Thanks. So now the PR test should pass. Could you rerun it and go on?

yyuu commented 5 years ago

So now the PR test should pass. Could you rerun it and go on?

It should pass, if this topic branch has been rebased onto the latest master branch. Just retrying the build job on our side won't solve the issue, unfortunately.

limitusus commented 5 years ago

I rebased it onto the latest master and all tests passed 😄

limitusus commented 5 years ago

@yyuu Does it take time to ship a new version?

yyuu commented 5 years ago

Now 3.2.0 should be available on Chef Supermarket.