DataDog / fluent-plugin-datadog

Fluentd output plugin for Datadog: https://www.datadog.com
Apache License 2.0

Plugin installed but reported as unknown #51

Closed andrescolodrero closed 2 years ago

andrescolodrero commented 2 years ago

Describe what happened: When running td-agent:

2021-12-16 15:16:10 +0000 [info]: parsing config file is succeeded path="/etc/td-agent/td-agent.conf"
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '5.1.4'
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-flowcounter-simple' version '0.1.0'
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-kafka' version '0.17.3'
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-prometheus' version '2.0.2'
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-prometheus_pushgateway' version '0.1.0'
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.1.0'
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.4.0'
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-s3' version '1.6.1'
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-sd-dns' version '0.1.0'
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-td' version '1.1.0'
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-utmpx' version '0.5.0'
2021-12-16 15:16:10 +0000 [info]: gem 'fluent-plugin-webhdfs' version '1.5.0'
2021-12-16 15:16:10 +0000 [info]: gem 'fluentd' version '1.14.3'
2021-12-16 15:16:10 +0000 [warn]: Status code 503 is going to be removed from default retryable_response_codes from fluentd v2. Please add it by yourself if you wish
2021-12-16 15:16:10 +0000 [error]: config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Unknown output plugin 'datadog'. Run 'gem search -rd fluent-plugin' to find plugins"

Steps to reproduce the issue:

  1. apt-get install td-agent (version 4)
  2. /usr/sbin/td-agent-gem install fluent-plugin-datadog
  3. systemctl start td-agent

I can verify that the plugin is in: /opt/td-agent/lib/ruby/gems/2.7.0/gems

Additional environment details (Operating System, Cloud provider, etc): Ubuntu 20.04. I also tried reinstalling.

remeh commented 2 years ago

Hey @andrescolodrero, Sorry for the late reply. Were you able to solve your problem?

andrescolodrero commented 2 years ago

Thanks @remeh. Yes, I still have the same issue. I tried with both td-agent and fluentd and got the same results. I believe it is more of a plugin or configuration issue, so I also opened a ticket here: https://github.com/jfrog/log-analytics-datadog/issues/24

remeh commented 2 years ago

In the log you pasted in the issue opened here, I'm surprised not to see fluent-plugin-datadog listed among the others on startup. You made sure it is present in /opt/td-agent/lib/ruby/gems/2.7.0/gems, but can you make sure the permissions are fine? Using ls -l /opt/td-agent/lib/ruby/gems/2.7.0/gems, do the permissions look similar between plugins? Also, we should make sure this is actually where your td-agent is looking for its gems. Let me know if you have updates on the other issue you've opened 👍
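One way to check where the gems are actually being looked up is to ask the Ruby interpreter that td-agent runs which copies of the gem it can see. This is only a minimal sketch: it assumes the script is executed with td-agent's embedded Ruby (assumed here to be /opt/td-agent/bin/ruby), and the helper name installed_versions is hypothetical.

```ruby
require "rubygems"

# Hypothetical helper: list every installed copy of a gem that this Ruby
# interpreter can load, as [version, install directory] pairs.
# Returns an empty array when the gem is not visible at all.
def installed_versions(gem_name)
  Gem::Specification.find_all_by_name(gem_name).map do |spec|
    [spec.version.to_s, spec.gem_dir]
  end
end

if __FILE__ == $PROGRAM_NAME
  # Directories this interpreter searches for gems.
  puts "Gem search path:"
  Gem.path.each { |dir| puts "  #{dir}" }

  copies = installed_versions("fluent-plugin-datadog")
  if copies.empty?
    puts "fluent-plugin-datadog is not visible to this Ruby"
  else
    copies.each { |version, dir| puts "fluent-plugin-datadog #{version} at #{dir}" }
  end
end
```

If the gem directory printed here differs from the directory where the plugin was installed, that mismatch would be consistent with the "Unknown output plugin" error.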

andrescolodrero commented 2 years ago

It is probably an old log @remeh

2022-02-04 09:31:28 +0000 [info]: parsing config file is succeeded path="artifactory.conf"
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-datadog' version '0.14.1'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-datadog' version '0.12.1'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '4.2.2'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '4.0.9'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-jfrog-metrics' version '0.1.0'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '2.0.1'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '2.0.0'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '1.0.0'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '0.1.9'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '0.1.7'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '0.1.6'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '0.1.5'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '0.1.4'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.1.0'
2022-02-04 09:31:28 +0000 [info]: gem 'fluent-plugin-splunk-enterprise' version '0.10.2'
2022-02-04 09:31:28 +0000 [info]: gem 'fluentd' version '1.11.0'

... and after some seconds:

2022-02-04 09:32:09 +0000 [info]: #0 [access_service_tail] following tail of /opt/jfrog/artifactory/var/log/access-service.log
2022-02-04 09:32:09 +0000 [info]: #0 fluentd worker is now running worker=0
2022-02-04 09:32:30 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired
2022-02-04 09:32:30 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=60.02107282727957 slow_flush_log_threshold=20.0 plugin_id="datadog_agent_jfrog_artifactory"
2022-02-04 09:32:30 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired
2022-02-04 09:32:30 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired
2022-02-04 09:32:30 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=60.21483953483403 slow_flush_log_threshold=20.0 plugin_id="datadog_agent_jfrog_artifactory"
2022-02-04 09:32:30 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=60.21288351062685 slow_flush_log_threshold=20.0 plugin_id="datadog_agent_jfrog_artifactory"
2022-02-04 09:33:29 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder Net::OpenTimeout
2022-02-04 09:33:29 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=119.21889812871814 slow_flush_log_threshold=20.0 plugin_id="datadog_agent_jfrog_artifactory"
2022-02-04 09:33:31 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired
2022-02-04 09:33:31 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=60.01581055857241 slow_flush_log_threshold=20.0 plugin_id="datadog_agent_jfrog_artifactory"
2022-02-04 09:33:31 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired
2022-02-04 09:33:31 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=60.017407103441656 slow_flush_log_threshold=20.0 plugin_id="datadog_agent_jfrog_artifactory"

I'm trying to run the example configuration from your docs, but I don't understand what this means:

curl -X POST -d 'json={"message":"hello Datadog from fluentd"}' http://localhost:8888/datadog.test

Should I have something else running to do the test?

andrescolodrero commented 2 years ago

config:

<source>
  @type tail
  @id access_service_tail
  path "#{ENV['JF_PRODUCT_DATA_INTERNAL']}/log/access-service.log"
  pos_file "#{ENV['JF_PRODUCT_DATA_INTERNAL']}/log/access-service.log.pos"
  tag jfrog.rt.access.service
  <parse>
    @type none
  </parse>
</source>

<source>
  @type tail
  @id artifactory_service_tail
  path "#{ENV['JF_PRODUCT_DATA_INTERNAL']}/log/artifactory-service.log"
  pos_file "#{ENV['JF_PRODUCT_DATA_INTERNAL']}/log/artifactory-service.log.pos"
  tag jfrog.rt.artifactory.service
  <parse>
    @type none
  </parse>
</source>

..

####################
# DATADOG OUTPUT
####################
<match jfrog.**>
  @type datadog
  @id datadog_agent_jfrog_artifactory
  api_key XXXXXXXXXXXXXXXXXXXXXXXX
  # optional
  include_tag_key true
  dd_source fluentd
  <buffer>
    @type memory
    flush_thread_count 4
    flush_interval 3s
    chunk_limit_size 5m
    chunk_limit_records 500
  </buffer>
</match>

remeh commented 2 years ago

2022-02-04 09:32:30 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired

This one catches my eye, but I don't know what datadog_agent_jfrog_artifactory is or how it works. I think you were right to open an issue there, as the problem appears to be on that side.

andrescolodrero commented 2 years ago

Sorry, I didn't format the code. datadog_agent_jfrog_artifactory is just the ID.

Here is the log, @remeh:

`2022-02-04 10:58:13 +0000 [info]: parsing config file is succeeded path="example.conf"
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-datadog' version '0.14.1'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-datadog' version '0.12.1'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '4.2.2'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '4.0.9'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-jfrog-metrics' version '0.1.0'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '2.0.1'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '2.0.0'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '1.0.0'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '0.1.9'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '0.1.7'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '0.1.6'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '0.1.5'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-jfrog-siem' version '0.1.4'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.1.0'
2022-02-04 10:58:13 +0000 [info]: gem 'fluent-plugin-splunk-enterprise' version '0.10.2'
2022-02-04 10:58:13 +0000 [info]: gem 'fluentd' version '1.11.0'
2022-02-04 10:58:13 +0000 [info]: using configuration file:

@type tail
@id access_service_tail
path "/opt/jfrog/artifactory/var/log/access-service.log"
pos_file "/opt/jfrog/artifactory/var/log/access-service.log.pos"
tag "jfrog.rt.access.service"
<parse>
  @type "none"
  unmatched_lines
</parse>

@type tail
@id artifactory_service_tail
path "/opt/jfrog/artifactory/var/log/artifactory-service.log"
pos_file "/opt/jfrog/artifactory/var/log/artifactory-service.log.pos"
tag "jfrog.rt.artifactory.service"
<parse>
  @type "none"
  unmatched_lines
</parse>

<filter jfrog.**> @type record_transformer

hostname TisARTx21 log_source ${tag}

@type exec
tag "jfrog.callhome"
command "/opt/jfrog/artifactory/var/fluentd-1.11.0-linux-x86_64/lib/ruby/bin/gem list --local | grep fluent | sed \'s/ (/:/g\' | sed \'s/)//g\'  | sed \':a;N;$!ba;s/\n/\t/g\'"
run_interval 1d
<parse>
  @type "ltsv"
</parse>

@type record_transformer renew_record true keep_keys productId,features enable_ruby true productId jfrogLogAnalytics/v0.1.0 features ${return(record.map { |k,v| { "featureId" => (k + ':' + v).to_sym} })} @type record_transformer enable_ruby true repo ${record["request_url"].include?("/api/docker") && !record["request_url"].include?("/api/docker/null") && !record["request_url"].include?("/api/docker/v2") ? (record["request_url"].split('/')[3]) : ("")} image ${record["request_url"].include?("/api/docker") && !record["request_url"].include?("/api/docker/null") && !record["request_url"].include?("/api/docker/v2") ? (record["request_url"].split('/')[5]) : ("")} @type record_transformer enable_ruby true response_content_length_2 ${record["response_content_length"].to_f} request_content_length_2 ${record["request_content_length"].to_f} @type record_transformer enable_ruby true impacted_artifacts ${if record['repo_path'].length > 1; "default/" + record["repo_path"].split(':')[0] + "/" + record["repo_path"].split(':')[1].rstrip ; end;}

<match jfrog.**> @type datadog @id datadog_agent_jfrog_artifactory api_key xxxxxx include_tag_key true dd_source "fluentd"

@type "memory" flush_thread_count 4 flush_interval 3s chunk_limit_size 5m chunk_limit_records 500

2022-02-04 10:58:13 +0000 [info]: starting fluentd-1.11.0 pid=2814262 ruby="2.6.3"
2022-02-04 10:58:13 +0000 [info]: spawn command to main: cmdline=["/opt/jfrog/artifactory/var/fluentd-1.11.0-linux-x86_64/lib/ruby/bin/ruby", "-Eascii-8bit:ascii-8bit", "/opt/jfrog/artifactory/var/fluentd-1.11.0-linux-x86_64/lib/vendor/ruby/2.6.0/bin/fluentd", "-c", "example.conf", "--under-supervisor"]
2022-02-04 10:58:14 +0000 [info]: adding filter pattern="jfrog.**" type="record_transformer"
2022-02-04 10:58:14 +0000 [info]: adding filter pattern="jfrog.callhome" type="record_transformer"
2022-02-04 10:58:14 +0000 [info]: adding filter pattern="jfrog.rt.artifactory.request" type="record_transformer"
2022-02-04 10:58:14 +0000 [info]: adding filter pattern="jfrog.rt.artifactory.request" type="record_transformer"
2022-02-04 10:58:14 +0000 [info]: adding filter pattern="jfrog.rt.artifactory.access" type="record_transformer"
2022-02-04 10:58:14 +0000 [info]: adding match pattern="jfrog.**" type="datadog"
2022-02-04 10:58:14 +0000 [info]: adding source type="tail"
2022-02-04 10:58:14 +0000 [info]: adding source type="tail"
2022-02-04 10:58:14 +0000 [info]: adding source type="exec"
2022-02-04 10:58:14 +0000 [info]: #0 starting fluentd worker pid=2814271 ppid=2814262 worker=0
2022-02-04 10:58:14 +0000 [info]: #0 [datadog_agent_jfrog_artifactory] Starting HTTP connection to https://http-intake.logs.datadoghq.com:443 with compression enabled using v2 routes
2022-02-04 10:58:14 +0000 [info]: #0 [artifactory_service_tail] following tail of /opt/jfrog/artifactory/var/log/artifactory-service.log
2022-02-04 10:58:14 +0000 [info]: #0 [access_service_tail] following tail of /opt/jfrog/artifactory/var/log/access-service.log
2022-02-04 10:58:14 +0000 [info]: #0 fluentd worker is now running worker=0
2022-02-04 10:59:17 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired
2022-02-04 10:59:17 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=60.008310423232615 slow_flush_log_threshold=20.0 plugin_id="datadog_agent_jfrog_artifactory"
2022-02-04 11:05:45 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired
2022-02-04 11:05:45 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=60.02009735722095 slow_flush_log_threshold=20.0 plugin_id="datadog_agent_jfrog_artifactory"
2022-02-04 11:05:48 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired
2022-02-04 11:05:48 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=60.007175013422966 slow_flush_log_threshold=20.0 plugin_id="datadog_agent_jfrog_artifactory"
2022-02-04 11:05:54 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired`

andrescolodrero commented 2 years ago

Running the same config with td-agent:

2022-02-04 11:19:56 +0000 [info]: parsing config file is succeeded path="example.conf"
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-datadog' version '0.14.0'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '5.1.4'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-flowcounter-simple' version '0.1.0'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-kafka' version '0.17.3'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-prometheus' version '2.0.2'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-prometheus_pushgateway' version '0.1.0'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.1.0'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.4.0'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-s3' version '1.6.1'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-sd-dns' version '0.1.0'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-td' version '1.1.0'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-utmpx' version '0.5.0'
2022-02-04 11:19:56 +0000 [info]: gem 'fluent-plugin-webhdfs' version '1.5.0'
2022-02-04 11:19:56 +0000 [info]: gem 'fluentd' version '1.14.3'
2022-02-04 11:19:56 +0000 [info]: using configuration file: <ROOT>

2022-02-04 11:19:56 +0000 [info]: starting fluentd-1.14.3 pid=2842111 ruby="2.7.5"
2022-02-04 11:19:56 +0000 [info]: spawn command to main: cmdline=["/opt/td-agent/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/sbin/td-agent", "-c", "example.conf", "--under-supervisor"]
2022-02-04 11:19:57 +0000 [info]: adding filter pattern="jfrog.**" type="record_transformer"
2022-02-04 11:19:57 +0000 [info]: adding filter pattern="jfrog.callhome" type="record_transformer"
2022-02-04 11:19:57 +0000 [info]: adding filter pattern="jfrog.rt.artifactory.request" type="record_transformer"
2022-02-04 11:19:57 +0000 [info]: adding filter pattern="jfrog.rt.artifactory.request" type="record_transformer"
2022-02-04 11:19:57 +0000 [info]: adding filter pattern="jfrog.rt.artifactory.access" type="record_transformer"
2022-02-04 11:19:57 +0000 [info]: adding match pattern="jfrog.**" type="datadog"
2022-02-04 11:19:57 +0000 [info]: adding source type="tail"
2022-02-04 11:19:57 +0000 [info]: adding source type="tail"
2022-02-04 11:19:57 +0000 [info]: adding source type="exec"
2022-02-04 11:19:57 +0000 [info]: #0 starting fluentd worker pid=2842118 ppid=2842111 worker=0
2022-02-04 11:19:57 +0000 [info]: #0 [datadog_agent_jfrog_artifactory] Starting HTTP connection to https://http-intake.logs.datadoghq.com:443 with compression enabled using v2 routes
2022-02-04 11:19:57 +0000 [info]: #0 [artifactory_service_tail] following tail of /opt/jfrog/artifactory/var/log/artifactory-service.log
2022-02-04 11:19:57 +0000 [info]: #0 [access_service_tail] following tail of /opt/jfrog/artifactory/var/log/access-service.log
2022-02-04 11:19:57 +0000 [info]: #0 fluentd worker is now running worker=0
2022-02-04 11:21:36 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] got unrecoverable error in primary and no secondary error_class=NoMethodError error="undefined method `error' for nil:NilClass"
2022-02-04 11:21:36 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-datadog-0.14.0/lib/fluent/plugin/out_datadog.rb:144:in `rescue in write'
2022-02-04 11:21:36 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-datadog-0.14.0/lib/fluent/plugin/out_datadog.rb:128:in `write'
2022-02-04 11:21:36 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin/output.rb:1179:in `try_flush'
2022-02-04 11:21:36 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin/output.rb:1491:in `flush_thread_run'
2022-02-04 11:21:36 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin/output.rb:499:in `block (2 levels) in start'
2022-02-04 11:21:36 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2022-02-04 11:21:36 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] bad chunk is moved to /tmp/fluent/backup/worker0/datadog_agent_jfrog_artifactory/5d72f6e9d141501336af2291c395f39d.log
2022-02-04 11:21:40 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] got unrecoverable error in primary and no secondary error_class=NoMethodError error="undefined method `error' for nil:NilClass"
2022-02-04 11:21:40 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-datadog-0.14.0/lib/fluent/plugin/out_datadog.rb:144:in `rescue in write'
2022-02-04 11:21:40 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-datadog-0.14.0/lib/fluent/plugin/out_datadog.rb:128:in `write'
2022-02-04 11:21:40 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin/output.rb:1179:in `try_flush'
2022-02-04 11:21:40 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin/output.rb:1491:in `flush_thread_run'
2022-02-04 11:21:40 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin/output.rb:499:in `block (2 levels) in start'
2022-02-04 11:21:40 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2022-02-04 11:21:40 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] bad chunk is moved to /tmp/fluent/backup/worker0/datadog_agent_jfrog_artifactory/5d72f70af706f0cedcc464712c8e283c.log

remeh commented 2 years ago

datadog_agent_jfrog_artifactory is just the ID

Ah ok, got it.

2022-02-04 09:32:09 +0000 [info]: #0 [access_service_tail] following tail of /opt/jfrog/artifactory/var/log/access-service.log
2022-02-04 09:32:09 +0000 [info]: #0 fluentd worker is now running worker=0
2022-02-04 09:32:30 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired
2022-02-04 09:32:30 +0000 [warn]: #0 [datadog_agent_jfrog_artifactory] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=60.02107282727957 slow_flush_log_threshold=20.0 plugin_id="datadog_agent_jfrog_artifactory"
2022-02-04 09:32:30 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired
2022-02-04 09:32:30 +0000 [error]: #0 [datadog_agent_jfrog_artifactory] Uncaught processing exception in datadog forwarder execution expired

This "execution expired" error message suggests to me that the HTTP call is timing out.

Are you sure your node can connect to the Datadog endpoint? Does this node need to use a proxy to reach HTTP endpoints? I'm curious what the output of curl https://http-intake.logs.datadoghq.com is on this node.
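As a complement to the curl check, a direct TCP connection attempt with an explicit timeout can show whether the node reaches the intake host at all. The sketch below is only illustrative: the helper name tcp_reachable and the returned symbols are hypothetical, and it deliberately does not go through any proxy, so a timeout here on a node that needs a proxy would be expected rather than surprising.

```ruby
require "socket"

# Hypothetical helper: try to open a TCP connection and report the outcome.
# Returns :ok, :timeout, :refused, :dns_error, or :unreachable.
def tcp_reachable(host, port, timeout_seconds: 5)
  # Socket.tcp returns the value of the block when the connection succeeds.
  Socket.tcp(host, port, connect_timeout: timeout_seconds) { :ok }
rescue Errno::ETIMEDOUT
  :timeout
rescue Errno::ECONNREFUSED
  :refused
rescue SocketError
  # Name resolution failed.
  :dns_error
rescue SystemCallError
  # Any other OS-level failure, for example no route to host.
  :unreachable
end
```

Calling tcp_reachable("http-intake.logs.datadoghq.com", 443) from the affected node and getting :timeout would be consistent with the "execution expired" and Net::OpenTimeout messages in the logs.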

I'll try your configuration locally and see if I reproduce.

andrescolodrero commented 2 years ago

OK @remeh, I have tried running it in verbose mode, but I couldn't see where it is trying to connect.

Is td-agent always talking to these endpoints?
DD_DEFAULT_HTTP_ENDPOINT = "http-intake.logs.datadoghq.com"
DD_DEFAULT_TCP_ENDPOINT = "intake.logs.datadoghq.com"

Datadog updates its endpoints very often. If this is using the latest endpoint, then yes, it could be blocked by the proxy.

remeh commented 2 years ago

That could be it then: the domains have not changed in a couple of years, but the HTTP route is different since 0.14.0. From this version on, the plugin uses https://http-intake.logs.datadoghq.com:443/api/v2/logs (whereas it would have hit https://http-intake.logs.datadoghq.com:443/v1/input/#{api_key} before).

The best option would be to allow these requests through your proxy. Alternatively, you can set the configuration field force_v1_routes to true to use the old endpoint, but I would not recommend it.
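To make the difference between the two routes concrete, here is a small illustrative sketch. It is not the plugin's actual code: the function intake_url and its parameters are hypothetical, and it only mirrors the two URL shapes quoted above and the effect of force_v1_routes.

```ruby
# Illustrative only: reproduces the two URL shapes described in this thread.
# The function name and keyword arguments are hypothetical.
def intake_url(api_key:, force_v1_routes: false,
               host: "http-intake.logs.datadoghq.com", port: 443)
  base = "https://#{host}:#{port}"
  if force_v1_routes
    # Old route (before 0.14.0): the API key is part of the path.
    "#{base}/v1/input/#{api_key}"
  else
    # Route used since 0.14.0.
    "#{base}/api/v2/logs"
  end
end

puts intake_url(api_key: "xxxxxx")
puts intake_url(api_key: "xxxxxx", force_v1_routes: true)
```

A proxy rule that matches on the request path would therefore need to allow /api/v2/logs in addition to, or instead of, /v1/input/.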

remeh commented 2 years ago

Closing because of no activity. Please re-open if that makes sense 👍