Open kokoichi206 opened 1 year ago
The mod_* module that the integration asks for should already be enabled by default.
When I first saw apache.d mentioned, I assumed it was something on the Apache side, but it refers to a directory inside the datadog-agent folder.
# Here!
cd /etc/datadog-agent/conf.d/apache.d
sudo cp -p conf.yaml.example conf.yaml
sudo systemctl restart datadog-agent.service
https://us3.datadoghq.com/logs/onboarding/detected
/etc/datadog-agent$ sudo vim datadog.yaml
sudo systemctl restart datadog-agent.service
If the logs do not show up at the URL above, try changing the permissions on the log files.
sudo chmod -R o+rX /var/log/apache2/
The agent log shows permission errors like the ones below.
sudo tail -f /var/log/datadog/agent.log
$ grep permission /var/log/datadog/agent.log
2023-01-25 04:00:32 UTC | CORE | WARN | (pkg/logs/internal/launchers/file/launcher.go:241 in launchTailers) | Could not collect files: cannot read file /var/log/apache2/access.log: stat /var/log/apache2/access.log: permission denied
2023-01-25 04:00:32 UTC | CORE | WARN | (pkg/logs/internal/launchers/file/launcher.go:241 in launchTailers) | Could not collect files: cannot read file /var/log/apache2/error.log: stat /var/log/apache2/error.log: permission denied
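To see at a glance which files the agent cannot read, the warnings above can be reduced to a list of paths. This is only a minimal sketch: it assumes the warning text keeps the "cannot read file <path>:" shape shown above, and unreadable_paths is a made-up helper name, not part of the Datadog tooling.

```shell
#!/bin/sh
# Print the unique file paths that the agent reported as unreadable.
# Reads agent log text on stdin and keeps only the <path> from
# "cannot read file <path>: ..." warnings.
unreadable_paths() {
    sed -n 's/.*cannot read file \([^:]*\):.*/\1/p' | sort -u
}

# Typical use (log path taken from the notes above):
#   unreadable_paths < /var/log/datadog/agent.log
```

Each path it prints is a candidate for the permission change described above.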
Index: main storage, db. Source: ddsource
API reference: https://docs.datadoghq.com/ja/api/latest/
Organization Settings > ACCESS > API Keys
https://github.com/opentracing/specification/blob/master/specification.md#the-opentracing-api
https://github.com/opentracing/specification/blob/master/specification.md#the-opentracing-data-model
I noticed that the agent log was being written to at a very high rate.
2023-01-25 23:30:13 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:125 in LogMessage) | disk:67cc0574430a16ba | (disk.py:135) | Unable to get disk metrics for /sys/kernel/debug/tracing: [Errno 13] Permission denied: '/sys/kernel/debug/tracing'. You can exclude this mountpoint in the settings if it is invalid.
2023-01-25 23:30:28 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:125 in LogMessage) | disk:67cc0574430a16ba | (disk.py:135) | Unable to get disk metrics for /run/user/1000/gvfs: [Errno 13] Permission denied: '/run/user/1000/gvfs'. You can exclude this mountpoint in the settings if it is invalid.
2023-01-25 23:30:28 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:125 in LogMessage) | disk:67cc0574430a16ba | (disk.py:135) | Unable to get disk metrics for /sys/kernel/debug/tracing: [Errno 13] Permission denied: '/sys/kernel/debug/tracing'. You can exclude this mountpoint in the settings if it is invalid.
2023-01-25 23:30:29 UTC | CORE | WARN | (pkg/collector/corechecks/containers/docker/check.go:220 in runDockerCustom) | Unable to fetch tags for container: sha256:d6c21fcb8fc9611b222ec23b881e75b0b6f584389e57717a69c96d382bd52c69, err: invalid image name (is a sha256)
2023-01-25 23:30:29 UTC | CORE | WARN | (pkg/collector/corechecks/containers/docker/check.go:220 in runDockerCustom) | Unable to fetch tags for container: sha256:8783247e0de113f13e0feb6a338e34ef5b8423c756e337009d88c3b2423c5744, err: invalid image name (is a sha256)
2023-01-25 23:30:29 UTC | CORE | WARN | (pkg/collector/corechecks/containers/docker/check.go:220 in runDockerCustom) | Unable to fetch tags for container: sha256:54932d1e2b576170944902535c58a16fed7a2a4d9aaabf7fceae5fd39619b750, err: invalid image name (is a sha256)
2023-01-25 23:30:29 UTC | CORE | WARN | (pkg/collector/corechecks/containers/docker/check.go:220 in runDockerCustom) | Unable to fetch tags for container: sha256:18e13cfe20ac90eab3ac026ae7cc6120eb278ab50dab9c1f3852a4deaa037aa6, err: invalid image name (is a sha256)
These look related to Docker containers, so I'll try stopping them once.
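The log permission problem can be handled by editing /etc/logrotate.d/apache2, so that freshly rotated log files are created with readable permissions.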
$ sudo cat /etc/logrotate.d/apache2
/var/log/apache2/*.log {
    daily
    missingok
    rotate 14
    compress
    delaycompress
    notifempty
    create 640 root adm
    sharedscripts
    prerotate
        if [ -d /etc/logrotate.d/httpd-prerotate ]; then
            run-parts /etc/logrotate.d/httpd-prerotate
        fi
    endscript
    postrotate
        if pgrep -f ^/usr/sbin/apache2 > /dev/null; then
            invoke-rc.d apache2 reload 2>&1 | logger -t apache2.logrotate
        fi
    endscript
}
Change the line
create 640 root adm
to
create 644 www-data www-data
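The same change to the create directive can be previewed without touching the real file. This is a rough sketch, assuming the directive is spelled exactly "create 640 root adm" as in the listing above; rewrite_create_directive is a hypothetical helper name. It writes the modified config to stdout so the result can be reviewed before anything under /etc is replaced.

```shell
#!/bin/sh
# Print the logrotate config given as $1 with the "create" directive rewritten.
# Leading indentation is preserved; the input file itself is not modified.
rewrite_create_directive() {
    sed 's/^\([[:space:]]*\)create 640 root adm[[:space:]]*$/\1create 644 www-data www-data/' "$1"
}

# Typical use: write a candidate file, then compare it with the original.
#   rewrite_create_directive /etc/logrotate.d/apache2 > /tmp/apache2.logrotate
#   diff /etc/logrotate.d/apache2 /tmp/apache2.logrotate
```

Printing to stdout instead of editing in place avoids leaving a backup copy inside /etc/logrotate.d.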
# If necessary, prepend sudo -u dd-agent to the install command.
sudo -u dd-agent datadog-agent integration install -t datadog-go-pprof-scraper==1.0.2
/etc/datadog-agent/conf.d/go_pprof_scraper.d$ cat conf.yaml
## All options defined here are available to all instances.
#
init_config:

    ## @param service - string - optional
    ## Attach the tag `service:<SERVICE>` to every metric, event, and service check emitted by this integration.
    ##
    ## Additionally, this sets the default `service` for every log source.
    #
    # service: <SERVICE>

## Every instance is scheduled independently of the others.
#
instances:

  -
    ## @param env - string - optional - default: prod
    ## env tag to apply to uploaded profiles ("env:<ENV>")
    #
    # env: prod

    ## @param pprof_url - string - required
    ## URL of the /debug/pprof endpoint to collect
    #
    pprof_url: http://myservice:1234/debug/pprof/

    ## @param duration - integer - optional - default: 60
    ## Duration of profiles, in seconds
    #
    # duration: 30

    ## @param profiles - list of strings - optional
    ## List of profiles to collect. Valid options are "cpu", "heap", "mutex", "block", and "goroutine"
    #
    # profiles:
    #   - cpu
    #   - heap

    ## @param cumulative - boolean - optional - default: true
    ## Whether to collect heap, mutex, or block profiles as cumulative profiles
    ## since the program started. If false, requests those profiles over the
    ## period specified by "duration". The profiles will hold the difference
    ## between the samples at the beginning and end of profiling.
    ##
    ## For the heap profile, the in-use (also known as "live heap") samples
    ## may be negative if "cumulative" is false. This does not display
    ## accurately in the profile UI, so Datadog does not recommend setting
    ## "cumulative" to false.
    ##
    ## In order to use profile aggregation, "cumulative" must set to false.
    ## Note that setting "cumulative" to false will cause the profiled
    ## application to use more memory in order to compute the profiles.
    #
    # cumulative: true

    ## @param tags - list of strings - optional
    ## A list of tags to attach to every metric and service check emitted by this instance.
    ##
    ## Learn more about tagging at https://docs.datadoghq.com/tagging
    #
    # tags:
    #   - <KEY_1>:<VALUE_1>
    #   - <KEY_2>:<VALUE_2>

    ## @param service - string - required
    ## Service name to tag on every profile uploaded for this instance.
    ##
    ## Overrides any `service` defined in the `init_config` section.
    #
    service: default-go-service

    ## @param min_collection_interval - number - optional - default: 1
    ## This changes the collection interval of the check. For more information, see:
    ## https://docs.datadoghq.com/developers/write_agent_check/#collection-interval
    ##
    ## This is a long-running check, and is intended to be started again as
    ## soon as it finishes. Setting this to a larger value will cause longer
    ## pauses between iterations of this check.
    ##
    ## If omitted, will default to 15 seconds.
    #
    min_collection_interval: 1

    ## @param empty_default_hostname - boolean - optional - default: false
    ## This forces the check to send metrics with no hostname.
    ##
    ## This is useful for cluster-level checks.
    #
    # empty_default_hostname: false

    ## @param metric_patterns - mapping - optional
    ## A mapping of metrics to include or exclude, with each entry being a regular expression.
    ##
    ## Metrics defined in `exclude` will take precedence in case of overlap.
    #
    # metric_patterns:
    #   include:
    #   - <INCLUDE_REGEX>
    #   exclude:
    #   - <EXCLUDE_REGEX>
A very long error showed up in agent.log.
2023-01-26 13:26:03 UTC | CORE | ERROR | (pkg/collector/worker/check_logger.go:69 in Error) |
check:go_pprof_scraper | Error running check: [{"message": "HTTPConnectionPool(host='myservice', port=1234):
Max retries exceeded with url: /debug/pprof/heap (Caused by NewConnectionError('<urllib3.connection.HTTPConnection
object at 0xffff4815d4c0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))",
"traceback": "Traceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-
packages/urllib3/connection.py\", line 174, in _new_conn\n conn = connection.create_connection(\n File \"/opt/datadog-
agent/embedded/lib/python3.8/site-packages/urllib3/util/connection.py\", line 72, in create_connection\n for res in
socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):\n File \"/opt/datadog-
agent/embedded/lib/python3.8/socket.py\", line 918, in getaddrinfo\n for res in _socket.getaddrinfo(host, port, family,
type, proto, flags):\nsocket.gaierror: [Errno -3] Temporary failure in name resolution\n\nDuring handling of the above
exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/opt/datadog-
agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py\", line 703, in urlopen\n httplib_response =
self._make_request(\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py\", line
398, in _make_request\n conn.request(method, url, **httplib_request_kw)\n File \"/opt/datadog-
agent/embedded/lib/python3.8/site-packages/urllib3/connection.py\", line 239, in request\n super(HTTPConnection,
self).request(method, url, body=body, headers=headers)\n File \"/opt/datadog-
agent/embedded/lib/python3.8/http/client.py\", line 1256, in request\n self._send_request(method, url, body, headers, encode_chunked)\n File \"/opt/datadog-agent/embedded/lib/python3.8/http/client.py\", line 1302, in _send_request\n self.endheaders(body, encode_chunked=encode_chunked)\n File \"/opt/datadog-agent/embedded/lib/python3.8/http/client.py\", line 1251, in endheaders\n self._send_output(message_body, encode_chunked=encode_chunked)\n File \"/opt/datadog-agent/embedded/lib/python3.8/http/client.py\", line 1011, in _send_output\n self.send(msg)\n File \"/opt/datadog-agent/embedded/lib/python3.8/http/client.py\", line 951, in send\n self.connect()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connection.py\", line 205, in connect\n conn = self._new_conn()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connection.py\", line 186, in _new_conn\n raise NewConnectionError(\nurllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at
0xffff4815d4c0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py\", line 489, in send\n resp = conn.urlopen(\n File
\"/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py\", line 787, in urlopen\n retries = retries.increment(\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/retry.py\", line 592, in increment\n raise MaxRetryError(_pool, url, error or ResponseError(cause))\nurllib3.exceptions.MaxRetryError:
HTTPConnectionPool(host='myservice', port=1234): Max retries exceeded with url: /debug/pprof/heap (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xffff4815d4c0>: Failed to establish a new
connection: [Errno -3] Temporary failure in name resolution'))\n\nDuring handling of the above exception, another
exception occurred:\n\nTraceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-
packages/datadog_checks/base/checks/base.py\", line 1122, in run\n self.check(instance)\n File \"/opt/datadog-
agent/embedded/lib/python3.8/site-packages/datadog_checks/go_pprof_scraper/check.py\", line 139, in check\n
profiles = list(executor.map(self._get_profile, self.profiles))\n File \"/opt/datadog-
agent/embedded/lib/python3.8/concurrent/futures/_base.py\", line 619, in result_iterator\n yield fs.pop().result()\n File
\"/opt/datadog-agent/embedded/lib/python3.8/concurrent/futures/_base.py\", line 444, in result\n return
self.__get_result()\n File \"/opt/datadog-agent/embedded/lib/python3.8/concurrent/futures/_base.py\", line 389, in
__get_result\n raise self._exception\n File \"/opt/datadog-agent/embedded/lib/python3.8/concurrent/futures/thread.py\",
line 57, in run\n result = self.fn(*self.args, **self.kwargs)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-
packages/datadog_checks/go_pprof_scraper/check.py\", line 108, in _get_profile\n response = self.http.get(\n File
\"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py\", line 355, in get\n
return self._request('get', url, options)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-
packages/datadog_checks/base/utils/http.py\", line 419, in _request\n response =
self.make_request_aia_chasing(request_method, method, url, new_options, persist)\n File \"/opt/datadog-
agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py\", line 425, in
make_request_aia_chasing\n response = request_method(url, **new_options)\n File \"/opt/datadog-
agent/embedded/lib/python3.8/site-packages/requests/api.py\", line 73, in get\n return request(\"get\", url,
params=params, **kwargs)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py\", line 59,
in request\n return session.request(method=method, url=url, **kwargs)\n File \"/opt/datadog-
agent/embedded/lib/python3.8/site-packages/requests/sessions.py\", line 587, in request\n resp = self.send(prep,
**send_kwargs)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py\", line 701, in
send\n r = adapter.send(request, **kwargs)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-
packages/requests/adapters.py\", line 565, in send\n raise ConnectionError(e,
request=request)\nrequests.exceptions.ConnectionError: HTTPConnectionPool(host='myservice', port=1234): Max retries
exceeded with url: /debug/pprof/heap (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at
0xffff4815d4c0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))\n"}]
2023-01-26 13:43:41 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:125 in LogMessage) |
disk:67cc0574430a16ba | (disk.py:135) | Unable to get disk metrics for /sys/kernel/debug/tracing: [Errno 13] Permission
denied: '/sys/kernel/debug/tracing'. You can exclude this mountpoint in the settings if it is invalid.
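The long go_pprof_scraper error above boils down to "Temporary failure in name resolution" for the host myservice, which is the host in the pprof_url value of the conf.yaml shown earlier and looks like an example placeholder. A small sketch for checking this by hand follows. It assumes getent is available (glibc-based Linux); url_host and check_resolves are hypothetical helper names, and url_host is deliberately simple (no IPv6 literals, no credentials in the URL).

```shell
#!/bin/sh
# Extract the host part from an http(s) URL such as the pprof_url setting.
# Strips the scheme, then everything from the first ":" or "/" onward.
url_host() {
    printf '%s\n' "$1" | sed -e 's|^[a-zA-Z][a-zA-Z0-9+.-]*://||' -e 's|[/:].*$||'
}

# Report whether a host name resolves on this machine.
check_resolves() {
    if getent hosts "$1" > /dev/null; then
        echo "$1 resolves"
    else
        echo "$1 does not resolve"
    fi
}

# Typical use (URL taken from the conf.yaml above):
#   check_resolves "$(url_host 'http://myservice:1234/debug/pprof/')"
```

If the host does not resolve, pointing pprof_url at the real address of the Go service is the likely fix.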
I had been looking at the Logs section the whole time, but this may actually belong under APM...
https://docs.datadoghq.com/ja/logs/log_collection/go/
https://www.datadoghq.com/ja/blog/go-logging/#write-your-logs-to-a-file
/usr/log/api$ ls -la
total 8
drwxr-xr-x 2 root root 4096 Jan 27 03:30 .
drwxr-xr-x 3 root root 4096 Jan 27 03:29 ..
-rw-r--r-- 1 ubuntu ubuntu 0 Jan 27 03:30 test.log
What does the d in conf.d stand for?
For names like conf and confd, the d could be short for daemon: https://teratail.com/questions/2920
So it is probably either daemon or directory.
I think the logs were sent successfully.
sudo vim /etc/datadog-agent/conf.d/go.d/conf.yaml
sudo systemctl start datadog-agent
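These notes do not show what went into go.d/conf.yaml. The sketch below generates a file-tailing log configuration of the kind described on the Datadog Go log collection page linked above. Everything specific in it is an assumption: write_go_log_conf is a made-up helper, my-go-api is a placeholder service name, and the log path is the test.log from the directory listing above.

```shell
#!/bin/sh
# Write a minimal Datadog log-collection config for a Go application.
#   $1 = output file, $2 = path of the log file to tail, $3 = service name
write_go_log_conf() {
    cat > "$1" <<EOF
logs:
  - type: file
    path: "$2"
    service: "$3"
    source: go
EOF
}

# Typical use: generate into a scratch file, review it, then copy it into place.
#   write_go_log_conf /tmp/go-conf.yaml /usr/log/api/test.log my-go-api
#   sudo cp /tmp/go-conf.yaml /etc/datadog-agent/conf.d/go.d/conf.yaml
```

The agent also has to be able to read the tailed file, which is the same permission concern as with the Apache logs.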
About 30,000 logs have been ingested in roughly one week.
For now, I'll try keeping these requests out of the Apache log.
# For now, exclude requests whose path starts with /server-status
$ sudo vim /etc/apache2/apache2.conf
LogFormat "%h %l %u %t \"%r\" %>s %O" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
# https://httpd.apache.org/docs/2.2/env.html#page-header
SetEnvIf Request_URI "^/server-status" dontlog
ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log common env=!dontlog
$ sudo systemctl restart apache2
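To confirm that /server-status requests really account for the volume, and later that they have stopped appearing, the access log can be counted directly. This is a minimal sketch that assumes the common or combined LogFormat shown above, where the request path is the seventh whitespace-separated field; count_server_status is a hypothetical helper name.

```shell
#!/bin/sh
# Count access-log lines (read from stdin) whose request path starts with
# /server-status. Prints 0 when there are none.
count_server_status() {
    awk '$7 ~ /^\/server-status/ { n++ } END { print n + 0 }'
}

# Typical use (log path as used elsewhere in these notes):
#   sudo cat /var/log/apache2/access.log | count_server_status
```

Running it before and after the restart shows whether the env=!dontlog condition took effect. Lines already in the log stay there, so the count should simply stop growing.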
How to Install Agent
https://us3.datadoghq.com/signup/agent#ubuntu
After signing up, the following instructions are shown, so I'll run them on the Raspberry Pi.