This is largely copy and paste, but I've moved the metrics code to a dedicated file and implemented a Resolver interface so the lookup process can be mocked properly.
For additional information, see the commits.
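To illustrate the idea behind the Resolver interface, here is a minimal sketch. The method set, the fakeResolver type, and the resolves helper are assumptions made for illustration; the actual interface in the commits may look different.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
)

// Resolver abstracts the DNS lookup so tests can swap in a fake.
// The method set is an assumption; the real interface may differ.
type Resolver interface {
	LookupHost(ctx context.Context, host string) ([]string, error)
}

// Compile-time check: *net.Resolver from the standard library satisfies it.
var _ Resolver = (*net.Resolver)(nil)

// fakeResolver is a hypothetical test double that returns canned answers.
type fakeResolver struct {
	addrs map[string][]string
}

// LookupHost returns the canned addresses for host, or an error if none exist.
func (f fakeResolver) LookupHost(_ context.Context, host string) ([]string, error) {
	if a, ok := f.addrs[host]; ok {
		return a, nil
	}
	return nil, errors.New("no such host")
}

// resolves reports whether host can be looked up through r.
func resolves(r Resolver, host string) bool {
	_, err := r.LookupHost(context.Background(), host)
	return err == nil
}

func main() {
	r := fakeResolver{addrs: map[string][]string{"www.example.com": {"192.0.2.1"}}}
	fmt.Println(resolves(r, "www.example.com"))
	fmt.Println(resolves(r, "missing.example.com"))
}
```

Because the check only depends on the interface, unit tests can cover both success and error paths without touching the network.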
Tests done
I've added unit tests covering many success and error cases.
[x] Unit tests succeeded
[x] E2E tests succeeded
Manual e2e test
Logs
$ go run main.go run --config .tmp/start-config.yaml
Using config file: .tmp/start-config.yaml
{"time":"2024-01-25T16:07:57.730409738+01:00","level":"INFO","source":{"function":"github.com/caas-team/sparrow/cmd.NewCmdRun.run.func1","file":"/home/installadm/dev/github/sparrow/cmd/run.go","line":82},"msg":"Running sparrow"}
{"time":"2024-01-25T16:07:57.730592738+01:00","level":"INFO","source":{"function":"github.com/caas-team/sparrow/pkg/sparrow.(*Sparrow).api.func1","file":"/home/installadm/dev/github/sparrow/pkg/sparrow/api.go","line":81},"msg":"Serving Api","addr":":8080"}
{"time":"2024-01-25T16:07:57.730575228+01:00","level":"INFO","source":{"function":"github.com/caas-team/sparrow/pkg/sparrow/targets.(*gitlabTargetManager).Reconcile","file":"/home/installadm/dev/github/sparrow/pkg/sparrow/targets/gitlab.go","line":80},"msg":"Starting global gitlabTargetManager reconciler"}
{"time":"2024-01-25T16:07:57.730584526+01:00","level":"INFO","source":{"function":"github.com/caas-team/sparrow/pkg/config.(*FileLoader).Run","file":"/home/installadm/dev/github/sparrow/pkg/config/file.go","line":48},"msg":"Reading config from file","file":"./.tmp/run-config.yaml"}
{"time":"2024-01-25T16:07:57.730894566+01:00","level":"WARN","source":{"function":"github.com/caas-team/sparrow/pkg/sparrow.(*Sparrow).registerCheck","file":"/home/installadm/dev/github/sparrow/pkg/sparrow/run.go","line":232},"msg":"Check is not registered","name":"health"}
{"time":"2024-01-25T16:07:57.730911446+01:00","level":"WARN","source":{"function":"github.com/caas-team/sparrow/pkg/sparrow.(*Sparrow).registerCheck","file":"/home/installadm/dev/github/sparrow/pkg/sparrow/run.go","line":232},"msg":"Check is not registered","name":"latency"}
{"time":"2024-01-25T16:07:57.73101715+01:00","level":"INFO","source":{"function":"github.com/caas-team/sparrow/pkg/checks/dns.(*DNS).Run","file":"/home/installadm/dev/github/sparrow/pkg/checks/dns/dns.go","line":86},"msg":"Starting dns check","interval":"20s"}
{"time":"2024-01-25T16:08:17.81579971+01:00","level":"ERROR","source":{"function":"github.com/caas-team/sparrow/pkg/checks/dns.getDNS","file":"/home/installadm/dev/github/sparrow/pkg/checks/dns/dns.go","line":242},"msg":"Error while looking up address","address":"www.google.com","error":"lookup www.google.com on 127.0.0.53:53: no such host"}
{"time":"2024-01-25T16:08:17.815863855+01:00","level":"WARN","source":{"function":"github.com/caas-team/sparrow/pkg/checks/dns.(*DNS).check.Retry.func3","file":"/home/installadm/dev/github/sparrow/internal/helper/retry.go","line":49},"msg":"Effector call failed, retrying in 1s"}
{"time":"2024-01-25T16:08:18.884199559+01:00","level":"ERROR","source":{"function":"github.com/caas-team/sparrow/pkg/checks/dns.getDNS","file":"/home/installadm/dev/github/sparrow/pkg/checks/dns/dns.go","line":242},"msg":"Error while looking up address","address":"www.google.com","error":"lookup www.google.com on 127.0.0.53:53: no such host"}
{"time":"2024-01-25T16:08:18.884392831+01:00","level":"WARN","source":{"function":"github.com/caas-team/sparrow/pkg/checks/dns.(*DNS).check.Retry.func3","file":"/home/installadm/dev/github/sparrow/internal/helper/retry.go","line":49},"msg":"Effector call failed, retrying in 2s"}
{"time":"2024-01-25T16:08:20.954973345+01:00","level":"ERROR","source":{"function":"github.com/caas-team/sparrow/pkg/checks/dns.getDNS","file":"/home/installadm/dev/github/sparrow/pkg/checks/dns/dns.go","line":242},"msg":"Error while looking up address","address":"www.google.com","error":"lookup www.google.com on 127.0.0.53:53: no such host"}
{"time":"2024-01-25T16:08:20.955029847+01:00","level":"WARN","source":{"function":"github.com/caas-team/sparrow/pkg/checks/dns.(*DNS).check.Retry.func3","file":"/home/installadm/dev/github/sparrow/internal/helper/retry.go","line":49},"msg":"Effector call failed, retrying in 4s"}
{"time":"2024-01-25T16:08:25.026287794+01:00","level":"ERROR","source":{"function":"github.com/caas-team/sparrow/pkg/checks/dns.getDNS","file":"/home/installadm/dev/github/sparrow/pkg/checks/dns/dns.go","line":242},"msg":"Error while looking up address","address":"www.google.com","error":"lookup www.google.com on 127.0.0.53:53: no such host"}
{"time":"2024-01-25T16:08:25.026327665+01:00","level":"WARN","source":{"function":"github.com/caas-team/sparrow/pkg/checks/dns.(*DNS).check.func2","file":"/home/installadm/dev/github/sparrow/pkg/checks/dns/dns.go","line":208},"msg":"Error while looking up address","target":"www.google.com","error":"lookup www.google.com on 127.0.0.53:53: no such host"}
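The retry delays in the log above (1s, 2s, 4s) follow a doubling backoff. A minimal sketch of such a helper is shown below; the Effector signature, the Retry parameters, and the failingAttempts helper are assumptions for illustration and not necessarily what internal/helper/retry.go contains.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Effector is the operation to retry. The name mirrors the log message;
// the signature is an assumption.
type Effector func(ctx context.Context) error

// Retry calls effector up to retries+1 times and doubles the delay
// after each failed attempt (base, 2*base, 4*base, ...).
func Retry(effector Effector, retries int, base time.Duration) Effector {
	return func(ctx context.Context) error {
		for attempt := 0; ; attempt++ {
			err := effector(ctx)
			if err == nil || attempt >= retries {
				return err
			}
			delay := base << attempt
			select {
			case <-time.After(delay):
			case <-ctx.Done():
				return ctx.Err()
			}
		}
	}
}

// failingAttempts runs an always-failing effector and returns how often
// it was called. It uses a 1ms base delay so the demo finishes quickly.
func failingAttempts(retries int) int {
	calls := 0
	always := func(context.Context) error {
		calls++
		return errors.New("lookup failed")
	}
	_ = Retry(always, retries, time.Millisecond)(context.Background())
	return calls
}

func main() {
	// Three retries mean four attempts in total, as in the log above.
	fmt.Println(failingAttempts(3))
}
```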
Exposed Metrics (redacted IPs)
# HELP sparrow_dns_duration_seconds Duration of DNS resolution attempts in seconds.
# TYPE sparrow_dns_duration_seconds gauge
sparrow_dns_duration_seconds{target="10.x.x.x"} 0.000644266
sparrow_dns_duration_seconds{target="www.google.com"} 0
sparrow_dns_duration_seconds{target="www.t-systems.com"} 0.018097008
sparrow_dns_duration_seconds{target="www.telekom.de"} 0.017367603
# HELP sparrow_dns_response_time_seconds Histogram of response times for DNS checks in seconds.
# TYPE sparrow_dns_response_time_seconds histogram
sparrow_dns_response_time_seconds_bucket{target="10.x.x.x",le="0.005"} 1
sparrow_dns_response_time_seconds_bucket{target="10.x.x.x",le="0.01"} 1
sparrow_dns_response_time_seconds_bucket{target="10.x.x.x",le="0.025"} 1
sparrow_dns_response_time_seconds_bucket{target="10.x.x.x",le="0.05"} 1
sparrow_dns_response_time_seconds_bucket{target="10.x.x.x",le="0.1"} 1
sparrow_dns_response_time_seconds_bucket{target="10.x.x.x",le="0.25"} 1
sparrow_dns_response_time_seconds_bucket{target="10.x.x.x",le="0.5"} 1
sparrow_dns_response_time_seconds_bucket{target="10.x.x.x",le="1"} 1
sparrow_dns_response_time_seconds_bucket{target="10.x.x.x",le="2.5"} 1
sparrow_dns_response_time_seconds_bucket{target="10.x.x.x",le="5"} 1
sparrow_dns_response_time_seconds_bucket{target="10.x.x.x",le="10"} 1
sparrow_dns_response_time_seconds_bucket{target="10.x.x.x",le="+Inf"} 1
sparrow_dns_response_time_seconds_sum{target="10.x.x.x"} 0.000644266
sparrow_dns_response_time_seconds_count{target="10.x.x.x"} 1
sparrow_dns_response_time_seconds_bucket{target="www.google.com",le="0.005"} 1
sparrow_dns_response_time_seconds_bucket{target="www.google.com",le="0.01"} 1
sparrow_dns_response_time_seconds_bucket{target="www.google.com",le="0.025"} 1
sparrow_dns_response_time_seconds_bucket{target="www.google.com",le="0.05"} 1
sparrow_dns_response_time_seconds_bucket{target="www.google.com",le="0.1"} 1
sparrow_dns_response_time_seconds_bucket{target="www.google.com",le="0.25"} 1
sparrow_dns_response_time_seconds_bucket{target="www.google.com",le="0.5"} 1
sparrow_dns_response_time_seconds_bucket{target="www.google.com",le="1"} 1
sparrow_dns_response_time_seconds_bucket{target="www.google.com",le="2.5"} 1
sparrow_dns_response_time_seconds_bucket{target="www.google.com",le="5"} 1
sparrow_dns_response_time_seconds_bucket{target="www.google.com",le="10"} 1
sparrow_dns_response_time_seconds_bucket{target="www.google.com",le="+Inf"} 1
sparrow_dns_response_time_seconds_sum{target="www.google.com"} 0
sparrow_dns_response_time_seconds_count{target="www.google.com"} 1
sparrow_dns_response_time_seconds_bucket{target="www.t-systems.com",le="0.005"} 0
sparrow_dns_response_time_seconds_bucket{target="www.t-systems.com",le="0.01"} 0
sparrow_dns_response_time_seconds_bucket{target="www.t-systems.com",le="0.025"} 1
sparrow_dns_response_time_seconds_bucket{target="www.t-systems.com",le="0.05"} 1
sparrow_dns_response_time_seconds_bucket{target="www.t-systems.com",le="0.1"} 1
sparrow_dns_response_time_seconds_bucket{target="www.t-systems.com",le="0.25"} 1
sparrow_dns_response_time_seconds_bucket{target="www.t-systems.com",le="0.5"} 1
sparrow_dns_response_time_seconds_bucket{target="www.t-systems.com",le="1"} 1
sparrow_dns_response_time_seconds_bucket{target="www.t-systems.com",le="2.5"} 1
sparrow_dns_response_time_seconds_bucket{target="www.t-systems.com",le="5"} 1
sparrow_dns_response_time_seconds_bucket{target="www.t-systems.com",le="10"} 1
sparrow_dns_response_time_seconds_bucket{target="www.t-systems.com",le="+Inf"} 1
sparrow_dns_response_time_seconds_sum{target="www.t-systems.com"} 0.018097008
sparrow_dns_response_time_seconds_count{target="www.t-systems.com"} 1
sparrow_dns_response_time_seconds_bucket{target="www.telekom.de",le="0.005"} 0
sparrow_dns_response_time_seconds_bucket{target="www.telekom.de",le="0.01"} 0
sparrow_dns_response_time_seconds_bucket{target="www.telekom.de",le="0.025"} 1
sparrow_dns_response_time_seconds_bucket{target="www.telekom.de",le="0.05"} 1
sparrow_dns_response_time_seconds_bucket{target="www.telekom.de",le="0.1"} 1
sparrow_dns_response_time_seconds_bucket{target="www.telekom.de",le="0.25"} 1
sparrow_dns_response_time_seconds_bucket{target="www.telekom.de",le="0.5"} 1
sparrow_dns_response_time_seconds_bucket{target="www.telekom.de",le="1"} 1
sparrow_dns_response_time_seconds_bucket{target="www.telekom.de",le="2.5"} 1
sparrow_dns_response_time_seconds_bucket{target="www.telekom.de",le="5"} 1
sparrow_dns_response_time_seconds_bucket{target="www.telekom.de",le="10"} 1
sparrow_dns_response_time_seconds_bucket{target="www.telekom.de",le="+Inf"} 1
sparrow_dns_response_time_seconds_sum{target="www.telekom.de"} 0.017367603
sparrow_dns_response_time_seconds_count{target="www.telekom.de"} 1
# HELP sparrow_dns_status Specifies if the target can be resolved.
# TYPE sparrow_dns_status gauge
sparrow_dns_status{target="10.x.x.x"} 1
sparrow_dns_status{target="www.google.com"} 0
sparrow_dns_status{target="www.t-systems.com"} 1
sparrow_dns_status{target="www.telekom.de"} 1
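Prometheus histogram buckets are cumulative: an observation is counted in every bucket whose upper bound (le) is greater than or equal to the observed value. The sketch below reproduces that counting with the bucket bounds visible in the output above; cumulativeCounts is a hypothetical helper written only for this illustration and is not part of the check.

```go
package main

import "fmt"

// buckets are the upper bounds (le) visible in the metrics output.
var buckets = []float64{0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10}

// cumulativeCounts returns, per bucket, how many observations are
// less than or equal to that bucket's upper bound.
func cumulativeCounts(observations []float64) []int {
	counts := make([]int, len(buckets))
	for _, v := range observations {
		for i, le := range buckets {
			if v <= le {
				counts[i]++
			}
		}
	}
	return counts
}

func main() {
	// Duration reported for www.t-systems.com.
	fmt.Println(cumulativeCounts([]float64{0.018097008}))
	// The failed www.google.com lookup was recorded with a duration of 0.
	fmt.Println(cumulativeCounts([]float64{0}))
}
```

This is why the failed www.google.com lookup, recorded with a duration of 0, shows a count of 1 in every bucket, while www.t-systems.com only starts counting at le="0.025".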
API Schema
paths:
  /v1/metrics/dns:
    description: dns
    get:
      tags:
        - Metrics
        - dns
      description: Returns the performance data for check dns
      responses:
        "200":
          description: Metrics for check dns
          content:
            application/json:
              schema:
                type: object
                properties:
                  data:
                    type: object
                    additionalProperties:
                      has: null
                      schema:
                        type: object
                        properties:
                          Error:
                            type: string
                          Resolved:
                            type: array
                            items:
                              type: string
                          Total:
                            type: number
                            format: double
                  error:
                    type: string
                  timestamp:
                    type: string
                    format: date-time
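A client could decode a response matching this schema roughly as follows. The type names, the parse helper, and the sample payload are made up for illustration; only the field names and types come from the schema above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// lookupResult mirrors the per-target object in the schema.
type lookupResult struct {
	Error    string   `json:"Error"`
	Resolved []string `json:"Resolved"`
	Total    float64  `json:"Total"`
}

// response mirrors the top-level object returned by /v1/metrics/dns.
type response struct {
	Data      map[string]lookupResult `json:"data"`
	Error     string                  `json:"error"`
	Timestamp time.Time               `json:"timestamp"`
}

// sample is an invented payload, not captured from a real run.
const sample = `{
  "data": {
    "www.example.com": {"Error": "", "Resolved": ["192.0.2.1"], "Total": 0.018}
  },
  "error": "",
  "timestamp": "2024-01-25T16:08:25+01:00"
}`

// parse decodes a JSON body into a response.
func parse(body string) (response, error) {
	var r response
	err := json.Unmarshal([]byte(body), &r)
	return r, err
}

func main() {
	r, err := parse(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for target, res := range r.Data {
		fmt.Println(target, res.Resolved, res.Total)
	}
}
```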
Concerns
If the host system's nameserver can't look up the address because the host is behind a proxy that does the name resolution, the check fails (see www.google.com in the metrics above). The user should be able to specify the nameserver.
The concerns raised in #81 still apply, but as you've already pointed out, addressing them shouldn't be part of the DNS check MVP.
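One possible way to support a user-specified nameserver is the Dial hook of net.Resolver from the standard library. This is only a sketch and not part of this PR; newResolver, the fixed timeout, and the example address are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// dialTimeout is an arbitrary value chosen for this sketch.
const dialTimeout = 2 * time.Second

// newResolver returns a resolver that sends its queries to nameserver
// (host:port) instead of the system default. Hypothetical helper.
func newResolver(nameserver string) *net.Resolver {
	return &net.Resolver{
		// Prefer Go's built-in resolver, which is the one that uses Dial.
		PreferGo: true,
		// Ignore the address Go picked and dial the configured nameserver.
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: dialTimeout}
			return d.DialContext(ctx, network, nameserver)
		},
	}
}

func main() {
	// 192.0.2.53 is a documentation address; no lookup is performed here.
	r := newResolver("192.0.2.53:53")
	fmt.Println(r.PreferGo)
}
```

Since *net.Resolver already provides LookupHost, a resolver built this way could be passed wherever the check expects its lookup dependency, assuming the interface exposes that method.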
Motivation
To check if an address/host can be looked up.
Closes #81