CastawayLabs / cachet-monitor

Distributed monitoring plugin for CachetHQ
https://castawaylabs.github.io/cachet-monitor/
MIT License

invalid memory address or nil pointer dereference #79

Open Kostecki opened 6 years ago

Kostecki commented 6 years ago

I've been spending my day trying to get Cachet and cachet-monitor to work together, but the result has been limited at best.

I've set up my config:

{
  "api": {
    "url": "http://status.mydomain.dk/api/v1",
    "token": "SuperSecret",
    "insecure": true
  },
  "date_format": "02/01/2006 15:04:05 MST",
  "monitors": [
    {
      "name": "HTTP: mydomain.dk",
      "target": "https://www.mydomain.dk",
      "strict": true,
      "method": "GET",
      "component_id": 2,
      "template": {
        "investigating": {
          "subject": "{{ .Monitor.Name }}",
          "message": "{{ .Monitor.Name }} check **failed** (server time: {{ .now }})\n\n{{ .FailReason }}"
        },
        "fixed": {
          "subject": "Issue was resolved"
        }
      },
      "interval": 10,
      "timeout": 1,
      "threshold": 80,
      "expected_status_code": 200,
      "expected_body": "Velkommen til mydomain"
    },
    {
      "name": "HTTP: mydomain.us",
      "target": "https://www.mydomain.us",
      "strict": false,
      "method": "GET",
      "component_id": 7,
      "template": {
        "investigating": {
          "subject": "{{ .Monitor.Name }}",
          "message": "{{ .Monitor.Name }} check **failed** (server time: {{ .now }})\n\n{{ .FailReason }}"
        },
        "fixed": {
          "subject": "Issue was resolved"
        }
      },
      "interval": 10,
      "timeout": 1,
      "threshold": 80,
      "expected_status_code": 200,
      "expected_body": "Welcome to mydomain"
    },
    {
      "name": "DNS: A",
      "target": "mydomain.dk.",
      "question": "a",
      "type": "dns",
      "component_id": 4,
      "template": {
        "investigating": {
          "investigating": {
            "subject": "{{ .Monitor.Name }}",
            "message": "{{ .Monitor.Name }} check **failed** (server time: {{ .now }})\n\n{{ .FailReason }}"
          },
          "fixed": {
            "subject": "Issue was resolved"
          }
        }
      },
      "interval": 10,
      "timeout": 5,
      "threshold": 80,
      "dns": "8.8.4.4:53",
      "answers": [
        {
          "exact": "12.3.456.1"
        }
      ]
    }
  ]
}

I've deliberately set the expected answer for the DNS: A monitor to a wrong IP to simulate a failure, and mydomain.us returns a 404.

I let it run for a bit and then it crashes, even though it seems to have figured out that some of my services are in fact down:

INFO[0000] System: statusPage                           
INFO[0000] API: http://status.mydomain.dk/api/v1     
INFO[0000] Monitors: 7

INFO[0000] Pinging cachet                               
INFO[0000] Ping OK                                      
INFO[0000] Starting Monitor #0:                         
INFO[0000] Features: 
 - Type: http
 - Name: HTTP: mydomain.dk
 - Method: GET
INFO[0000] Starting Monitor #1:                         
INFO[0000] Features: 
 - Type: http
 - Name: HTTP: mydomain.us
 - Method: GET
INFO[0000] Starting Monitor #2:                         
INFO[0000] Features: 
 - Type: dns
 - Name: DNS: A      
WARN[0010] DNS check failed: { <nil> 12.3.456.1}. Not found in any of [mydomain.dk. 12170   IN  A   12.3.456.78] 
INFO[0010] monitor down 100.00%/80.00%                   monitor="DNS: A" time="14/09/2017 15:57:03 CEST"
INFO[0010] monitor down 100.00%/80.00%                   monitor="HTTP: mydomain.us" time="14/09/2017 15:57:03 CEST"
INFO[0010] monitor is up                                 monitor="HTTP: mydomain.dk" time="14/09/2017 15:57:03 CEST"
WARN[0020] DNS check failed: { <nil> 12.3.456.1}. Not found in any of [mydomain.dk. 8171    IN  A   12.3.456.78] 
INFO[0020] monitor down 100.00%/80.00%                   monitor="DNS: A" time="14/09/2017 15:57:13 CEST"
INFO[0020] monitor down 100.00%/80.00%                   monitor="HTTP: mydomain.us" time="14/09/2017 15:57:13 CEST"
INFO[0020] monitor is up                                 monitor="HTTP: mydomain.dk" time="14/09/2017 15:57:13 CEST"
WARN[0030] DNS check failed: { <nil> 12.3.456.1}. Not found in any of [mydomain.dk. 16581   IN  A   12.3.456.78] 
INFO[0030] monitor down 100.00%/80.00%                   monitor="DNS: A" time="14/09/2017 15:57:23 CEST"
INFO[0030] monitor down 100.00%/80.00%                   monitor="HTTP: mydomain.us" time="14/09/2017 15:57:23 CEST"
INFO[0030] monitor is up                                 monitor="HTTP: mydomain.dk" time="14/09/2017 15:57:23 CEST"
INFO[0040] monitor is up                                 monitor="DNS: NS" time="14/09/2017 15:57:33 CEST"
WARN[0040] DNS check failed: { <nil> 12.3.456.1}. Not found in any of [mydomain.dk. 21542   IN  A   12.3.456.78] 
INFO[0040] monitor down 100.00%/80.00%                   monitor="DNS: A" time="14/09/2017 15:57:33 CEST"
INFO[0040] monitor is up                                 monitor="HTTP: mydomain.dk" time="14/09/2017 15:57:33 CEST"
INFO[0040] monitor down 75.00%/80.00%                    monitor="HTTP: mydomain.us" time="14/09/2017 15:57:33 CEST"
WARN[0050] DNS check failed: { <nil> 12.3.456.1}. Not found in any of [mydomain.dk. 16561   IN  A   12.3.456.78] 
INFO[0050] monitor down 100.00%/80.00%                   monitor="DNS: A" time="14/09/2017 15:57:43 CEST"
INFO[0050] monitor down 60.00%/80.00%                    monitor="HTTP: mydomain.us" time="14/09/2017 15:57:43 CEST"
INFO[0050] monitor is up                                 monitor="HTTP: mydomain.dk" time="14/09/2017 15:57:43 CEST"
WARN[0060] DNS check failed: { <nil> 12.3.456.1}. Not found in any of [mydomain.dk. 9674    IN  A   12.3.456.78] 
INFO[0060] monitor down 100.00%/80.00%                   monitor="DNS: A" time="14/09/2017 15:57:53 CEST"
INFO[0060] monitor down 50.00%/80.00%                    monitor="HTTP: mydomain.us" time="14/09/2017 15:57:53 CEST"
INFO[0060] monitor is up                                 monitor="HTTP: mydomain.dk" time="14/09/2017 15:57:53 CEST"
INFO[0070] monitor is up                                 monitor="DNS: NS" time="14/09/2017 15:58:03 CEST"
WARN[0070] DNS check failed: { <nil> 12.3.456.1}. Not found in any of [mydomain.dk. 9142    IN  A   12.3.456.78] 
INFO[0070] monitor down 100.00%/80.00%                   monitor="DNS: A" time="14/09/2017 15:58:03 CEST"
INFO[0070] monitor down 42.86%/80.00%                    monitor="HTTP: mydomain.us" time="14/09/2017 15:58:03 CEST"
INFO[0070] monitor is up                                 monitor="HTTP: mydomain.dk" time="14/09/2017 15:58:03 CEST"
WARN[0080] DNS check failed: { <nil> 12.3.456.1}. Not found in any of [mydomain.dk. 18372   IN  A   12.3.456.78] 
INFO[0080] monitor down 100.00%/80.00%                   monitor="DNS: A" time="14/09/2017 15:58:13 CEST"
INFO[0080] monitor down 37.50%/80.00%                    monitor="HTTP: mydomain.us" time="14/09/2017 15:58:13 CEST"
INFO[0080] monitor is up                                 monitor="HTTP: mydomain.dk" time="14/09/2017 15:58:13 CEST"
INFO[0090] monitor is up                                 monitor="DNS: NS" time="14/09/2017 15:58:23 CEST"
WARN[0090] DNS check failed: { <nil> 12.3.456.1}. Not found in any of [mydomain.dk. 19670   IN  A   12.3.456.78] 
INFO[0090] monitor down 100.00%/80.00%                   monitor="DNS: A" time="14/09/2017 15:58:23 CEST"
INFO[0090] monitor down 33.33%/80.00%                    monitor="HTTP: mydomain.us" time="14/09/2017 15:58:23 CEST"
INFO[0090] monitor is up                                 monitor="HTTP: mydomain.dk" time="14/09/2017 15:58:23 CEST"
WARN[0100] DNS check failed: { <nil> 12.3.456.1}. Not found in any of [mydomain.dk. 8062    IN  A   12.3.456.78] 
WARN[0100] DNS: A is now saturated                      
INFO[0100] monitor down 100.00%/80.00%                   monitor="DNS: A" time="14/09/2017 15:58:33 CEST"
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x6a6839]

goroutine 54 [running]:
panic(0x7a3340, 0xc420014090)
    /usr/local/Cellar/go/1.7.5/libexec/src/runtime/panic.go:500 +0x1a1
text/template.errRecover(0xc420065b90)
    /usr/local/Cellar/go/1.7.5/libexec/src/text/template/exec.go:140 +0x2ad
panic(0x7a3340, 0xc420014090)
    /usr/local/Cellar/go/1.7.5/libexec/src/runtime/panic.go:458 +0x243
text/template.(*Template).execute(0x0, 0x99a820, 0xc42004d110, 0x7a1560, 0xc42045d590, 0x0, 0x0)
    /usr/local/Cellar/go/1.7.5/libexec/src/text/template/exec.go:186 +0x1b9
text/template.(*Template).Execute(0x0, 0x99a820, 0xc42004d110, 0x7a1560, 0xc42045d590, 0xc42045d590, 0xc420065c18)
    /usr/local/Cellar/go/1.7.5/libexec/src/text/template/exec.go:175 +0x53
github.com/castawaylabs/cachet-monitor.(*MessageTemplate).exec(0xc42014c058, 0x0, 0x7a1560, 0xc42045d590, 0x40d99b, 0x9b9f70)
    /Users/m/p/go/src/github.com/castawaylabs/cachet-monitor/template.go:47 +0x6e
github.com/castawaylabs/cachet-monitor.(*MessageTemplate).Exec(0xc42014c058, 0x7a1560, 0xc42045d590, 0xc420065d98, 0xc42045f5f0, 0x2, 0x18)
    /Users/m/p/go/src/github.com/castawaylabs/cachet-monitor/template.go:41 +0x4c
github.com/castawaylabs/cachet-monitor.(*AbstractMonitor).AnalyseData(0xc42014c000)
    /Users/m/p/go/src/github.com/castawaylabs/cachet-monitor/monitor.go:210 +0x828
github.com/castawaylabs/cachet-monitor.(*AbstractMonitor).tick(0xc42014c000, 0x9a29e0, 0xc42014c000)
    /Users/m/p/go/src/github.com/castawaylabs/cachet-monitor/monitor.go:161 +0x170
github.com/castawaylabs/cachet-monitor.(*AbstractMonitor).ClockStart(0xc42014c000, 0xc420086680, 0x9a29e0, 0xc42014c000, 0xc420149a50)
    /Users/m/p/go/src/github.com/castawaylabs/cachet-monitor/monitor.go:125 +0x196
created by main.main
    /Users/m/p/go/src/github.com/castawaylabs/cachet-monitor/cli/main.go:96 +0x873

I'm sure that I'm doing something wrong, but what is it? I'm pretty lost here and would love some assistance.

axnsan12 commented 6 years ago

This might come a bit late, but I think it crashes because you have no message in your fixed template for the DNS monitor. For HTTP it's not a problem because defaults are set for it, but DNS sets no such defaults.

I also had the same problem and fixed it by providing a complete template.
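For reference, a complete template block might look like the fragment below. The keys are the ones used in the config above. The wording of the fixed message is only an example, and it assumes that {{ .Monitor.Name }} and {{ .now }} are also available in the fixed template. Note also that the DNS monitor in the posted config wraps investigating and fixed inside an extra investigating object, so they would need to be moved up one level as shown here.

```json
{
  "template": {
    "investigating": {
      "subject": "{{ .Monitor.Name }}",
      "message": "{{ .Monitor.Name }} check **failed** (server time: {{ .now }})\n\n{{ .FailReason }}"
    },
    "fixed": {
      "subject": "Issue was resolved",
      "message": "{{ .Monitor.Name }} check is passing again (server time: {{ .now }})"
    }
  }
}
```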

Kostecki commented 6 years ago

Always better late than never! And that's an interesting idea. I'll definitely give it a try later.