inCaller / prometheus_bot

Telegram bot for prometheus alerting
MIT License

panic: runtime error: invalid memory address or nil pointer dereference #88

Closed by stefan04 2 years ago

stefan04 commented 2 years ago

Hi, I checked out the current code on my Raspberry Pi 4 and compiled it. However, make test fails with the following error whenever one of this project's templates is used.

Excerpts from the bot.log

2022/03/16 20:07:04 HTML is valid, sending it...
2022/03/16 20:07:04 +---------------  F I N A L   M E S S A G E  ---------------+
2022/03/16 20:07:04 <a href='https://alert-manager.example.com/#/alerts?receiver=admins'>[FIRING:1]</a>
grouped by: alertname=<code>something_happend</code>, instance=<code>server01.int:9100</code>
labels: env=<code>prod</code>, job=<code>node</code>, service=<code>prometheus_bot</code>, severity=<code>warning</code>, supervisor=<code>runit</code>
summary: <code>runit service prometheus_bot restarted, server01.int:9100</code>
<a href='https://example.com/graph#...'>server01.int[node]</a>
2022/03/16 20:07:04 +-----------------------------------------------------------+
[GIN] 2022/03/16 - 20:07:04 | 200 |   65.982888ms |       127.0.0.1 | POST     "/alert/-4*******3"
****** Run prometheus_bot with template testdata/default.tmpl ******
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x45db1c]

goroutine 1 [running]:
main.main()
        /usr/src/prometheus_bot/main.go:418 +0x31c

This error only occurs in conjunction with the templates. Sending without using a template works.
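
The thread does not state the root cause of the panic, so the following is only a generic sketch of how this class of error is usually avoided in Go, not the project's actual code or the eventual fix. It assumes the standard text/template package; loadTemplate is a hypothetical helper that never hands back a nil template without an error, so the caller cannot dereference nil by accident.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"text/template"
)

// loadTemplate is a hypothetical helper (not from prometheus_bot).
// It returns either a usable template or a non-nil error, never both nil.
func loadTemplate(path string) (*template.Template, error) {
	if path == "" {
		return nil, errors.New("no template path configured")
	}
	tmpl, err := template.ParseFiles(path)
	if err != nil {
		return nil, fmt.Errorf("parse template %q: %w", path, err)
	}
	return tmpl, nil
}

func main() {
	tmpl, err := loadTemplate("testdata/default.tmpl")
	if err != nil {
		// Report the problem instead of continuing with a nil template.
		fmt.Fprintln(os.Stderr, "template unavailable:", err)
		return
	}
	// tmpl is non-nil here, so Execute cannot hit a nil receiver.
	data := map[string]string{"status": "firing"}
	if err := tmpl.Execute(os.Stdout, data); err != nil {
		fmt.Fprintln(os.Stderr, "render failed:", err)
	}
}
```

Checking the error at the call site turns a SIGSEGV like the one above into a readable log line.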

Excerpts from the bot.log

2022/03/16 20:07:04 HTML is valid, sending it...
2022/03/16 20:07:04 +---------------  F I N A L   M E S S A G E  ---------------+
2022/03/16 20:07:04 <a href='http://alert.greco.cf/alert-manager/#/alerts?receiver=telegram_bot'>[FIRING:11]</a>
grouped by: scada_uuid=<code>483b197c-7fe8-11e6-b772-acb57db47f23</code>
labels:
<a href='http://localhost.localdomain:9090/graph?g0.expr=linux_loadavg%7Bmode%3D%2215min%22%7D+%3E+%283+%2A+10+%2A+4%29&g0.tab=0'>localhost[statsd]</a>, <a href='http://localhost.localdomain:9090/graph?g0.expr=linux_memory%7Bmode%3D%22memavailable%22%7D+%3E+%281024+%2A+100%29&g0.tab=0'>localhost[statsd]</a>, <a href='http://localhost.localdomain:9090/graph?g0.expr=100+-+%28avg%28irate%28linux_stats_cpu%7Bmode%3D%22idle%22%7D%5B2m%5D%29%29+BY+%28scada_uuid%29%29+%3E+60&g0.tab=0'></a>, <a href='http://localhost.localdomain:9090/graph?g0.expr=linux_loadavg%7Bmode%3D%221min%22%7D+%3E+%288+%2A+10+%2A+4%29&g0.tab=0'>localhost[statsd]</a>, <a href='http://localhost.localdomain:9090/graph?g0.expr=linux_loadavg%7Bmode%3D%221min%22%7D+%3E+%2810+%2A+10+%2A+4%29&g0.tab=0'>localhost[statsd]</a>, <a href='http://localhost.localdomain:9090/graph?g0.expr=linux_loadavg%7Bmode%3D%225min%22%7D+%3E+%285+%2A+10+%2A+4%29&g0.tab=0'>localhost[statsd]</a>, <a href='http://localhost.localdomain:9090/graph?g0.expr=linux_loadavg%7Bmode%3D%225min%22%7D+%3E+%288+%2A+10+%2A+4%29&g0.tab=0'>localhost[statsd]</a>, <a href='http://localhost.localdomain:9090/graph?g0.expr=linux_loadavg%7Bmode%3D%2215min%22%7D+%3E+%282+%2A+10+%2A+4%29&g0.tab=0'>localhost[statsd]</a>, <a href='http://localhost.localdomain:9090/graph?g0.expr=linux_loadavg%7Bmode%3D%2215min%22%7D+%3E+%282+%2A+10+%2A+4%29&g0.tab=0'>localhost[statsd]</a>, <a href='http://localhost.localdomain:9090/graph?g0.expr=linux_loadavg%7Bmode%3D%2215min%22%7D+%3E+%282+%2A+10+%2A+4%29&g0.tab=0'>localhost[statsd]</a>, <a href='http://localhost.localdomain:9090/graph?g0.expr=linux_loadavg%7Bmode%3D%2215min%22%7D+%3E+%282+%2A+10+%2A+4%29&g0.tab=0'>localhost[statsd]</a>
2022/03/16 20:07:04 +-----------------------------------------------------------+
[GIN] 2022/03/16 - 20:07:04 | 200 |   53.328811ms |       127.0.0.1 | POST     "/alert/-4*******3"
2022/03/16 20:07:04 Bot alert post: -4*******3
2022/03/16 20:07:04 +------------------  A L E R T  J S O N  -------------------+
2022/03/16 20:07:04 {"alerts":[{"annotations":{"summary":"Oops, something happend!"},"endsAt":"0001-01-01T00:00:00Z","generatorURL":"https://example.com/graph#...","labels":{"alertname":"something_happend","env":"prod","instance":"server01.int:9100","job":"node","service":"prometheus_bot","severity":"warning","supervisor":"runit"},"startsAt":"2016-04-27T20:46:37.903Z","status":"firing"}],"commonAnnotations":{"summary":"runit service prometheus_bot restarted, server01.int:9100"},"commonLabels":{"alertname":"something_happend","env":"prod","instance":"server01.int:9100","job":"node","service":"prometheus_bot","severity":"warning","supervisor":"runit"},"externalURL":"https://alert-manager.example.com","groupKey":0,"groupLabels":{"alertname":"something_happend","instance":"server01.int:9100"},"receiver":"admins","status":"firing","version":0}
2022/03/16 20:07:04 +-----------------------------------------------------------+

Output after make test

make test
go build -o prometheus_bot
prove -v
t/curl.t ..
1..25
ok 1 - noGenURL.json template none
ok 2 - emptyValue.json template none
ok 3 - production_example.json template none
ok 4 - simpe.json template none
ok 5 - big_output.json template none
not ok 6 - noGenURL.json template default.tmpl
not ok 7 - emptyValue.json template default.tmpl
not ok 8 - production_example.json template default.tmpl
not ok 9 - simpe.json template default.tmpl
not ok 10 - big_output.json template default.tmpl
not ok 11 - noGenURL.json template malformed_html.tmpl
not ok 12 - emptyValue.json template malformed_html.tmpl
not ok 13 - production_example.json template malformed_html.tmpl
not ok 14 - simpe.json template malformed_html.tmpl
not ok 15 - big_output.json template malformed_html.tmpl
not ok 16 - noGenURL.json template detailed_vars.tmpl
not ok 17 - emptyValue.json template detailed_vars.tmpl
not ok 18 - production_example.json template detailed_vars.tmpl
not ok 19 - simpe.json template detailed_vars.tmpl
not ok 20 - big_output.json template detailed_vars.tmpl
not ok 21 - noGenURL.json template production_example.tmpl
not ok 22 - emptyValue.json template production_example.tmpl
not ok 23 - production_example.json template production_example.tmpl
not ok 24 - simpe.json template production_example.tmpl
not ok 25 - big_output.json template production_example.tmpl
Failed 20/25 subtests

Test Summary Report
-------------------
t/curl.t (Wstat: 0 Tests: 25 Failed: 20)
  Failed tests:  6-25
Files=1, Tests=25, 16 wallclock secs ( 0.06 usr  0.02 sys +  0.73 cusr  0.79 csys =  1.60 CPU)
Result: FAIL
make: *** [Makefile:6: test] Error 1

I already used this code in July 2021. The commit at that time was:

commit 2dc161747f6a6ee8afbc8afd1b7e02f0bc195daf (HEAD -> master, origin/master, origin/HEAD)
Merge: ac4533f 1d366fe
Author: Roman Belyakovsky <ihryamzik@gmail.com>
Date:   Sun May 30 23:20:07 2021 +0300

    Merge pull request #64 from dmaes/master

    Clarify documentation about `-` in chat_id and url

This code from May 2021 still works today. I can compile it in the same system environment and use it with the same templates, without any errors.

What has changed between that commit and today's version?

My current environment

$ uname -a
Linux node5.local 5.10.92-v8+ #1514 SMP PREEMPT Mon Jan 17 17:39:38 GMT 2022 aarch64 GNU/Linux

$ cat /etc/*rel*
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

$ go version
go version go1.17.2 linux/arm64

Can you help me fix the problem?

Greetings...

Stefan

binhvcom commented 2 years ago

I get the same error when compiling the latest code. Please fix this.

ykslol commented 2 years ago

The most recent MR, #87, broke the bot. I have the same issue.

kurgalinn commented 2 years ago

https://github.com/inCaller/prometheus_bot/pull/90 - PR with fixes

hryamzik commented 2 years ago

@stefan04 can you check that latest code works for you now, please?

stefan04 commented 2 years ago

This version works perfectly!

Same environment as before.

$ go version
go version go1.17.2 linux/arm64
$ uname -a
Linux compiler 5.10.92-v8+ #1514 SMP PREEMPT Mon Jan 17 17:39:38 GMT 2022 aarch64 GNU/Linux

After make clean and make:

make test
go build -o prometheus_bot
prove -v
t/curl.t ..
1..25
ok 1 - noGenURL.json template none
ok 2 - production_example.json template none
ok 3 - simpe.json template none
ok 4 - emptyValue.json template none
ok 5 - big_output.json template none
ok 6 - noGenURL.json template production_example.tmpl
ok 7 - production_example.json template production_example.tmpl
ok 8 - simpe.json template production_example.tmpl
ok 9 - emptyValue.json template production_example.tmpl
ok 10 - big_output.json template production_example.tmpl
ok 11 - noGenURL.json template malformed_html.tmpl
ok 12 - production_example.json template malformed_html.tmpl
ok 13 - simpe.json template malformed_html.tmpl
ok 14 - emptyValue.json template malformed_html.tmpl
ok 15 - big_output.json template malformed_html.tmpl
ok 16 - noGenURL.json template default.tmpl
ok 17 - production_example.json template default.tmpl
ok 18 - simpe.json template default.tmpl
ok 19 - emptyValue.json template default.tmpl
ok 20 - big_output.json template default.tmpl
ok 21 - noGenURL.json template detailed_vars.tmpl
ok 22 - production_example.json template detailed_vars.tmpl
ok 23 - simpe.json template detailed_vars.tmpl
ok 24 - emptyValue.json template detailed_vars.tmpl
ok 25 - big_output.json template detailed_vars.tmpl
ok
All tests successful.
Files=1, Tests=25, 120 wallclock secs ( 0.08 usr  0.02 sys +  1.86 cusr  1.63 csys =  3.59 CPU)
Result: PASS

Thank you very much for your support! Greetings...

Stefan

stefan04 commented 2 years ago

One more hint: the test script should pause longer between messages. Telegram rejected some of them because too many were sent in a short time.

cat bot.log |grep "Error"
2022/06/23 22:21:19 Error sending message: Too Many Requests: retry after 14
2022/06/23 22:21:21 Error sending message: Too Many Requests: retry after 12
2022/06/23 22:21:23 Error sending message: Too Many Requests: retry after 10
2022/06/23 22:21:23 Error sending message: Too Many Requests: retry after 10