lesovsky / pgscv

pgSCV is a multi-purpose monitoring agent and metrics exporter
BSD 3-Clause "New" or "Revised" License
165 stars 28 forks source link

invalid memory address or nil pointer - postgres/statements #4

Closed Tri0L closed 3 years ago

Tri0L commented 3 years ago

Errors and Logs

Jun  7 18:25:29 pgscv[27868]: {"level":"error","service":"pgscv","time":"2021-06-07T18:25:29+03:00","message":"create const metric failed: label value \"select brand_name , case_type , avg(show_cases) as avg_shown_cases from ( select shop_id , brand_name , case_type , avg(shown_cases_count) as show_cases from ( select shop_id , brand_name , date , case_type , sum(case when date between created::date and (case when last_action_datetime_? < close_datetime_? then last_action_datetime_? else close_datetime_? end) then ? else ? end) as shown_cases_count from ( select c.id , c.plu , case when c.plu IN (?) then ? when c.plu IN (?) then ? when c.plu IN (?) then ? when c.plu IN (?) then ? when c.plu IN (?) then ? when c.plu IN (?) then ? when c.plu IN (?) then ? when c.plu IN (?) then ? when c.plu IN (?) then ? else ? end as brand_name , c.type as case_type , c.shop_id , c.created , c.last_action_datetime , c.close_datetime , c.action_type , c.close_reason , case when c.last_action_datetime is null then ?::date else c.last_action_datetime end as last_action_datetime_? -- если по кейсу ничего не делали, то после\\xd0...\" is not valid UTF-8; skip. Failed metric descriptor: 'Desc{fqName: \"postgres_statements_query_info\", help: \"Labeled info about statements has been executed.\", constLabels: {instance=\"mydb\",service_id=\"postgres:5432\"}, variableLabels: [user database md5 query]}'"}
Jun  7 18:25:29 pgscv[27868]: 2021/06/07 18:25:29 http: panic serving 10.10.1.5:42816: runtime error: invalid memory address or nil pointer dereference
Jun  7 18:25:29 pgscv[27868]: goroutine 1124 [running]:
Jun  7 18:25:29 pgscv[27868]: net/http.(*conn).serve.func1(0xc0001225a0)
Jun  7 18:25:29 pgscv[27868]: #011/opt/hostedtoolcache/go/1.16.4/x64/src/net/http/server.go:1824 +0x153
Jun  7 18:25:29 pgscv[27868]: panic(0xb210a0, 0x108bde0)
Jun  7 18:25:29 pgscv[27868]: #011/opt/hostedtoolcache/go/1.16.4/x64/src/runtime/panic.go:971 +0x499
Jun  7 18:25:29 pgscv[27868]: github.com/prometheus/client_golang/prometheus.processMetric(0x0, 0x0, 0xc0003375f8, 0xc000337628, 0x0, 0x0, 0x1)
Jun  7 18:25:29 pgscv[27868]: #011/home/runner/go/pkg/mod/github.com/prometheus/client_golang@v1.7.1/prometheus/registry.go:589 +0x3a
Jun  7 18:25:29 pgscv[27868]: github.com/prometheus/client_golang/prometheus.(*Registry).Gather(0xc000100910, 0x0, 0x0, 0x0, 0x0, 0x0)
Jun  7 18:25:29 pgscv[27868]: #011/home/runner/go/pkg/mod/github.com/prometheus/client_golang@v1.7.1/prometheus/registry.go:490 +0x9da
Jun  7 18:25:29 pgscv[27868]: github.com/prometheus/client_golang/prometheus/promhttp.HandlerFor.func1(0x7f064cf9e680, 0xc001220410, 0xc000ff0100)
Jun  7 18:25:29 pgscv[27868]: #011/home/runner/go/pkg/mod/github.com/prometheus/client_golang@v1.7.1/prometheus/promhttp/http.go:126 +0x99
Jun  7 18:25:29 pgscv[27868]: net/http.HandlerFunc.ServeHTTP(0xc00025ec40, 0x7f064cf9e680, 0xc001220410, 0xc000ff0100)
Jun  7 18:25:29 pgscv[27868]: #011/opt/hostedtoolcache/go/1.16.4/x64/src/net/http/server.go:2069 +0x44
Jun  7 18:25:29 pgscv[27868]: github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerInFlight.func1(0x7f064cf9e680, 0xc001220410, 0xc000ff0100)
Jun  7 18:25:29 pgscv[27868]: #011/home/runner/go/pkg/mod/github.com/prometheus/client_golang@v1.7.1/prometheus/promhttp/instrument_server.go:40 +0xab
Jun  7 18:25:29 pgscv[27868]: net/http.HandlerFunc.ServeHTTP(0xc0002374a0, 0x7f064cf9e680, 0xc001220410, 0xc000ff0100)
Jun  7 18:25:29 pgscv[27868]: #011/opt/hostedtoolcache/go/1.16.4/x64/src/net/http/server.go:2069 +0x44
Jun  7 18:25:29 pgscv[27868]: github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1(0xcc1400, 0xc0002667e0, 0xc000ff0100)
Jun  7 18:25:29 pgscv[27868]: #011/home/runner/go/pkg/mod/github.com/prometheus/client_golang@v1.7.1/prometheus/promhttp/instrument_server.go:100 +0xdf
Jun  7 18:25:29 pgscv[27868]: net/http.HandlerFunc.ServeHTTP(0xc000237590, 0xcc1400, 0xc0002667e0, 0xc000ff0100)
Jun  7 18:25:29 pgscv[27868]: #011/opt/hostedtoolcache/go/1.16.4/x64/src/net/http/server.go:2069 +0x44
Jun  7 18:25:29 pgscv[27868]: net/http.(*ServeMux).ServeHTTP(0x10a0000, 0xcc1400, 0xc0002667e0, 0xc000ff0100)
Jun  7 18:25:29 pgscv[27868]: #011/opt/hostedtoolcache/go/1.16.4/x64/src/net/http/server.go:2448 +0x1ad
Jun  7 18:25:29 pgscv[27868]: net/http.serverHandler.ServeHTTP(0xc000266000, 0xcc1400, 0xc0002667e0, 0xc000ff0100)
Jun  7 18:25:29 pgscv[27868]: #011/opt/hostedtoolcache/go/1.16.4/x64/src/net/http/server.go:2887 +0xa3
Jun  7 18:25:29 pgscv[27868]: net/http.(*conn).serve(0xc0001225a0, 0xcc3208, 0xc000a26580)
Jun  7 18:25:29 pgscv[27868]: #011/opt/hostedtoolcache/go/1.16.4/x64/src/net/http/server.go:1952 +0x8cd
Jun  7 18:25:29 pgscv[27868]: created by net/http.(*Server).Serve
Jun  7 18:25:29 pgscv[27868]: #011/opt/hostedtoolcache/go/1.16.4/x64/src/net/http/server.go:3013 +0x39b

Desktop (please complete the following information):

lesovsky commented 3 years ago

The problem is reproduced and occurs when invalid UTF-8 appear in label value.

The segfault is already fixed in master branch (here), as a part of implementation of another feature. Bugfix will be later (on this week I hope).

p.s. You can try to build pgscv from master and check the issue is still occur - you should see just an error (with no segfault) about failed metric.

lesovsky commented 3 years ago

For now, I've decided to keep current behavior - if an invalid UTF8 string found, just throw an error, skip metric and continue.

Performing strings validation for UTF8 (and discard invalid) will lead to the same behavior - throw error and skip metric. Also, I concerned about how pgSCV is working with non-UTF8 Postgres, this requires a bit more time for testing.

As a workaround for such cases, should reset pg_stat_statements and pass queries with valid UTF8 from application side.

lesovsky commented 3 years ago

@Tri0L There is 0.6.0-rc1 has been released. Please try to reproduce the bug.