ulranh / sapnwrfc_exporter

sapnwrfc_exporter - SAP NWRFC Exporter for Prometheus
Apache License 2.0

Program crashes with error "interface is nil" after a while #9

Closed ChernykhAN closed 3 years ago

ChernykhAN commented 3 years ago

There is one table metric in the config.

[[TableMetrics]]
  Name = "sap_lock_entries"
  Help = "sm12 help"
  MetricType = "gauge"
  TagFilter = []
  FunctionModule = "ENQUE_READ"
  AllServers = false
  Table = "ENQ"
  [TableMetrics.Params]
    GARG = ""
    GCLIENT = ""
    GNAME = ""
    GUNAME = ""
  [TableMetrics.RowCount]
    gclient = ["total", "000", "900"]
  [TableMetrics.RowFilter]

The metric values are returned correctly.

# TYPE sap_lock_entries gauge
sap_lock_entries{count="gclient_000",server="chq",system="chq",usage="test"} 2
sap_lock_entries{count="gclient_900",server="chq",system="chq",usage="test"} 13
sap_lock_entries{count="gclient_total",server="chq",system="chq",usage="test"} 15
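Judging only from the config and the output above, each RowCount entry appears to produce one series per listed value, labelled `<field>_<value>`, with `total` counting every row. A minimal sketch of that counting, assuming the table arrives as a list of rows keyed by upper-case field names; `countRows` and the row shape are hypothetical and not taken from the exporter's source:

```go
package main

import (
	"fmt"
	"strings"
)

// countRows tallies table rows by the value of one field.
// rows stands in for a decoded RFC table; field and values mirror a
// RowCount entry such as gclient = ["total", "000", "900"].
// This is an illustrative guess, not the exporter's actual code.
func countRows(rows []map[string]string, field string, values []string) map[string]int {
	prefix := strings.ToLower(field) + "_"
	counts := make(map[string]int, len(values))
	for _, v := range values {
		counts[prefix+v] = 0 // report zero for values that never occur
	}
	for _, row := range rows {
		val := row[strings.ToUpper(field)]
		for _, v := range values {
			// "total" matches every row; any other value must match exactly.
			if v == "total" || v == val {
				counts[prefix+v]++
			}
		}
	}
	return counts
}

func main() {
	rows := []map[string]string{
		{"GCLIENT": "000"},
		{"GCLIENT": "900"},
		{"GCLIENT": "900"},
	}
	counts := countRows(rows, "gclient", []string{"total", "000", "900"})
	fmt.Println(counts["gclient_total"], counts["gclient_000"], counts["gclient_900"])
}
```

Rows whose client is not listed still count toward `total`, which would explain a total larger than the sum of the listed clients on systems with more clients.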

But the program crashes with the following error after a while.

sapnwrfc_exporter]$ panic: interface conversion: interface {} is nil, not []interface {}

goroutine 103 [running]:
github.com/ulranh/sapnwrfc_exporter/cmd.tableInfo.metricData(0xc000028fe0, 0x3, 0xc0001be600, 0xc0001be5d0, 0xc000202330, 0xc000028d90, 0x3, 0xc000028dc0, 0x4, 0xc0001bb500, ...)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:270 +0xda7
github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).getRfcData(0xc0001b8000, 0x0, 0x0, 0xc000028d90, 0x3, 0xc0002021e0, 0x1, 0x1, 0xc000262301)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:262 +0x416
github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).collectServersMetric.func1(0xc0003d4dd4, 0xc0005c2240, 0xc0001b8000, 0x0, 0x0, 0xc000028d90, 0x3, 0xc0002021e0)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:215 +0x9e
created by github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).collectServersMetric
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:213 +0x168

ChernykhAN commented 3 years ago

An additional nil check at line 270 resolved the issue.

if rawData[up(tMetric.Table)] == nil {
  log.WithFields(log.Fields{
    "system": system.Name,
  }).Error("Error table metric is null")
  return nil
}
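The original panic, `interface {} is nil, not []interface {}`, is what Go raises when a single-value type assertion is applied to a nil interface. An alternative to the explicit nil check is the two-value ("comma-ok") form, which also covers a missing key or an unexpected type. A minimal sketch, assuming the RFC result is a `map[string]interface{}`; `tableRows` is a hypothetical helper, not part of the exporter:

```go
package main

import "fmt"

// tableRows extracts a table from an RFC-style result map without panicking.
// A direct assertion like result[name].([]interface{}) panics when the entry
// is nil or missing; the two-value form reports failure through ok instead.
func tableRows(result map[string]interface{}, name string) ([]interface{}, bool) {
	rows, ok := result[name].([]interface{})
	return rows, ok
}

func main() {
	// Hypothetical result in which the ENQ table came back as nil.
	result := map[string]interface{}{"ENQ": nil}

	if rows, ok := tableRows(result, "ENQ"); ok {
		fmt.Println("rows:", len(rows))
	} else {
		fmt.Println("table ENQ is missing or nil, skipping metric")
	}
}
```

With this form the caller can log the condition and skip the metric for that scrape instead of crashing the whole process.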

But now I get the following error after the program has been running for about 1 to 10 hours.

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x7f002c0000c0 pc=0x7fd26a90145b]

runtime stack:
runtime.throw(0x9559cb, 0x2a)
        /usr/local/go/src/runtime/panic.go:1116 +0x72
runtime.sigpanic()
        /usr/local/go/src/runtime/signal_unix.go:704 +0x4ac

goroutine 997 [syscall]:
runtime.cgocall(0x867150, 0xc000364ff0, 0xc000365020)
        /usr/local/go/src/runtime/cgocall.go:133 +0x5b fp=0xc000364fc0 sp=0xc000364f88 pc=0x4071db
github.com/sap/gorfc/gorfc._Cfunc_RfcDestroyFunction(0x7fd22c0022e0, 0x0, 0x0)
        _cgo_gotypes.go:341 +0x4d fp=0xc000364ff0 sp=0xc000364fc0 pc=0x829ead
github.com/sap/gorfc/gorfc.(*Connection).Call.func4.1()
        /home/chernykh/go/pkg/mod/github.com/sap/gorfc@v0.1.0/gorfc/gorfc.go:1074 +0x65 fp=0xc000365030 sp=0xc000364ff0 pc=0x83b6c5
github.com/sap/gorfc/gorfc.(*Connection).Call(0xc00020c120, 0xc000234360, 0xa, 0x8c4960, 0xc00020c9c0, 0xc00020c150, 0x0, 0x0)
        /home/chernykh/go/pkg/mod/github.com/sap/gorfc@v0.1.0/gorfc/gorfc.go:1117 +0x558 fp=0xc000367b38 sp=0xc000365030 pc=0x836eb8
github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).getRfcData(0xc00021a000, 0x1, 0x0, 0xc000234008, 0x3, 0xc00020c120, 0xe9f2f0, 0xc000301780, 0x46aea5)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:248 +0x317 fp=0xc000367f18 sp=0xc000367b38 pc=0x85ebf7
github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).collectServersMetric.func1(0xc0001802e0, 0xc0000a8180, 0xc00021a000, 0x1, 0x0, 0xc000234008, 0x3, 0xc$
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:215 +0x9e fp=0xc000367fa0 sp=0xc000367f18 pc=0x865a1e
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000367fa8 sp=0xc000367fa0 pc=0x46e8e1
created by github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).collectServersMetric
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:213 +0x168

goroutine 1 [IO wait, 1 minutes]:
internal/poll.runtime_pollWait(0x7fd23956ff18, 0x72, 0x0)
        /usr/local/go/src/runtime/netpoll.go:220 +0x55
internal/poll.(*pollDesc).wait(0xc00018e318, 0x72, 0x0, 0x0, 0x944a15)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0xc00018e300, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/local/go/src/internal/poll/fd_unix.go:394 +0x1fc
net.(*netFD).accept(0xc00018e300, 0x20db99099def1385, 0x0, 0x0)
        /usr/local/go/src/net/fd_unix.go:172 +0x45
net.(*TCPListener).accept(0xc00000f000, 0x5fabbae2, 0xc00014f690, 0x4c9726)

.... very long stack

How can I check and fix this error?

ulranh commented 3 years ago

We have been running the same metric for months without any trouble. But the nil check definitely makes sense. I also made a few more small adaptations. Please check whether anything has changed for you.

Otherwise a few questions:

ChernykhAN commented 3 years ago

go version go1.15.3 linux/amd64
There is one test system in the config TOML. The test system has 2 app instances.
avg_over_time(scrape_duration_seconds[12h]) = 0.02161

ChernykhAN commented 3 years ago

I compiled the new version, but the error still exists.

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x7f00240175c8 pc=0x7f7f7075b419]

runtime stack:
runtime.throw(0x957827, 0x2a)
        /usr/local/go/src/runtime/panic.go:1116 +0x72
runtime.sigpanic()
        /usr/local/go/src/runtime/signal_unix.go:704 +0x4ac

goroutine 624 [syscall]:
runtime.cgocall(0x869370, 0xc0002fa960, 0xc0001fe160)
        /usr/local/go/src/runtime/cgocall.go:133 +0x5b fp=0xc0002fa930 sp=0xc0002fa8f8 pc=0x4071db
github.com/sap/gorfc/gorfc._Cfunc_RfcOpenConnection(0xc0001fe160, 0xc00000000b, 0xc000459500, 0x0)
        _cgo_gotypes.go:885 +0x4e fp=0xc0002fa960 sp=0xc0002fa930 pc=0x82c28e
github.com/sap/gorfc/gorfc.(*Connection).Open.func1(0xc00040a510, 0xc000459500, 0x279c340)
        /home/chernykh/go/pkg/mod/github.com/sap/gorfc@v0.1.0/gorfc/gorfc.go:975 +0x9c fp=0xc0002fa9a0 sp=0xc0002fa960 pc=0x83b89c
github.com/sap/gorfc/gorfc.(*Connection).Open(0xc00040a510, 0x2, 0x7f7f240022b0)
        /home/chernykh/go/pkg/mod/github.com/sap/gorfc@v0.1.0/gorfc/gorfc.go:975 +0x65 fp=0xc0002fb798 sp=0xc0002fa9a0 pc=0x836725
github.com/sap/gorfc/gorfc.ConnectionFromParams(0xc00040a4e0, 0xc00040a4e0, 0x94733c, 0x9)
        /home/chernykh/go/pkg/mod/github.com/sap/gorfc@v0.1.0/gorfc/gorfc.go:927 +0x273 fp=0xc0002fb868 sp=0xc0002fb798 pc=0x836413
github.com/ulranh/sapnwrfc_exporter/cmd.connect(0xc000024ee0, 0x3, 0xc000024f10, 0x4, 0xc0001c3e30, 0x1, 0x1, 0xc000024f70, 0x8, 0xc000024f98, ...)
        /usr/local/sap/sapnwrfc_exporter/cmd/helper.go:13 +0x4fc fp=0xc0002fba20 sp=0xc0002fb868 pc=0x86301c
github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).getSrvInfo(0xc0001be000, 0x0, 0x0, 0xc0001c3e30, 0x1, 0x1)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:360 +0x128 fp=0xc0002fbf28 sp=0xc0002fba20 pc=0x8616c8
github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).collectSystemsMetric.func1(0xc0001be000, 0x0, 0xc00005e180, 0x0)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:173 +0xea fp=0xc0002fbfc0 sp=0xc0002fbf28 pc=0x8668ea
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0002fbfc8 sp=0xc0002fbfc0 pc=0x46e8e1
created by github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).collectSystemsMetric
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:168 +0x185

goroutine 1 [IO wait]:
internal/poll.runtime_pollWait(0x7f7f403cbf18, 0x72, 0x0)
        /usr/local/go/src/runtime/netpoll.go:220 +0x55
internal/poll.(*pollDesc).wait(0xc00017e998, 0x72, 0x0, 0x0, 0x946842)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0xc00017e980, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/local/go/src/internal/poll/fd_unix.go:394 +0x1fc
net.(*netFD).accept(0xc00017e980, 0x29f4198febfa916c, 0x0, 0x0)
        /usr/local/go/src/net/fd_unix.go:172 +0x45
net.(*TCPListener).accept(0xc0001d8160, 0x5fac061e, 0xc00014d690, 0x4c9726)
        /usr/local/go/src/net/tcpsock_posix.go:139 +0x32
net.(*TCPListener).Accept(0xc0001d8160, 0xc00014d6e0, 0x18, 0xc000000180, 0x6e3cec)
        /usr/local/go/src/net/tcpsock.go:261 +0x65
net/http.(*Server).Serve(0xc0001d60e0, 0x9db300, 0xc0001d8160, 0x0, 0x0)
        /usr/local/go/src/net/http/server.go:2937 +0x266
net/http.(*Server).ListenAndServe(0xc0001d60e0, 0x9456c3, 0x1)
        /usr/local/go/src/net/http/server.go:2866 +0xb7
github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).web(0xc0001be000, 0xc000075f50, 0xc00014de98, 0xc00014de80)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:114 +0x6ac
github.com/ulranh/sapnwrfc_exporter/cmd.Root()
        /usr/local/sap/sapnwrfc_exporter/cmd/root.go:195 +0x8a4
main.main()
        /usr/local/sap/sapnwrfc_exporter/main.go:24 +0x73

goroutine 643 [select]:
github.com/prometheus/client_golang/prometheus.(*Registry).Gather(0xc0000687d0, 0x0, 0x0, 0x0, 0x0, 0x0)
        /home/chernykh/go/pkg/mod/github.com/prometheus/client_golang@v1.8.0/prometheus/registry.go:511 +0xbbc
github.com/prometheus/client_golang/prometheus/promhttp.HandlerFor.func1(0x7f7f4038c938, 0xc0001f8140, 0xc0001a2100)
        /home/chernykh/go/pkg/mod/github.com/prometheus/client_golang@v1.8.0/prometheus/promhttp/http.go:126 +0x99
net/http.HandlerFunc.ServeHTTP(0xc0001a1dc0, 0x7f7f4038c938, 0xc0001f8140, 0xc0001a2100)
        /usr/local/go/src/net/http/server.go:2042 +0x44
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerInFlight.func1(0x7f7f4038c938, 0xc0001f8140, 0xc0001a2100)
        /home/chernykh/go/pkg/mod/github.com/prometheus/client_golang@v1.8.0/prometheus/promhttp/instrument_server.go:40 +0xab
net/http.HandlerFunc.ServeHTTP(0xc0001bd170, 0x7f7f4038c938, 0xc0001f8140, 0xc0001a2100)
        /usr/local/go/src/net/http/server.go:2042 +0x44
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1(0x9db580, 0xc00022e000, 0xc0001a2100)
        /home/chernykh/go/pkg/mod/github.com/prometheus/client_golang@v1.8.0/prometheus/promhttp/instrument_server.go:100 +0xda
net/http.HandlerFunc.ServeHTTP(0xc0001bd260, 0x9db580, 0xc00022e000, 0xc0001a2100)
        /usr/local/go/src/net/http/server.go:2042 +0x44
net/http.(*ServeMux).ServeHTTP(0xc0000216c0, 0x9db580, 0xc00022e000, 0xc0001a2100)
        /usr/local/go/src/net/http/server.go:2417 +0x1ad
net/http.serverHandler.ServeHTTP(0xc0001d60e0, 0x9db580, 0xc00022e000, 0xc0001a2100)
        /usr/local/go/src/net/http/server.go:2843 +0xa3
net/http.(*conn).serve(0xc00019e1e0, 0x9dc780, 0xc000020080)
        /usr/local/go/src/net/http/server.go:1925 +0x8ad
created by net/http.(*Server).Serve
        /usr/local/go/src/net/http/server.go:2969 +0x36c

goroutine 644 [IO wait]:
internal/poll.runtime_pollWait(0x7f7f403cbd58, 0x72, 0x9d3ba0)
        /usr/local/go/src/runtime/netpoll.go:220 +0x55
internal/poll.(*pollDesc).wait(0xc00017e098, 0x72, 0x9d3b00, 0xe5fb50, 0x0)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc00017e080, 0xc00009a0d1, 0x1, 0x1, 0x0, 0x0, 0x0)
        /usr/local/go/src/internal/poll/fd_unix.go:159 +0x1a5
net.(*netFD).Read(0xc00017e080, 0xc00009a0d1, 0x1, 0x1, 0x3, 0xc00040a900, 0xc0004aff78)
        /usr/local/go/src/net/fd_posix.go:55 +0x4f
net.(*conn).Read(0xc0000ca000, 0xc00009a0d1, 0x1, 0x1, 0x0, 0x0, 0x0)
        /usr/local/go/src/net/net.go:182 +0x8e
net/http.(*connReader).backgroundRead(0xc00009a0c0)
        /usr/local/go/src/net/http/server.go:690 +0x58
created by net/http.(*connReader).startBackgroundRead
        /usr/local/go/src/net/http/server.go:686 +0xd5

goroutine 647 [chan receive]:
github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).collectMetrics(0xc0001be000, 0xc00041cca0, 0x4114b4, 0xc00041cfb8)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:152 +0x1b2
github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).web.func1(0x9d7840, 0xc00040a090, 0x100000000203000)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:96 +0x2a
github.com/ulranh/sapnwrfc_exporter/cmd.(*collector).Collect(0xc0001d2290, 0xc0002007e0)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:49 +0x43
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func1()
        /home/chernykh/go/pkg/mod/github.com/prometheus/client_golang@v1.8.0/prometheus/registry.go:444 +0x1a2
created by github.com/prometheus/client_golang/prometheus.(*Registry).Gather
        /home/chernykh/go/pkg/mod/github.com/prometheus/client_golang@v1.8.0/prometheus/registry.go:536 +0xe8e

goroutine 648 [select]:
github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).collectSystemsMetric(0xc0001be000, 0x0, 0x0, 0x0, 0x0)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:188 +0x2c5
github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).collectMetrics.func1(0xc000400040, 0xc000200e40, 0xc0001be000, 0x0)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:141 +0x12d
created by github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).collectMetrics
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:135 +0xfa

goroutine 646 [semacquire]:
sync.runtime_Semacquire(0xc000400038)
        /usr/local/go/src/runtime/sema.go:56 +0x45
sync.(*WaitGroup).Wait(0xc000400030)
        /usr/local/go/src/sync/waitgroup.go:130 +0x65
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func2(0xc000400030, 0xc0002007e0, 0xc000200840)
        /home/chernykh/go/pkg/mod/github.com/prometheus/client_golang@v1.8.0/prometheus/registry.go:461 +0x2b
created by github.com/prometheus/client_golang/prometheus.(*Registry).Gather
        /home/chernykh/go/pkg/mod/github.com/prometheus/client_golang@v1.8.0/prometheus/registry.go:460 +0x60d

goroutine 645 [runnable]:
strconv.ParseUint(0xc000128850, 0x1, 0xa, 0x40, 0x0, 0x0, 0x0)
        /usr/local/go/src/strconv/atoi.go:60 +0x725
github.com/prometheus/procfs.FS.Stat(0x945e22, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/chernykh/go/pkg/mod/github.com/prometheus/procfs@v0.2.0/stat.go:196 +0xf65
github.com/prometheus/procfs.ProcStat.StartTime(0x44ca, 0xc0004000c0, 0xf, 0xe77350, 0x1, 0x1, 0x44c8, 0x44c8, 0x0, 0xffffffffffffffff, ...)
        /home/chernykh/go/pkg/mod/github.com/prometheus/procfs@v0.2.0/proc_stat.go:182 +0x50
github.com/prometheus/client_golang/prometheus.(*processCollector).processCollect(0xc000068820, 0xc0002007e0)
        /home/chernykh/go/pkg/mod/github.com/prometheus/client_golang@v1.8.0/prometheus/process_collector_other.go:44 +0xd39
github.com/prometheus/client_golang/prometheus.(*processCollector).Collect(0xc000068820, 0xc0002007e0)
        /home/chernykh/go/pkg/mod/github.com/prometheus/client_golang@v1.8.0/prometheus/process_collector.go:140 +0x33
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func1()
        /home/chernykh/go/pkg/mod/github.com/prometheus/client_golang@v1.8.0/prometheus/registry.go:444 +0x1a2
created by github.com/prometheus/client_golang/prometheus.(*Registry).Gather
        /home/chernykh/go/pkg/mod/github.com/prometheus/client_golang@v1.8.0/prometheus/registry.go:455 +0x5ce

goroutine 649 [semacquire]:
sync.runtime_Semacquire(0xc000400048)
        /usr/local/go/src/runtime/sema.go:56 +0x45
sync.(*WaitGroup).Wait(0xc000400040)
        /usr/local/go/src/sync/waitgroup.go:130 +0x65
github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).collectMetrics.func2(0xc000400040, 0xc000200e40)
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:147 +0x2b
created by github.com/ulranh/sapnwrfc_exporter/cmd.(*Config).collectMetrics
        /usr/local/sap/sapnwrfc_exporter/cmd/exporter.go:146 +0x148

ulranh commented 3 years ago

Perhaps it is related to the gorfc library. Did you check whether all prerequisites are fulfilled: environment variables, current version of the nwrfcsdk?

ChernykhAN commented 3 years ago

It's a bug in SAP NW RFC SDK PL 7. It should be fixed in PL 8. https://github.com/SAP/gorfc/issues/28#issuecomment-732819870