intelsdi-x / snap-plugin-collector-pcm

Collects Intel Performance Counter Metrics (PCM)
http://snap-telemetry.io/
Apache License 2.0

PCM Plugin does not work with current PCM version #31

Closed skonefal closed 7 years ago

skonefal commented 7 years ago

Snap daemon version (use snapteld -v): 1.0.0

Environment:

What happened: Failed to load plugin. See logs attached.

INFO[2017-01-12T15:17:00+01:00] Loading plugin: /tmp/092515260/snap-plugin-collector-pcm  _module=_mgmt-rest
INFO[2017-01-12T15:17:00+01:00] plugin load called                            _block=load _module=control
INFO[2017-01-12T15:17:00+01:00] plugin load called                            _block=load-plugin _module=control-plugin-mgr path=snap-plugin-collector-pcm
DEBU[2017-01-12T15:17:00+01:00] plugin load timeout set to 3s                 _block=load-plugin _module=control-plugin-mgr path=[snap-plugin-collector-pcm]
DEBU[2017-01-12T15:17:02+01:00] panic: Timed out waiting for metrics from pcm  _module=plugin-exec io=stderr plugin=
DEBU[2017-01-12T15:17:02+01:00]                                               _module=plugin-exec io=stderr plugin=
DEBU[2017-01-12T15:17:02+01:00] goroutine 1 [running]:                        _module=plugin-exec io=stderr plugin=
DEBU[2017-01-12T15:17:02+01:00] panic(0x744dc0, 0xc4200142d0)                 _module=plugin-exec io=stderr plugin=
DEBU[2017-01-12T15:17:02+01:00]         /usr/local/go/src/runtime/panic.go:500 +0x1a1  _module=plugin-exec io=stderr plugin=
DEBU[2017-01-12T15:17:02+01:00] main.main()                                   _module=plugin-exec io=stderr plugin=
DEBU[2017-01-12T15:17:02+01:00]         /home/skonefal/dev/go/src/github.com/intelsdi-x/snap-plugin-collector-pcm/main.go:33 +0xc3  _module=plugin-exec io=stderr plugin=
ERRO[2017-01-12T15:17:03+01:00] load plugin error when starting plugin        _block=load-plugin _module=control-plugin-mgr error=timed out waiting for plugin snap-plugin-collector-pcm
ERRO[2017-01-12T15:17:03+01:00] timed out waiting for plugin snap-plugin-collector-pcm  _module=_mgmt-rest
DEBU[2017-01-12T15:17:03+01:00] Removing file (/tmp/092515260/snap-plugin-collector-pcm)  _module=_mgmt-rest
DEBU[2017-01-12T15:17:03+01:00] API response                                  _module=_mgmt-rest index=15 method=POST status=Internal Server Error status-code=500 url=/v1/plugins

What you expected to happen: The plugin loads successfully.

Steps to reproduce it (as minimally and precisely as possible):

  1. Build PCM binary from https://github.com/opcm/pcm/tree/8840be4128d4bf6113c1a42bd12e7b1c06a8d2f6 and add pcm.x to path
  2. Build PCM plugin https://github.com/intelsdi-x/snap-plugin-collector-pcm/tree/89bc06ed753dd7b20c02e2ecfc51e846b24aa38e
  3. Run snapd 1.0.0
  4. `snaptel plugin load

Anything else we need to know (e.g. the issue happens only occasionally):

nanliu commented 7 years ago

Two quick sanity checks:

skonefal commented 7 years ago

@nanliu right, my mistake: I did not read the README and did not disable the NMI watchdog. With the watchdog disabled, the plugin loads correctly. Still, it might be helpful to print the last few lines of the pcm.x binary's combined output to the snaptel log stream when something goes wrong, to make debugging easier :)

Thank you for your support, we can close this case :)
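
For anyone hitting the same load timeout, the NMI watchdog check mentioned above could be automated roughly as follows. This is a minimal sketch, not part of the plugin: it assumes a Linux host that exposes the watchdog state at `/proc/sys/kernel/nmi_watchdog`, and the helper name `parseNMIWatchdog` is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Assumption: the standard Linux sysctl file for the NMI watchdog state.
const nmiWatchdogPath = "/proc/sys/kernel/nmi_watchdog"

// parseNMIWatchdog (hypothetical helper) interprets the file contents:
// "1" means the watchdog is enabled, "0" means it is disabled.
func parseNMIWatchdog(contents string) (bool, error) {
	switch strings.TrimSpace(contents) {
	case "0":
		return false, nil
	case "1":
		return true, nil
	default:
		return false, fmt.Errorf("unexpected nmi_watchdog value %q", contents)
	}
}

func main() {
	data, err := os.ReadFile(nmiWatchdogPath)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read NMI watchdog state:", err)
		os.Exit(1)
	}
	enabled, err := parseNMIWatchdog(string(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if enabled {
		fmt.Println("NMI watchdog is enabled; disable it before loading the PCM plugin")
		return
	}
	fmt.Println("NMI watchdog is disabled")
}
```

Running a check like this before `snaptel plugin load` would turn the opaque timeout into an explicit hint about the watchdog.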

nanliu commented 7 years ago

This is related to: https://github.com/intelsdi-x/snap/issues/1466. We will work on improving plugin load error messages, so they are less opaque for debugging. Thanks again for submitting this issue.
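
The suggestion in this thread, surfacing the last few lines of the pcm.x output when something goes wrong, could be sketched as below. This is only an illustration under stated assumptions: the thread does not show how the plugin actually launches pcm.x, so the helpers `lastLines` and `runAndTail` are hypothetical, and `sh -c` stands in for the real binary.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// lastLines (hypothetical helper) returns at most n trailing
// non-empty lines of s.
func lastLines(s string, n int) []string {
	var lines []string
	for _, l := range strings.Split(s, "\n") {
		if strings.TrimSpace(l) != "" {
			lines = append(lines, l)
		}
	}
	if n < 0 {
		n = 0
	}
	if len(lines) > n {
		lines = lines[len(lines)-n:]
	}
	return lines
}

// runAndTail (hypothetical helper) runs a command and, if it fails,
// returns an error that embeds the last n lines of its combined
// stdout and stderr.
func runAndTail(n int, name string, args ...string) error {
	out, err := exec.Command(name, args...).CombinedOutput()
	if err == nil {
		return nil
	}
	tail := strings.Join(lastLines(string(out), n), "\n")
	return fmt.Errorf("%s failed: %v; last output:\n%s", name, err, tail)
}

func main() {
	// Stand-in for pcm.x: prints a few lines, then exits non-zero.
	err := runAndTail(2, "sh", "-c", "echo one; echo two; echo three >&2; exit 1")
	if err != nil {
		fmt.Println(err)
	}
}
```

Including the tail in the returned error means whatever the child process printed last (for example, a complaint about the NMI watchdog) would show up next to the timeout message instead of being lost.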