powerapi-ng / hwpc-sensor

Hardware Performance Counters monitoring agent for containers.
BSD 3-Clause "New" or "Revised" License

Sensor problem with socket output #15

Closed: lastrecrue closed this issue 2 years ago

lastrecrue commented 2 years ago

Hi, when I run the sensor with this configuration file, I don't get any error, but it doesn't work.

My config with the socket output:

{
  "name": "sensor",
  "verbose": true,
  "frequency": 500,
  "output": {
    "type": "socket",
    "uri": "localhost",
    "port": 8080
  },
  "system": {
    "rapl": {
      "events": ["RAPL_ENERGY_PKG"],
      "monitoring_type": "MONITOR_ONE_CPU_PER_SOCKET"
    },
    "msr": {
      "events": ["TSC", "APERF", "MPERF"]
    }
  },
  "container": {
    "core": {
      "events": [
        "CPU_CLK_THREAD_UNHALTED:REF_P",
        "CPU_CLK_THREAD_UNHALTED:THREAD_P",
        "LLC_MISSES",
        "INSTRUCTIONS_RETIRED"
      ]
    }
  }
}

My run command:

docker run --rm --net=host --privileged --pid=host -v /sys:/sys -v /var/lib/docker/containers:/var/lib/docker/containers:ro -v /tmp/powerapi-sensor-reporting:/reporting -v $(pwd):/srv -v $(pwd)/config_file.json:/config_file.json powerapi/hwpc-sensor --config-file config_file.json

Log, without the sensor's confirmation message:

I: 21-12-10 14:07:29 build: version v1.1.0 (rev: 7d735067c2dfbb82ba0cb4afea3ed4dc1331919d) (Dec  7 2021 - 17:23:33)
I: 21-12-10 14:07:29 uname: Linux 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11 15:58:17 UTC 2021 x86_64
I: 21-12-10 14:07:29 pmu: found ix86arch 'Intel X86 architectural PMU' having 7 events, 7 counters (4 general, 3 fixed)
I: 21-12-10 14:07:29 pmu: found perf 'perf_events generic PMU' having 194 events, 0 counters (0 general, 0 fixed)
I: 21-12-10 14:07:29 pmu: found hsw 'Intel Haswell' having 74 events, 11 counters (8 general, 3 fixed)
I: 21-12-10 14:07:29 pmu: found rapl 'Intel RAPL' having 3 events, 3 counters (0 general, 3 fixed)
I: 21-12-10 14:07:29 pmu: found perf_raw 'perf_events raw PMU' having 1 events, 0 counters (0 general, 0 fixed)
I: 21-12-10 14:07:29 pmu: found intel_msr 'Intel MSR' having 6 events, 6 counters (0 general, 6 fixed)

When I run the command with the MongoDB configuration:

{
  "name": "sensor",
  "verbose": true,
  "frequency": 500,
  "output": {
    "type": "mongodb",
    "uri": "mongodb://127.0.0.1",
    "database": "db_sensor",
    "collection": "report_0"
  },
  "system": {
    "rapl": {
      "events": ["RAPL_ENERGY_PKG"],
      "monitoring_type": "MONITOR_ONE_CPU_PER_SOCKET"
    },
    "msr": {
      "events": ["TSC", "APERF", "MPERF"]
    }
  },
  "container": {
    "core": {
      "events": [
        "CPU_CLK_THREAD_UNHALTED:REF_P",
        "CPU_CLK_THREAD_UNHALTED:THREAD_P",
        "LLC_MISSES",
        "INSTRUCTIONS_RETIRED"
      ]
    }
  }
}

I get these additional log lines:

I: 21-12-10 14:01:20 sensor: configuration is valid, starting monitoring...
I: 21-12-10 14:01:20 perf<all>: monitoring actor started
I: 21-12-10 14:01:20 perf<mongo>: monitoring actor started

Regards, Achraf

PierreRust commented 2 years ago

I suspect the issue here is that the sensor fails to connect to your mongodb instance. When the sensor cannot connect to its configured output, it often simply blocks and waits until the output becomes available.
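Since a blocked sensor prints no error, it can help to check from the host whether the configured output accepts connections at all. The following is a minimal sketch, assuming the output is reachable over plain TCP; the helper name `can_connect` is hypothetical and is not part of the sensor.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for diagnosing a silent sensor; it only tests
    reachability and sends no data.
    """
    try:
        # create_connection raises OSError (refused, unreachable, timeout)
        # when nothing accepts connections on the endpoint.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the socket configuration in this issue, `can_connect("127.0.0.1", 8080)` returning `False` would suggest that nothing is listening on the output port yet.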

The problem comes from the uri: localhost in your sensor's configuration: the sensor does not resolve network names, so you must currently use an IP address here. See issue #14.
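One possible workaround is to resolve the host name before the configuration file is written, so that the sensor only ever sees an IP address. This is a rough sketch under that assumption; the function name `with_ip_uri` is hypothetical, it handles only the "socket" output type, and it resolves to IPv4 only.

```python
import copy
import ipaddress
import socket


def with_ip_uri(config: dict) -> dict:
    """Return a copy of config whose socket output uri is an IP address.

    Hypothetical helper: configurations with another output type
    (for example "mongodb") are returned unchanged.
    """
    result = copy.deepcopy(config)
    output = result.get("output", {})
    if output.get("type") != "socket":
        return result
    uri = output["uri"]
    try:
        ipaddress.ip_address(uri)  # already an IP literal: keep it as is
    except ValueError:
        # Not an IP literal: resolve the host name to an IPv4 address.
        output["uri"] = socket.gethostbyname(uri)
    return result
```

Applied to the socket configuration from this issue, the "localhost" value would be replaced by whatever address the local resolver returns for it, typically 127.0.0.1. The result can then be written out with `json.dump` and passed to `--config-file`.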