influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

JSON_v2 parser error: fields or tags of one object may affect the contents or availability of another object #15892

Closed dngrudin closed 2 weeks ago

dngrudin commented 1 month ago

Relevant telegraf.conf

[[inputs.file]]
  files = ["./input.json"]
  data_format = "json_v2"

  [[inputs.file.json_v2]]
    [[inputs.file.json_v2.object]]
      path = "counters.thread"
      [[inputs.file.json_v2.object.tag]]
        path = "pools.#.name"
      [[inputs.file.json_v2.object.field]]
        path = "pools.#.active"

    [[inputs.file.json_v2.object]]
      path = "errors.type"
      [[inputs.file.json_v2.object.tag]]
        path = "error_list.#.name"
      [[inputs.file.json_v2.object.field]]
        path = "error_list.#.count"

[agent]
  interval = "2s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  hostname = ""
  omit_hostname = false

[[outputs.file]]
  files = [ "./metrics.prometheus"]
  data_format = "prometheus"

Logs from Telegraf

2024-09-16T09:11:17Z I! Loading config: ./telegraf.conf
2024-09-16T09:11:17Z I! Starting Telegraf 1.32.0-a0755797 brought to you by InfluxData the makers of InfluxDB
2024-09-16T09:11:17Z I! Available plugins: 235 inputs, 9 aggregators, 32 processors, 26 parsers, 62 outputs, 6 secret-stores
2024-09-16T09:11:17Z I! Loaded inputs: file
2024-09-16T09:11:17Z I! Loaded aggregators:
2024-09-16T09:11:17Z I! Loaded processors:
2024-09-16T09:11:17Z I! Loaded secretstores:
2024-09-16T09:11:17Z I! Loaded outputs: file
2024-09-16T09:11:17Z I! Tags enabled: host=vm.example.com
2024-09-16T09:11:17Z D! [agent] Initializing plugins
2024-09-16T09:11:17Z D! [agent] Connecting outputs
2024-09-16T09:11:17Z D! [agent] Attempting connection to [outputs.file]
2024-09-16T09:11:17Z D! [agent] Successfully connected to outputs.file
2024-09-16T09:11:17Z D! [agent] Starting service inputs
2024-09-16T09:11:17Z D! [agent] Stopping service inputs
2024-09-16T09:11:17Z D! [agent] Input channel closed
2024-09-16T09:11:17Z I! [agent] Hang on, flushing any cached metrics before shutdown
2024-09-16T09:11:17Z D! [outputs.file] Wrote batch of 1 metrics in 123.1µs
2024-09-16T09:11:17Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2024-09-16T09:11:17Z I! [agent] Stopping running outputs
2024-09-16T09:11:17Z D! [agent] Stopped Successfully

System info

Telegraf 1.32.0-a0755797

Docker

No response

Steps to reproduce

  1. Create a file input.json with the content:
    {
      "counters": {
        "thread": {
          "pools": [
            {
              "name": "main",
              "active": 1
            }
          ]
        }
      },
      "errors": {
        "type": {
          "total_errors": 3,
          "error_list": [
            {
              "name": "SomeError",
              "count": 3
            }
          ]
        }
      }
    }
  2. Start Telegraf: ./telegraf --once --config ./telegraf.conf

Expected behavior

Contents of metrics.prometheus file:

# HELP file_pools_active Telegraf collected metric
# TYPE file_pools_active untyped
file_pools_active{host="vm.example.com",pools_name="main"} 1
# HELP file_error_list_count Telegraf collected metric
# TYPE file_error_list_count untyped
file_error_list_count{error_list_name="SomeError",host="vm.example.com"} 3

Actual behavior

Contents of metrics.prometheus file:

# HELP file_pools_active Telegraf collected metric
# TYPE file_pools_active untyped
file_pools_active{host="vm.example.com",pools_name="main"} 1

Additional info

The result depends on the offset, within the parent element, of the fields whose values must be found. A minimal change to the input data, for example changing the value of total_errors from 3 to 30, makes the output correct. Modified input data:

{
  "counters": {
    "thread": {
      "pools": [
        {
          "name": "main",
          "active": 1
        }
      ]
    }
  },
  "errors": {
    "type": {
      "total_errors": 30,
      "error_list": [
        {
          "name": "SomeError",
          "count": 3
        }
      ]
    }
  }
}

Contents of metrics.prometheus file:

# HELP file_pools_active Telegraf collected metric
# TYPE file_pools_active untyped
file_pools_active{host="vm.example.com",pools_name="main"} 1
# HELP file_error_list_count Telegraf collected metric
# TYPE file_error_list_count untyped
file_error_list_count{error_list_name="SomeError",host="vm.example.com"} 3
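
To illustrate why such a small change can matter, here is a minimal Go sketch, independent of Telegraf, showing that adding one digit to total_errors shifts the byte offset of everything that follows it in the parent object. The helper keyOffset and the compact JSON strings are made up for illustration. Whether the parser really keys its lookups by such offsets is an assumption based on the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// Compact forms of the errors.type object, before and after the 3 -> 30 change.
const (
	errorsBefore = `{"total_errors":3,"error_list":[{"name":"SomeError","count":3}]}`
	errorsAfter  = `{"total_errors":30,"error_list":[{"name":"SomeError","count":3}]}`
)

// keyOffset is a hypothetical helper: it returns the byte offset of the
// quoted key inside doc, or -1 if the key does not occur.
func keyOffset(doc, key string) int {
	return strings.Index(doc, `"`+key+`"`)
}

func main() {
	// total_errors starts at the same offset in both documents, while
	// error_list moves one byte further once the value has two digits.
	for _, key := range []string{"total_errors", "error_list"} {
		fmt.Printf("%-12s before=%d after=%d\n",
			key, keyOffset(errorsBefore, key), keyOffset(errorsAfter, key))
	}
}
```

If results are matched by offset, a one-byte shift like this is enough to turn an accidental collision with another object's offset into a miss, which would be consistent with the output becoming correct.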

Presumably the problem is that the subPathResults variable may contain the results of parsing several objects. Later, when the existsInpathResults function is called, it may return the data of another object that happens to match the specified index, although an empty result should have been returned.

In my opinion, a possible solution is to reset the subPathResults variable inside the processObjects function on each iteration over the objects variable. Code snippet:

func (p *Parser) processObjects(input []byte, objects []Object, timestamp time.Time) ([]telegraf.Metric, error) {
    p.iterateObjects = true
    var t []telegraf.Metric
    for _, c := range objects {
        p.subPathResults = nil // proposed fix: drop results cached for the previous object
        p.objectConfig = c
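
The effect of that reset can be shown with a simplified, self-contained model. This is only a sketch under assumptions: the types pathResult and object, the function foreignHits, and all offsets are invented for illustration and are not Telegraf's actual code. It only mimics a cache of results that is looked up by index while several objects are processed in sequence.

```go
package main

import "fmt"

// pathResult is a hypothetical cached match: the object it came from and
// the offset (index) under which it was stored.
type pathResult struct {
	owner string
	index int
}

// object is a hypothetical configured object: the offsets it caches for its
// own tag/field paths and the offsets it looks up while being walked.
type object struct {
	path    string
	matches []int
	probes  []int
}

// sampleObjects returns two objects whose made-up offsets collide once:
// errors.type probes offset 40, which only counters.thread has cached.
func sampleObjects() []object {
	return []object{
		{path: "counters.thread", matches: []int{40, 60}, probes: []int{40, 60}},
		{path: "errors.type", matches: []int{70, 95}, probes: []int{40, 70, 95}},
	}
}

// foreignHits processes the objects in order and reports every lookup that
// returned a cached result belonging to a different object.
func foreignHits(objects []object, resetPerObject bool) []string {
	var cache []pathResult
	var hits []string
	for _, obj := range objects {
		if resetPerObject {
			cache = nil // mirrors the proposed reset at the top of the loop
		}
		for _, idx := range obj.matches {
			cache = append(cache, pathResult{owner: obj.path, index: idx})
		}
		for _, idx := range obj.probes {
			for _, r := range cache {
				if r.index == idx && r.owner != obj.path {
					hits = append(hits, fmt.Sprintf("%s@%d matched entry from %s", obj.path, idx, r.owner))
				}
			}
		}
	}
	return hits
}

func main() {
	fmt.Println("shared cache:    ", foreignHits(sampleObjects(), false))
	fmt.Println("reset per object:", foreignHits(sampleObjects(), true))
}
```

In this model the shared cache yields one lookup that is answered with data from the other object, while resetting the cache per object yields none, which is the behavior the proposed change aims for.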

srebhan commented 4 weeks ago

I suggest using the xpath parser with

[[inputs.file]]
  files = ["test_configs/jsontest_issue_15892.json"]

  data_format = "xpath_json"
  xpath_native_types = true

  [[inputs.file.xpath]]
    metric_name = "'pools'"
    metric_selection = "/counters/thread/pools/*"
    [inputs.file.xpath.fields_int]
      active = "active"
    [inputs.file.xpath.tags]
      name = "name"

  [[inputs.file.xpath]]
    metric_name = "'errors'"
    metric_selection = "/errors/type/error_list/*"
    [inputs.file.xpath.fields_int]
      count = "count"
    [inputs.file.xpath.tags]
      name = "name"

  [[inputs.file.xpath]]
    metric_name = "'errors'"
    [inputs.file.xpath.fields_int]
      count = "/errors/type/total_errors"
    [inputs.file.xpath.tags]
      name = "'total'"

which results in the following metrics (in line-protocol) for your example

> pools,host=Hugin,name=main active=1i 1727880774000000000
> errors,host=Hugin,name=SomeError count=3i 1727880774000000000
> errors,host=Hugin,name=total count=3i 1727880774000000000

telegraf-tiger[bot] commented 2 weeks ago

Hello! I am closing this issue due to inactivity. I hope you were able to resolve your problem. If not, please try posting this question in our Community Slack or Community Forums, or provide additional details in this issue and request that it be re-opened. Thank you!