influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

Unable to read Parse Valid Json #2301

Closed aschuhl closed 7 years ago

aschuhl commented 7 years ago

I am using the JSON example from the README on data formats: https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md

```json
[
    {
        "a": 5,
        "b": {
            "c": 6
        },
        "ignored": "I'm a string"
    }
]
```

Here is my config file using the tail input:

```toml
# Telegraf Configuration
#
# Telegraf is entirely plugin driven. All metrics are gathered from the
# declared inputs, and sent to the declared outputs.
#
# Plugins must be declared in here to be active.
# To deactivate a plugin, comment out the name and any variables.
#
# Use 'telegraf -config telegraf.conf -test' to see what metrics a config
# file would generate.
#
# Environment variables can be used anywhere in this config file, simply prepend
# them with $. For strings the variable must be within quotes (ie, "$STR_VAR"),
# for numbers and booleans they should be plain (ie, $INT_VAR, $BOOL_VAR)

# Global tags can be specified here in key="value" format.
[global_tags]
  # dc = "us-east-1" # will tag all metrics with dc=us-east-1
  # rack = "1a"
  ## Environment variables can be used as tags, and throughout the config file
  # user = "$USER"

# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "10s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true

  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 1000

  ## For failed writes, telegraf will cache metric_buffer_limit metrics for each
  ## output, and will flush this buffer on a successful write. Oldest metrics
  ## are dropped first when this buffer fills.
  ## This buffer only fills when writes fail to output plugin(s).
  metric_buffer_limit = 10000

  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"

  ## Default flushing interval for all outputs. You shouldn't set this below
  ## interval. Maximum flush_interval will be flush_interval + flush_jitter
  flush_interval = "10s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "0s"

  ## By default, precision will be set to the same timestamp order as the
  ## collection interval, with the maximum being 1s.
  ## Precision will NOT be used for service inputs, such as logparser and statsd.
  ## Valid values are "ns", "us" (or "µs"), "ms", "s".
  precision = ""

  ## Logging configuration:
  ## Run telegraf with debug log messages.
  debug = false
  ## Run telegraf in quiet mode (error log messages only).
  quiet = false
  ## Specify the log file name. The empty string means to log to stderr.
  logfile = ""

  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  ## If set to true, do not set the "host" tag in the telegraf agent.
  omit_hostname = false

###############################################################################
#                            OUTPUT PLUGINS                                   #
###############################################################################

# Send telegraf metrics to file(s)
[[outputs.file]]
  ## Files to write to, "stdout" is a specially handled file.
  files = ["stdout", "metrics.out"]

  ## Data format to output.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
  data_format = "json"

###############################################################################
#                            PROCESSOR PLUGINS                                #
###############################################################################

# # Print all metrics that pass through this filter.
# [[processors.printer]]

###############################################################################
#                            AGGREGATOR PLUGINS                               #
###############################################################################

# # Keep the aggregate min/max of each metric passing through.
# [[aggregators.minmax]]
#   ## General Aggregator Arguments:
#   ## The period on which to flush & clear the aggregator.
#   period = "30s"
#   ## If true, the original metric will be dropped by the
#   ## aggregator and will not get sent to the output plugins.
#   drop_original = false

###############################################################################
#                            INPUT PLUGINS                                    #
###############################################################################

###############################################################################
#                            SERVICE INPUT PLUGINS                            #
###############################################################################

# Stream a log file, like the tail -f command
[[inputs.tail]]
  ## files to tail.
  ## These accept standard unix glob matching rules, but with the addition of
  ## ** as a "super asterisk". ie:
  ##   "/var/log/**.log"     -> recursively find all .log files in /var/log
  ##   "/var/log/*/*.log"    -> find all .log files with a parent dir in /var/log
  ##   "/var/log/apache.log" -> just tail the apache log file
  ##
  ## See https://github.com/gobwas/glob for more examples
  ##
  files = ["test.json"]
  ## Read file from beginning.
  from_beginning = true

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "json"
  ## tag_keys = [
  ## ]
```

Running `./telegraf.exe -config telegraf.conf`

Expected output:

```
tail, a=5,b_c=6
```

Actual output:

```
2017/01/20 15:39:00 I! Starting Telegraf (version 1.1.2)
2017/01/20 15:39:00 I! Loaded outputs: file
2017/01/20 15:39:00 I! Loaded inputs: inputs.tail
2017/01/20 15:39:00 I! Tags enabled: host=patapl-BLMRQ72
2017/01/20 15:39:00 I! Agent Config: Interval:10s, Quiet:false, Hostname:"patapl-BLMRQ72", Flush Interval:10s
2017/01/20 15:39:00 Seeked test.json - &{Offset:0 Whence:0}
2017/01/20 15:39:00 E! Malformed log line in test.json: [[], Error: unable to parse out as JSON, unexpected end of JSON input
2017/01/20 15:39:00 E! Malformed log line in test.json: [    {], Error: unable to parse out as JSON, unexpected end of JSON input
2017/01/20 15:39:00 E! Malformed log line in test.json: [        "a": 5,], Error: unable to parse out as JSON, invalid character ':' after top-level value
2017/01/20 15:39:00 E! Malformed log line in test.json: [        "b": {], Error: unable to parse out as JSON, invalid character ':' after top-level value
2017/01/20 15:39:00 E! Malformed log line in test.json: [            "c": 6], Error: unable to parse out as JSON, invalid character ':' after top-level value
2017/01/20 15:39:00 E! Malformed log line in test.json: [        },], Error: unable to parse out as JSON, invalid character '}' looking for beginning of value
2017/01/20 15:39:00 E! Malformed log line in test.json: [        "ignored": "I'm a string"], Error: unable to parse out as JSON, invalid character ':' after top-level value
2017/01/20 15:39:00 E! Malformed log line in test.json: [    }], Error: unable to parse out as JSON, invalid character '}' looking for beginning of value
```

It is valid JSON, though, so I'm not sure why Telegraf can't parse it.
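The errors line up with the tail input handing the parser one line at a time: no single line of the pretty-printed example is a complete JSON document on its own. A minimal Python sketch of that failure mode (Telegraf's actual parser is Go; this is only an illustration):

```python
import json

# The README example, pretty-printed across several lines.
pretty = """[
    {
        "a": 5,
        "b": {
            "c": 6
        },
        "ignored": "I'm a string"
    }
]"""

# Feeding one line at a time mimics the tail input: every line fails,
# while the document parses fine as a whole.
for line in pretty.splitlines():
    try:
        json.loads(line)
    except json.JSONDecodeError as exc:
        print(f"malformed line {line!r}: {exc.msg}")

whole = json.loads(pretty)  # the full document is valid JSON
```

Note that a line like `"a": 5,` fails with "invalid character ':' after top-level value" in Go terms because `"a"` alone is already a complete JSON value, which matches the log output above.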

sparrc commented 7 years ago

JSON arrays are not supported until version 1.2; sorry for the confusion.

aschuhl commented 7 years ago

@sparrc Okay, so are the other data types supported? Is there a link where I can verify which types work? I thought the README said any of the Influx data types would work, so I just want to be clear moving forward. Thanks for your help.

sparrc commented 7 years ago

I think you also need to put your JSON on a single line; the README is just showing it that way for human readability.
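As a sketch of that workaround (the helper name and file names here are illustrative, not part of Telegraf), a pretty-printed JSON array can be rewritten so each object occupies exactly one line before the tail input reads it:

```python
import json


def flatten_json_file(src: str, dst: str) -> None:
    """Rewrite a pretty-printed JSON array so each element is one line."""
    with open(src) as f:
        doc = json.load(f)  # parse the whole top-level array at once

    with open(dst, "w") as f:
        for obj in doc:
            # json.dumps with default settings emits no newlines,
            # so each object becomes a single parseable line.
            f.write(json.dumps(obj) + "\n")
```

Each emitted line is then a complete JSON document, which is what a line-oriented parser needs.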

If you want to put a note about that in a PR that would be appreciated.