mozilla-services / heka

DEPRECATED: Data collection and processing made easy.
http://hekad.readthedocs.org/

InfluxDB line encoder doesn't work with CPU stats because it contains array fields #1878

Open McStork opened 8 years ago

McStork commented 8 years ago

Hello, I'm trying to use Heka with InfluxDB to monitor CPU and disk resources. It works for the disk stats but fails to capture the CPU metrics, and it does so without outputting any debug error.

What works:

[hekad]
maxprocs = 2

[DiskStatsDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/linux_diskstats.lua"

[DiskStats]
type = "FilePollingInput"
ticker_interval = %ENV[TICKER_INTERVAL]
file_path = "/sys/block/sdb/stat"
decoder = "DiskStatsDecoder"

[InfluxdbLineEncoder]
type = "SandboxEncoder"
filename = "lua_encoders/schema_influx_line.lua"

    [InfluxdbLineEncoder.config]
    skip_fields = "**all_base** FilePath NumProcesses Environment TickerInterval"
    tag_fields = "Hostname"
    timestamp_precision = "s"

[InfluxdbOutput]
type = "HttpOutput"
message_matcher = "Type =~ /stats.*/"
encoder = "InfluxdbLineEncoder"
address = "%ENV[INFLUX_DB_HOST]"
http_timeout = %ENV[TICKER_INTERVAL]

What fails:

[hekad]
maxprocs = 2

[ProcStatDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/linux_procstat.lua"

[ProcStats]
type = "FilePollingInput"
ticker_interval = %ENV[TICKER_INTERVAL]
file_path = "/proc/stat"
decoder = "ProcStatDecoder"

[InfluxdbLineEncoder]
type = "SandboxEncoder"
filename = "lua_encoders/schema_influx_line.lua"

    [InfluxdbLineEncoder.config]
    skip_fields = "**all_base** FilePath NumProcesses Environment TickerInterval"
    tag_fields = "Hostname"
    timestamp_precision = "s"

[InfluxdbOutput]
type = "HttpOutput"
message_matcher = "Type =~ /stats.*/"
encoder = "InfluxdbLineEncoder"
address = "%ENV[INFLUX_DB_HOST]"
http_timeout = %ENV[TICKER_INTERVAL]

Also, changing the output of the first configuration to LogOutput doesn't output any data.

Config:

[hekad]
maxprocs = 2 

[DiskStats]
type = "FilePollingInput"
ticker_interval = 1 
file_path = "/sys/block/sdb/stat"
decoder = "DiskStatsDecoder"

[DiskStatsDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/linux_diskstats.lua"

[PayloadEncoder]
append_newlines = true

[LogOutput]
message_matcher = "Type =~ /stats.*/"
encoder = "PayloadEncoder"

Run:

2016/03/08 13:07:14 Pre-loading: [DiskStats]
2016/03/08 13:07:14 Pre-loading: [DiskStatsDecoder]
2016/03/08 13:07:14 Pre-loading: [PayloadEncoder]
2016/03/08 13:07:14 Pre-loading: [LogOutput]
2016/03/08 13:07:14 Pre-loading: [ProtobufDecoder]
2016/03/08 13:07:14 Loading: [ProtobufDecoder]
2016/03/08 13:07:14 Pre-loading: [ProtobufEncoder]
2016/03/08 13:07:14 Loading: [ProtobufEncoder]
2016/03/08 13:07:14 Pre-loading: [TokenSplitter]
2016/03/08 13:07:14 Loading: [TokenSplitter]
2016/03/08 13:07:14 Pre-loading: [HekaFramingSplitter]
2016/03/08 13:07:14 Loading: [HekaFramingSplitter]
2016/03/08 13:07:14 Pre-loading: [NullSplitter]
2016/03/08 13:07:14 Loading: [NullSplitter]
2016/03/08 13:07:14 Loading: [DiskStatsDecoder]
2016/03/08 13:07:14 Loading: [PayloadEncoder]
2016/03/08 13:07:14 Loading: [DiskStats]
2016/03/08 13:07:14 Loading: [LogOutput]
2016/03/08 13:07:14 Starting hekad...
2016/03/08 13:07:14 Output started: LogOutput
2016/03/08 13:07:14 MessageRouter started.
2016/03/08 13:07:14 Input started: DiskStats
2016/03/08 13:07:15 
2016/03/08 13:07:16 
2016/03/08 13:07:17 
2016/03/08 13:07:18 
2016/03/08 13:07:19

I'm new to Heka, so I may be missing something obvious, but I have checked multiple times and can't see where the issue could come from.

simonpasquier commented 8 years ago

For the record, it is more appropriate to use the Heka mailing list or the #heka IRC channel.

For testing, I would advise using RstEncoder instead of PayloadEncoder; you should then see the messages injected by the ProcStats plugin. Here is a sample output from a quick test:

2016/03/08 14:37:43 
:Timestamp: 2016-03-08 13:37:43 +0000 UTC
:Type: stats.procstat
:Hostname: simon-trusty
:Pid: 0
:Uuid: 35da0116-fd74-4423-ad43-523de6482380
:Logger: ProcStats
:Payload: 
:EnvVersion: 
:Severity: 7
:Fields:
    | name:"intr" type:double value:[2.470343249e+09,50,195826,0,0,0,0,0,0,0,47,0,0,17616,0,0,193085,105085,112168,0,2.330477e+06,2.364686341e+09,1.073452e+06,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
    | name:"procs_blocked" type:double value:1
    | name:"softirq" type:double value:[8.0686994e+07,19,3.2565104e+07,88522,4.822638e+06,994119,0,171904,3.0934711e+07,0,1.1109977e+07]
    | name:"btime" type:double value:1.457251235e+09
    | name:"procs_running" type:double value:2
    | name:"ctxt" type:double value:4.70790817e+09
    | name:"cpu0" type:double value:[263231,4612,3.889563e+06,8.385486e+06,6612,0,11452,0,0,0]
    | name:"cpu" type:double value:[920220,14333,4.295187e+06,2.613972e+07,27892,0,32557,0,0,0]
    | name:"cpu1" type:double value:[656988,9721,405623,1.7754234e+07,21279,0,21104,0,0,0]
    | name:"processes" type:double value:211224
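
A minimal sketch of the debugging setup suggested above, assuming the [ProcStats] input and [ProcStatDecoder] sections from the failing configuration are kept unchanged. PayloadEncoder prints only the message payload, which is empty for these messages (see the :Payload: line in the sample), whereas RstEncoder prints every header and field:

```toml
# Built-in encoder; no options are needed for basic use.
[RstEncoder]

[LogOutput]
message_matcher = "Type =~ /stats.*/"
encoder = "RstEncoder"
```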

McStork commented 8 years ago

@simonpasquier Thanks for your answer

I tried RstEncoder with LogOutput and it works well. Then I tried InfluxdbLineEncoder with LogOutput, and the result seems to explain why InfluxDB does not receive any data:

2016/03/08 14:49:20 Output started: LogOutput
2016/03/08 14:49:20 MessageRouter started.
2016/03/08 14:49:20 Input started: ProcStats
2016/03/08 14:49:21 Plugin 'LogOutput' error: Error encoding message: FATAL: process_message() not enough memory
2016/03/08 14:49:22 Plugin 'LogOutput' error: Error encoding message: FATAL: process_message() not enough memory
2016/03/08 14:49:23 Plugin 'LogOutput' error: Error encoding message: FATAL: process_message() not enough memory
2016/03/08 14:49:24 Plugin 'LogOutput' error: Error encoding message: FATAL: process_message() not enough memory

McStork commented 8 years ago

The error goes away when "intr" is added to skip_fields in InfluxdbLineEncoder.config. So IMO there could be two issues:

simonpasquier commented 8 years ago

If you increase the output_limit parameter [1] to some insane value (e.g. 1 MiB), you'll get the encoded message. IIUC the InfluxDB encoder doesn't work well with fields holding array values.

<snip>
intr_vidx_1_vidx_2_vidx_3_vidx_4_vidx_5_vidx_6_vidx_7_vidx_8_vidx_9_vidx_10_vidx_11_vidx_12_vidx_13_vidx_14_vidx_15_vidx_16_vidx_17_vidx_18_vidx_19_vidx_20_vidx_21_vidx_22_vidx_23_vidx_24_vidx_25_vidx_26_vidx_27_vidx_28_vidx_29_vid
x_30_vidx_31_vidx_32_vidx_33_vidx_34_vidx_35_vidx_36_vidx_37_vidx_38_vidx_39_vidx_40_vidx_41_vidx_42_vidx_43_vidx_44_vidx_45_vidx_46_vidx_47_vidx_48_vidx_49_vidx_50_vidx_51_vidx_52_vidx_53_vidx_54_vidx_55_vidx_56_vidx_57_vidx_58_vi
dx_59_vidx_60_vidx_61_vidx_62_vidx_63_vidx_64_vidx_65_vidx_66_vidx_67_vidx_68_vidx_69_vidx_70_vidx_71_vidx_72_vidx_73_vidx_74_vidx_75_vidx_76_vidx_77_vidx_78_vidx_79_vidx_80_vidx_81_vidx_82_vidx_83_vidx_84_vidx_85_vidx_86_vidx_87_v
idx_88_vidx_89_vidx_90_vidx_91_vidx_92_vidx_93_vidx_94_vidx_95_vidx_96_vidx_97_vidx_98_vidx_99_vidx_100_vidx_101_vidx_102_vidx_103_vidx_104_vidx_105_vidx_106_vidx_107_vidx_108_vidx_109_vidx_110_vidx_111_vidx_112_vidx_113_vidx_114_v
idx_115_vidx_116_vidx_117_vidx_118_vidx_119_vidx_120_vidx_121_vidx_122_vidx_123_vidx_124_vidx_125_vidx_126_vidx_127_vidx_128_vidx_129_vidx_130_vidx_131_vidx_132,Hostname=simon-trusty value=0.000000 1457450505
intr_vidx_1_vidx_2_vidx_3_vidx_4_vidx_5_vidx_6_vidx_7_vidx_8_vidx_9_vidx_10_vidx_11_vidx_12_vidx_13_vidx_14_vidx_15_vidx_16_vidx_17_vidx_18_vidx_19_vidx_20_vidx_21_vidx_22_vidx_23_vidx_24_vidx_25_vidx_26_vidx_27_vidx_28_vidx_29_vid
x_30_vidx_31_vidx_32_vidx_33_vidx_34_vidx_35_vidx_36_vidx_37_vidx_38_vidx_39_vidx_40_vidx_41_vidx_42_vidx_43_vidx_44_vidx_45_vidx_46_vidx_47_vidx_48_vidx_49_vidx_50_vidx_51_vidx_52_vidx_53_vidx_54_vidx_55_vidx_56_vidx_57_vidx_58_vi
dx_59_vidx_60_vidx_61_vidx_62_vidx_63_vidx_64_vidx_65_vidx_66_vidx_67_vidx_68_vidx_69_vidx_70_vidx_71_vidx_72_vidx_73_vidx_74_vidx_75_vidx_76_vidx_77_vidx_78_vidx_79_vidx_80_vidx_81_vidx_82_vidx_83_vidx_84_vidx_85_vidx_86_vidx_87_v
idx_88_vidx_89_vidx_90_vidx_91_vidx_92_vidx_93_vidx_94_vidx_95_vidx_96_vidx_97_vidx_98_vidx_99_vidx_100_vidx_101_vidx_102_vidx_103_vidx_104_vidx_105_vidx_106_vidx_107_vidx_108_vidx_109_vidx_110_vidx_111_vidx_112_vidx_113_vidx_114_v
idx_115_vidx_116_vidx_117_vidx_118_vidx_119_vidx_120_vidx_121_vidx_122_vidx_123_vidx_124_vidx_125_vidx_126_vidx_127_vidx_128_vidx_129_vidx_130_vidx_131_vidx_132_vidx_133_vidx_134_vidx_135_vidx_136_vidx_137_vidx_138_vidx_139_vidx_14
0_vidx_141_vidx_142_vidx_143_vidx_144_vidx_145_vidx_146_vidx_147_vidx_148_vidx_149_vidx_150,Hostname=simon-trusty value=0.000000 1457450505
intr_vidx_1_vidx_2_vidx_3_vidx_4_vidx_5_vidx_6_vidx_7_vidx_8_vidx_9_vidx_10_vidx_11_vidx_12_vidx_13_vidx_14_vidx_15_vidx_16_vidx_17_vidx_18_vidx_19_vidx_20_vidx_21_vidx_22_vidx_23_vidx_24_vidx_25_vidx_26_vidx_27_vidx_28_vidx_29_vid
x_30_vidx_31_vidx_32_vidx_33_vidx_34_vidx_35_vidx_36_vidx_37_vidx_38_vidx_39_vidx_40_vidx_41_vidx_42_vidx_43_vidx_44_vidx_45_vidx_46_vidx_47_vidx_48_vidx_49_vidx_50_vidx_51_vidx_52_vidx_53_vidx_54_vidx_55_vidx_56_vidx_57_vidx_58_vi
dx_59_vidx_60_vidx_61_vidx_62_vidx_63_vidx_64_vidx_65_vidx_66_vidx_67_vidx_68_vidx_69_vidx_70_vidx_71_vidx_72_vidx_73_vidx_74_vidx_75_vidx_76_vidx_77_vidx_78_vidx_79_vidx_80_vidx_81_vidx_82_vidx_83_vidx_84_vidx_85_vidx_86_vidx_87_v
idx_88_vidx_89_vidx_90_vidx_91_vidx_92_vidx_93_vidx_94_vidx_95_vidx_96_vidx_97_vidx_98_vidx_99_vidx_100_vidx_101_vidx_102_vidx_103_vidx_104_vidx_105_vidx_106_vidx_107_vidx_108_vidx_109_vidx_110_vidx_111_vidx_112_vidx_113_vidx_114_v
idx_115_vidx_116_vidx_117_vidx_118_vidx_119_vidx_120_vidx_121_vidx_122_vidx_123_vidx_124_vidx_125_vidx_126_vidx_127_vidx_128_vidx_129_vidx_130_vidx_131_vidx_132_vidx_133_vidx_134_vidx_135_vidx_136_vidx_137_vidx_138_vidx_139_vidx_14
0_vidx_141_vidx_142_vidx_143_vidx_144_vidx_145_vidx_146_vidx_147_vidx_148_vidx_149_vidx_150_vidx_151_vidx_152_vidx_153_vidx_154_vidx_155_vidx_156_vidx_157_vidx_158_vidx_159_vidx_160_vidx_161_vidx_162_vidx_163_vidx_164_vidx_165_vidx
_166_vidx_167_vidx_168_vidx_169_vidx_170_vidx_171_vidx_172_vidx_173_vidx_174_vidx_175_vidx_176_vidx_177_vidx_178_vidx_179_vidx_180_vidx_181_vidx_182_vidx_183_vidx_184_vidx_185_vidx_186_vidx_187_vidx_188_vidx_189_vidx_190_vidx_191_v
idx_192_vidx_193_vidx_194_vidx_195_vidx_196_vidx_197_vidx_198_vidx_199_vidx_200_vidx_201_vidx_202_vidx_203_vidx_204_vidx_205_vidx_206_vidx_207_vidx_208_vidx_209_vidx_210_vidx_211_vidx_212_vidx_213_vidx_214_vidx_215_vidx_216_vidx_21
7_vidx_218_vidx_219_vidx_220_vidx_221_vidx_222_vidx_223_vidx_224_vidx_225_vidx_226_vidx_227_vidx_228_vidx_229_vidx_230_vidx_231_vidx_232_vidx_233_vidx_234_vidx_235_vidx_236_vidx_237_vidx_238_vidx_239_vidx_240_vidx_241_vidx_242_vidx
_243_vidx_244_vidx_245_vidx_246_vidx_247_vidx_248_vidx_249_vidx_250_vidx_251_vidx_252_vidx_253_vidx_254_vidx_255_vidx_256_vidx_257_vidx_258_vidx_259_vidx_260_vidx_261_vidx_262_vidx_263_vidx_264_vidx_265_vidx_266_vidx_267_vidx_268_v
idx_269_vidx_270_vidx_271_vidx_272_vidx_273_vidx_274_vidx_275_vidx_276_vidx_277_vidx_278_vidx_279_vidx_280_vidx_281_vidx_282_vidx_283_vidx_284_vidx_285_vidx_286_vidx_287_vidx_288_vidx_289_vidx_290_vidx_291_vidx_292_vidx_293_vidx_29
4_vidx_295_vidx_296_vidx_297_vidx_298_vidx_299_vidx_300_vidx_301_vidx_302_vidx_303_vidx_304_vidx_305_vidx_306_vidx_307_vidx_308_vidx_309_vidx_310_vidx_311_vidx_312_vidx_313_vidx_314_vidx_315_vidx_316_vidx_317_vidx_318_vidx_319_vidx
_320_vidx_321_vidx_322_vidx_323_vidx_324_vidx_325_vidx_326_vidx_327_vidx_328_vidx_329_vidx_330_vidx_331_vidx_332_vidx_333_vidx_334_vidx_335_vidx_336_vidx_337_vidx_338_vidx_339_vidx_340_vidx_341_vidx_342_vidx_343_vidx_344_vidx_345_v
idx_346,Hostname=simon-trusty value=0.000000 1457450505
<snip>

[1] http://hekad.readthedocs.org/en/v0.10.0/config/common_sandbox_parameter.html
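
The two workarounds mentioned in this thread could be written roughly as follows. This is only a sketch: the output_limit value of 1048576 bytes (1 MiB) is an illustrative number taken from the "e.g." above, not a recommendation, and the accepted range should be checked against [1]. Raising the limit only lets the message encode; the array fields still come out with the very long names shown in the snippet.

```toml
[InfluxdbLineEncoder]
type = "SandboxEncoder"
filename = "lua_encoders/schema_influx_line.lua"
# Option B: raise the sandbox output buffer limit, in bytes (illustrative value).
# Uncomment to get the encoded message instead of the encoding error.
# output_limit = 1048576

    [InfluxdbLineEncoder.config]
    # Option A: skip the large "intr" array field entirely.
    skip_fields = "**all_base** FilePath NumProcesses Environment TickerInterval intr"
    tag_fields = "Hostname"
    timestamp_precision = "s"
```

Option A drops the interrupt counters from what is sent to InfluxDB, which is the trade-off for keeping the encoded output small.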

simonpasquier commented 8 years ago

IMO the issue is valid but should be renamed, something like "InfluxDB line encoder doesn't work with CPU stats because it contains array fields".

McStork commented 8 years ago

:+1: