mutz0623 / zbx_history2json

Real-time exporting module for Zabbix history data, implemented as a Zabbix loadable module.

Feature request - add Application and maybe HostGroups #2

Open jedd opened 3 years ago

jedd commented 3 years ago

Howdi,

I've zero idea how hard this would be, or if it's even feasible via the ABI that the Zabbix server provides.

We're using the JSON data output from here to duplicate history metrics into ElasticSearch. (The other two options we've discovered -- DirEntry & HistoryStorageURL -- aren't good or complete fits for this task.)

The problem for us is we're then missing Application grouping for the Items.

We'd also like to get HostGroup information along with each host & item entry that is generated by zbx_history2json.

Application is easy (on the output side) insofar as there's only one Application per item, and hence per line of JSON output.

HostGroup is more complex, as there are potentially 0, 1, or many HostGroups for a given host - so I'd guess the output would need to have a JSON array for HostGroups within each line of output.
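Something like this, perhaps - a purely hypothetical line, with field names invented for illustration rather than taken from the module's actual output:

```python
import json

# Hypothetical shape for one line of output, with HostGroups as an array.
# The "application" and "hostgroups" field names are illustrative only.
record = {
    "host": "web-01",
    "key": "system.cpu.load[,avg1]",
    "application": "CPU",                           # one Application per item
    "hostgroups": ["Linux servers", "Production"],  # 0..n HostGroups per host
    "clock": 1616630400,
    "value": 0.42,
}
print(json.dumps(record))
```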

My questions therefore are -- is this something that's feasible to extract, using this loadable module approach - and if so, is it something that you'd be able & willing to add?

I did some C coding last century, and have started to poke around with the source here, but Zabbix server plumbing is extra-opaque to me. : )

cheers, Jedd.

i-ky commented 3 years ago

Hi!

I was just passing by.

The other two options we've discovered -- DirEntry & HistoryStorageURL -- aren't good or complete fits for this task.

What is DirEntry? Did you mean ExportDir and real-time export? If yes, I would be curious to know why it does not fit your needs.

jedd commented 3 years ago

Good question. Another colleague was evaluating that one, while I was playing with zbx_history2json.

Evidently the problem is types - all numerics come across as floats. (This may turn out to be inaccurate, or knob-adjustable, but the guy who explored that option is smarter than me, so I've got some confidence there.)

For our take-the-json-and-send-it-into-elastic scripts, we could potentially manipulate these data - (re)cast them on the way over - but we haven't explored that option.

FWIW, we have ~12k NVPS on our prod system, and I've confirmed that with a fairly unoptimised Python script, using the Elasticsearch bulk helper, I can parse and inject 10k integer metrics from the zbx_history2json output in ~700ms.
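The core of that script is roughly the following - a simplified sketch, assuming elasticsearch-py's bulk helper, with an illustrative index name:

```python
import json

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch("http://localhost:9200")  # lab instance

def actions(path):
    """Yield one bulk action per line of zbx_history2json output."""
    with open(path) as fh:
        for line in fh:
            doc = json.loads(line)
            # Any (re)casting "on the way over" would happen here, e.g.
            # doc["value"] = int(doc["value"]) for the integer stream.
            yield {"_index": "zabbix-history-int", "_source": doc}

bulk(es, actions("file-int.2021-03-25.json"))
```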

i-ky commented 3 years ago

all numerics come across as floats

This is surprising to me. Could you please double-check with your colleague? To me this sounds like a violation of the protocol and should be reported to Zabbix.

jedd commented 3 years ago

@i-ky thank you for pointing us back in the direction for this.

I've run it up in my lab, and it does look like it's giving us the right types. I'm not sure what happened there, but I'll explore this further tomorrow.

If I haven't come back to this ticket in 24h, please feel free to close it. : )

jedd commented 3 years ago

Okay, early observations.

Data type separation:

zbx_history2json splits history data by type (int, float, string, text), which works well: we can run dedicated ETL processes per type, since we'll have type-specific indexes in ElasticSearch.

ExportDir just has some number of workers (4 in my lab, and I haven't yet found a knob to tune that - the documentation says '4-30', which is mildly alarming if it's effectively unknown up front). Each output file is a blend of data types. I can still parallelise, but each script now gets slightly more complex - something like the dispatch sketch below. It's not insurmountable, of course.
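A rough sketch of that dispatch, assuming each exported line carries the numeric type field from the real-time export protocol (the codes below are from memory and worth double-checking against your Zabbix version):

```python
import json

# Map assumed export-protocol value types to type-specific indexes:
# 0 float, 1 character, 2 log, 3 unsigned int, 4 text.
INDEX_BY_TYPE = {0: "zabbix-float", 1: "zabbix-str", 2: "zabbix-log",
                 3: "zabbix-int", 4: "zabbix-text"}

def route(line):
    """Return (target_index, document) for one blended ndjson line."""
    doc = json.loads(line)
    return INDEX_BY_TYPE.get(doc.get("type")), doc
```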

File sizing and ageing

zbx_history2json spits out a file per type, per day - file-int.2021-03-25.json, file-float.2021-03-25.json, etc. I'd already modified the code to rotate these per hour (yyyy-mm-ddThh) so we could archive them off sooner (I want retention in case of Elastic outages, or if we need to replay for some other reason).

ExportDir seems to have only one option - the file-size threshold for rotation, defaulting to 1GB, after which the current file is renamed to .old - so we'll have to manage rotation / retention ourselves.

Again, not insurmountable, just not as elegant as zbx_history2json (for our purposes : )

i-ky commented 3 years ago

I haven't found a knob to turn yet to tune that - documentation says '4-30' which is mildly alarming if it's effectively unknown up front

It should be determined by Start* configuration settings for those process types that can potentially write export data. I guess Zabbix documentation writers were a bit too lazy to explicitly specify which process types these are. Or they don't want to be held accountable if these types change over time.

zbx_history2json spits out a file per type, per day

This may be a good feature request for Zabbix.

I think it is possible to do what you initially asked for, i.e. add applications and host groups, but this would be pure hacking, because it would require using Zabbix functionality that is not officially supported or available to modules. And it would basically duplicate Zabbix's own export-related code.

But I have to say it again: I have no relation to this module. I am just an ex-Zabbix developer and loadable-module enthusiast who was passing by.

jedd commented 3 years ago

Ah, yes, it seems to be coupled to StartDBSyncers - which we can look at tweaking, but might not need to.

The parallelisation this offers is probably better than zbx_history2json's split by type.

After spending some time getting my PoC script going (with ExportDir) I'm reasonably happy with the complexity (not much) and performance (very impressive).

zbx_history2json spits out a file per type, per day

This may be a good feature request for Zabbix.

We do have a commercial arrangement with Zabbix Corp, so we may pursue that. My crufty PoC code has been handed over to people who actually know how to code, so we'll see how problematic it is.

Current approach is to use pygtail to track our location within the ndjson export files (as I was doing with the zbx_history2json output), and to use generated _id fields on the Elastic side, so it's all nicely idempotent. I'm able to re-run the last chunks of the ExportDir output files after the size threshold triggers a filename change, without ending up with duplicates.
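The tail-and-dedupe pattern looks roughly like this - the offset-file location and the _id recipe are just choices I made, and the field names assume the export protocol's itemid / clock / ns:

```python
import hashlib
import json

from pygtail import Pygtail

def actions(export_file, index):
    """Yield idempotent bulk actions; pygtail remembers the file offset."""
    for line in Pygtail(export_file, offset_file=export_file + ".offset"):
        doc = json.loads(line)
        # Deterministic _id, so replaying a chunk doesn't create duplicates.
        key = "{}:{}:{}".format(doc["itemid"], doc["clock"], doc["ns"])
        yield {"_index": index,
               "_id": hashlib.sha1(key.encode()).hexdigest(),
               "_source": doc}
```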

I've got some very basic wrappers around renaming with timestamps, and compressing / archiving the old files that are spat out - nothing terribly complex, but we're still working out retention and overall workflow for these data.
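Roughly this shape, for what it's worth - paths and naming here are just examples, not necessarily what we'll settle on:

```python
import gzip
import shutil
import time
from pathlib import Path

def archive_old(export_dir):
    """Rename rotated .old files with a timestamp, then gzip them."""
    stamp = time.strftime("%Y-%m-%dT%H%M%S")
    for old in Path(export_dir).glob("*.old"):
        dest = old.with_suffix("." + stamp)  # foo.ndjson.old -> foo.ndjson.<stamp>
        old.rename(dest)
        with open(dest, "rb") as src, gzip.open(str(dest) + ".gz", "wb") as gz:
            shutil.copyfileobj(src, gz)
        dest.unlink()  # keep only the compressed copy
```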

Thanks again for your pointers and assistance here btw. Much appreciated.