network-analytics / mdt-dialout-collector

Model-Driven Telemetry - Collecting <multi-vendor> metrics via gRPC dialout
MIT License

serialization: json not json_string #23

Closed: hyberdk closed this issue 1 year ago

hyberdk commented 1 year ago

I have been trying out this tool, and it looks very promising.

There is one thing, though: is there a way to make it serialize the data into a JSON blob rather than a JSON string where all the double quotes are escaped?

My use case is that I feed this into Kafka and then ingest it directly into my ClickHouse database using a Kafka engine table, a materialized view, and JSONEachRow, which makes it easy for me to pick the objects I want and insert them into my table.

It sort of complicates things if it's not a "true" JSON object but a string instead. Can I change this behavior somewhere?

Esben

scuzzilla commented 1 year ago

Dear Esben,

Thank you for reaching out, your feedback is always appreciated.

The current implementation does not directly support producing unescaped JSON strings or JSON objects.

As far as I understand, Kafka essentially operates on raw bytes. Therefore data (in this case, JSON objects) must be serialized before being sent to Kafka. This is why I convert JSON objects to JSON strings before producing to Kafka. The function I use from the JsonCpp library, Json::writeString(...), escapes certain characters so that the resulting JSON strings conform to the JSON specification.
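To illustrate where escaped quotes come from in general, here is a minimal Python sketch. It is not the collector's C++/JsonCpp code, and the variable names and the shortened payload are assumptions made for the example. A payload that is serialized to a string and then placed inside another JSON document is serialized a second time, so its quotes are escaped. Placing the object itself there yields nested JSON.

```python
import json

# Hypothetical, shortened telemetry payload (illustration only).
payload = {"node_id": "dkhbo-dr01", "subscription_id": "101"}

# Case 1: the payload is serialized first, then embedded as a string value.
# The outer serialization escapes the quotes of the inner string.
as_string = json.dumps({"event_type": "gRPC",
                        "telemetry_data": json.dumps(payload)})

# Case 2: the payload is embedded as an object and serialized only once.
as_object = json.dumps({"event_type": "gRPC",
                        "telemetry_data": payload})

print(as_string)
print(as_object)
```

Both results are valid JSON; they differ only in whether "telemetry_data" decodes to a string or to an object.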

In the meantime, if the escaped characters are causing difficulties in your current data flow, you might consider looking into parsing options on the consumer side (after consuming from Kafka, but before ingesting into ClickHouse) to convert these JSON strings back into objects.
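As a rough sketch of such a consumer-side step, assuming each Kafka message value is one JSON document shaped like the collector's output, the function below decodes the envelope, decodes the "telemetry_data" string a second time, and re-serializes the result on a single line. The function name `unwrap_telemetry` is hypothetical, and the Kafka consume/produce calls are deliberately left out.

```python
import json


def unwrap_telemetry(message: bytes) -> bytes:
    """Return the message with "telemetry_data" as a nested JSON object."""
    envelope = json.loads(message)
    inner = envelope.get("telemetry_data")
    if isinstance(inner, str):
        try:
            envelope["telemetry_data"] = json.loads(inner)
        except json.JSONDecodeError:
            # Not valid JSON: leave the field untouched.
            pass
    # Compact single-line output, suitable for line-delimited JSON formats.
    return json.dumps(envelope, separators=(",", ":")).encode("utf-8")


# Shortened, made-up message value for illustration.
raw = b'{"seq": 27, "telemetry_data": "{\\"node_id\\": \\"dkhbo-dr01\\"}"}'
print(unwrap_telemetry(raw).decode())
# prints {"seq":27,"telemetry_data":{"node_id":"dkhbo-dr01"}}
```

Messages whose "telemetry_data" is not a JSON string pass through unchanged, so the step can sit in front of mixed inputs.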

I hope this clarifies the situation.

Salvatore.

hyberdk commented 1 year ago

Hi Salvatore,

Thanks for your quick reply. The reason I am a little puzzled is that not all of the JSON is actually escaped, only the..

Here is an example of an event in my Kafka topic (note that I formatted it a bit in VS Code):

{
    "event_type": "gRPC",
    "seq": 27,
    "serialization": "json_string",
    "telemetry_data": "{\"collection_end_time\":0,\"collection_id\":1,\"collection_start_time\":1687466141733,\"data_gpbkv\":[{\"fields\":[{\"keys\":{\"fields\":[{\"name\":\"Vlan1\"}]}},{\"content\":{\"fields\":[{\"interface-type\":\"iana-iftype-propvirtual\"},{\"admin-status\":\"if-state-down\"},{\"oper-status\":\"if-oper-state-no-pass\"},{\"last-change\":\"2023-06-22T19:13:11.600000+00:00\"},{\"if-index\":12},{\"phys-address\":\"ac:7a:56:d1:a4:f4\"},{\"speed\":1000000000},{\"statistics\":{\"fields\":[{\"discontinuity-time\":\"2023-06-22T19:12:59.936000+00:00\"},{\"in-octets\":0},{\"in-unicast-pkts\":0},{\"in-broadcast-pkts\":0},{\"in-multicast-pkts\":0},{\"in-discards\":0},{\"in-errors\":0},{\"in-unknown-protos\":0},{\"out-octets\":0},{\"out-unicast-pkts\":0},{\"out-broadcast-pkts\":0},{\"out-multicast-pkts\":0},{\"out-discards\":0},{\"out-errors\":0},{\"rx-pps\":0},{\"rx-kbps\":0},{\"tx-pps\":0},{\"tx-kbps\":0},{\"num-flaps\":0},{\"in-crc-errors\":0},{\"in-discards-64\":0},{\"in-errors-64\":0},{\"in-unknown-protos-64\":0},{\"out-octets-64\":0}]}},{\"vrf\":\"\"},{\"ipv4\":\"0.0.0.0\"},{\"ipv4-subnet-mask\":\"0.0.0.0\"},{\"description\":\"Shut this one down to avoid various security 
issues.\"},{\"mtu\":1500},{\"input-security-acl\":\"\"},{\"output-security-acl\":\"\"},{\"v4-protocol-stats\":{\"fields\":[{\"in-pkts\":0},{\"in-octets\":0},{\"in-error-pkts\":0},{\"in-forwarded-pkts\":0},{\"in-forwarded-octets\":0},{\"in-discarded-pkts\":0},{\"out-pkts\":0},{\"out-octets\":0},{\"out-error-pkts\":0},{\"out-forwarded-pkts\":0},{\"out-forwarded-octets\":0},{\"out-discarded-pkts\":0}]}},{\"v6-protocol-stats\":{\"fields\":[{\"in-pkts\":0},{\"in-octets\":0},{\"in-error-pkts\":0},{\"in-forwarded-pkts\":0},{\"in-forwarded-octets\":0},{\"in-discarded-pkts\":0},{\"out-pkts\":0},{\"out-octets\":0},{\"out-error-pkts\":0},{\"out-forwarded-pkts\":0},{\"out-forwarded-octets\":0},{\"out-discarded-pkts\":0}]}},{\"bia-address\":\"ac:7a:56:d1:a4:f4\"},{\"ipv4-tcp-adjust-mss\":0},{\"ipv6-tcp-adjust-mss\":0},{\"intf-ext-state-support\":null},{\"intf-ext-state\":{\"fields\":[{\"error-type\":\"port-error-none\"},{\"port-error-reason\":\"port-err-none\"},{\"auto-mdix-enabled\":false},{\"mdix-oper-status-enabled\":false},{\"fec-enabled\":false},{\"mgig-downshift-enabled\":false}]}},{\"storm-control\":{\"fields\":[{\"broadcast\":{\"fields\":[{\"filter-state\":\"inactive\"},{\"current-rate\":{\"fields\":[{\"bandwidth\":0.0}]}}]}},{\"multicast\":{\"fields\":[{\"filter-state\":\"inactive\"},{\"current-rate\":{\"fields\":[{\"bandwidth\":0.0}]}}]}},{\"unicast\":{\"fields\":[{\"filter-state\":\"inactive\"},{\"current-rate\":{\"fields\":[{\"bandwidth\":0.0}]}}]}},{\"unknown-unicast\":{\"fields\":[{\"filter-state\":\"inactive\"},{\"current-rate\":{\"fields\":[{\"bandwidth\":0.0}]}}]}},{\"level-shared-support\":null},{\"level-shared\":{\"fields\":[{\"filter-state\":\"inactive\"},{\"current-rate\":{\"fields\":[{\"bandwidth\":0.0}]}}]}}]}},{\"auto-upstream-bandwidth\":0},{\"auto-downstream-bandwidth\":0},{\"ether-state\":{\"fields\":[{\"negotiated-duplex-mode\":\"unknown-duplex\"},{\"negotiated-port-speed\":\"speed-unknown\"},{\"auto-negotiate\":false},{\"enable-flow-control\":false},
{\"media-type\":\"ether-media-type-none\"}]}},{\"ether-stats\":{\"fields\":[{\"in-mac-control-frames\":0},{\"in-mac-pause-frames\":0},{\"in-oversize-frames\":0},{\"in-jabber-frames\":0},{\"in-fragment-frames\":0},{\"in-8021q-frames\":0},{\"out-mac-control-frames\":0},{\"out-mac-pause-frames\":0},{\"out-8021q-frames\":0},{\"dot3-counters-supported\":null},{\"dot3-counters\":{\"fields\":[{\"dot3-stats-version\":\"not-supported\"},{\"dot3-error-counters-v2\":{\"fields\":[{\"dot3-alignment-errors\":0},{\"dot3-fcs-errors\":0},{\"dot3-single-collision-frames\":0},{\"dot3-multiple-collision-frames\":0},{\"dot3-sqe-test-errors\":0},{\"dot3-deferred-transmissions\":0},{\"dot3-late-collisions\":0},{\"dot3-excessive-collisions\":0},{\"dot3-internal-mac-transmit-errors\":0},{\"dot3-carrier-sense-errors\":0},{\"dot3-frame-too-longs\":0},{\"dot3-internal-mac-receive-errors\":0},{\"dot3-symbol-errors\":0},{\"dot3-duplex-status\":0},{\"dot3-hc-alignment-errors\":0},{\"dot3-hc-inpause-frames\":0},{\"dot3-hc-outpause-frames\":0},{\"dot3-hc-fcs-errors\":0},{\"dot3-hc-frame-too-longs\":0},{\"dot3-hc-internal-mac-transmit-errors\":0},{\"dot3-hc-internal-mac-receive-errors\":0},{\"dot3-hc-symbol-errors\":0}]}}]}}]}},{\"serial-state\":null},{\"serial-stats\":null},{\"syncserial-state\":{\"fields\":[{\"dce-mode-state\":null},{\"dte-mode-state\":null}]}}]}}],\"timestamp\":1687466142631}],\"encoding_path\":\"Cisco-IOS-XE-interfaces-oper:interfaces/interface\",\"msg_timestamp\":1687466142631,\"node_id\":\"dkhbo-dr01\",\"subscription_id\":\"101\"}",
    "telemetry_node": "10.12.19.49",
    "telemetry_port": 59547,
    "timestamp": 1687466142,
    "writer_id": "mdt-dialout-collector"
}

It's only the "telemetry_data" object that has the escaping, not the rest. I'm using nfacctd and telegraf to push data into Kafka, and neither of those escapes data like you do.

Esben

scuzzilla commented 1 year ago

Hi Esben,

The "telemetry_data" field contains a JSON-formatted string representing the telemetry data payload coming directly from the router. The other fields in the outer JSON structure are metadata added by the collector.

If you wish to work with the "telemetry_data" as a standard JSON object, you'll need to parse this string into a JSON object in your preferred programming language.
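For instance, in Python the two decoding steps could look like the sketch below, which picks a few fields from the envelope and from the inner payload into one flat row. The field names are taken from the sample event earlier in this thread; the helper name `to_row` and the choice of fields are assumptions.

```python
import json


def to_row(message: str) -> dict:
    """Flatten selected envelope and payload fields into one dict."""
    envelope = json.loads(message)                    # first decode: envelope
    payload = json.loads(envelope["telemetry_data"])  # second decode: payload
    return {
        "telemetry_node": envelope["telemetry_node"],
        "timestamp": envelope["timestamp"],
        "node_id": payload["node_id"],
        "encoding_path": payload["encoding_path"],
    }


# Build a shortened sample event the same way the envelope is shaped:
# the payload is a JSON string inside the outer JSON document.
sample = json.dumps({
    "telemetry_node": "10.12.19.49",
    "timestamp": 1687466142,
    "telemetry_data": json.dumps({
        "node_id": "dkhbo-dr01",
        "encoding_path": "Cisco-IOS-XE-interfaces-oper:interfaces/interface",
    }),
})
print(to_row(sample))
```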

It's a bit more complicated than just working with a single JSON object, but it's a fairly common pattern when dealing with complex systems or when systems need to wrap a payload in additional metadata for transport.

Let me know if you have any other questions or if there's anything else I can help you with.

Salvatore.