influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

Input plugin for Neptune Apex's XML data #5164

Closed MaxRenaud closed 5 years ago

MaxRenaud commented 5 years ago

I'm planning to implement this and want to see if a plugin aimed at a niche market would be accepted for inclusion in the telegraf project. There are supposedly 800k Apex units in the wild, owned by somewhat technology-savvy people who are not necessarily savvy enough to set up a full stack without a comprehensive guide.

Proposal:

I would like to write a plugin that parses the XML output of a Neptune Apex aquarium controller. An Apex acts as a life support controller for complex aquarium management. It can gather various chemical data (pH, salinity, temperature, etc.) and per-outlet energy data (current status, output in A, output in W). The Apex controller itself can control devices directly connected to it, but the offering is somewhat limited, and my ultimate goal is to scrape the live status.xml page and publish it via MQTT.

Current behavior:

Telegraf supports neither the Apex nor XML collection.

Desired behavior:

The ideal situation would be to implement https://github.com/influxdata/telegraf/issues/1758; however, a universal XML input is harder to implement correctly because it must work with any valid XML. For example, how should attributes and sub-objects be differentiated in the line protocol? I would like this plugin to be specific to the Neptune Apex so that some assumptions can be made.

Use case:

With very few exceptions, the Apex controller can only control what is directly attached to its smart power switch. Allowing this input opens up a plethora of smart reporting features to its users and, more importantly, home automation with MQTT. The current method of cooling down an aquarium revolves around starting a fan above it to trigger evaporative cooling and, if that doesn't work, turning a chiller on. Chillers are somewhat inefficient and, since they usually sit by the aquarium, they raise the air temperature in order to lower the water temperature. It would be better to control a smart thermostat and cool the house with an exterior condenser. Other uses might include starting an air exchanger when the pH of the aquarium drops (indicating a somewhat higher CO2 level in the air).

danielnelson commented 5 years ago

I think this would make a good plugin, and I don't mind including a niche plugin like this so long as it is relatively straightforward and only uses the Go standard library. I wouldn't want to pull in dependencies for this, but it sounds like it would be fairly straightforward. Do you have a link to the documentation for the API you would use?

MaxRenaud commented 5 years ago

Thanks for the reply. Neptune doesn't publish a schema. I've pasted the output of my device on pastebin: https://pastebin.com/MXXUNabs

danielnelson commented 5 years ago

Okay, well that does make it a bit harder to support, but it is not a deal breaker. Is this collected with an HTTP GET then? What sort of authentication is required?

MaxRenaud commented 5 years ago

I expect the output to be fairly deterministic, with variable data enclosed as XML data and not fields. Because it's a hardware device, I don't expect new tags to be added often enough to inconvenience this collection. For example, any input is classified as a probe, whether it's a real probe (like a temperature probe) or data it reads from its internal sensors (like the power usage of an outlet). Any output for which it's authoritative is called an outlet. This includes physical outlets but also virtual outlets used to send commands to wireless devices, and even ones that hold state akin to a global variable (for example, the LEAK outlet is a global variable that can be changed based on a set of conditions). tl;dr: reads are "probes" and writes are "outlets". The XML is retrieved with a simple GET to http://[ip]/cgi-bin/status.xml and it's unauthenticated. The lack of authentication seems justified because it's read-only and the devices are usually placed on a LAN.

I also don't work for Neptune; I'm a novice aquarist who wants a bit more from this device so I cannot change anything server side. It's proprietary and closed source.

danielnelson commented 5 years ago

Sounds good, let me know when you have a metric schema and I might be able to provide some suggestions.

MaxRenaud commented 5 years ago

Thanks for your help! I'm new to InfluxDB and telegraf so your help is very much appreciated. From reading https://github.com/influxdata/telegraf/blob/master/docs/METRICS.md it seems like the internal metrics match the InfluxDB line protocol. Based on https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/ I think the following could make sense. What do you think?

For everything:
"measurement" = "apex"
"timestamp" = The field with second precision.
"tag" = hostname= (More tags in the specific sections below)

For the non-repeating fields:
"field_set": software=, hardware=, serial=, timezone=, power_failed=, power_restored=

For probes:
"tags": type=, name=
"field_set": key="value" value=

For outlets:
"tags": output_id=, device_id=, name=
"field_set": key="state" value=, xstatus=

Questions:

danielnelson commented 5 years ago

Added feedback on schema to #5191