Open 0xMihir opened 1 year ago
@srebhan thoughts on this proposal?
@0xMihir I thought about such a feature and my idea was to declare a length field and then use it in the "array"/"compound" entries, like
[[inputs.socket_listener]]
service_address = "udp://:8094"
endianess = "le"
data_format = "binary"
[[inputs.socket_listener.binary]]
metric_name = "multi-example"
entries = [
{ bits = 16, omit = true },
{ assignment = "tag", name = "location", type = "string", terminator = "null" },
{ assignment = "field", name = "_array_len", type = "uint32" },
{ assignment = "field", name = "arrvalue", type = "uint32", length = "@_array_len" },
{ assignment = "field", name = "_compound_len", type = "uint32" },
{ assignment = "compound", name="entry", length = "@_compound_len", entries = [
{ name = "temperature", type = "float32" },
{ name = "humidity", type = "float32" },
{ type = "unix_ms", assignment = "time" }
]},
]
[inputs.socket_listener.binary.filter]
selection = [{ offset = 0, bits = 16, match = "0xCAFE" }]
and I would expect the code to expand the array field names by appending the indices, like arrvalue_1 ... arrvalue_N, and the compound path like entry_1_temperature ... entry_M_temperature. If the length starts with an @ sign, the length is dynamic, but you could also simply put in a number. An alternative is to decide by value type in TOML...
My only concern is nesting... So maybe a dedicated parser for this kind of data would be better suited?
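To make the naming and length rules above concrete, here is a minimal Python sketch. It is not Telegraf code: the helper names `resolve_length` and `parse_uint32_array` are hypothetical, and it assumes little-endian uint32 values as in the example config. An integer length is used as-is, while a string starting with `@` is looked up among the fields parsed so far.

```python
import struct


def resolve_length(length, fields):
    """Return a repeat count (hypothetical helper).

    An int is used directly; a string like "@name" refers to an
    already parsed field that holds the count.
    """
    if isinstance(length, str) and length.startswith("@"):
        return int(fields[length[1:]])
    return int(length)


def parse_uint32_array(buf, offset, name, length, fields):
    """Read little-endian uint32 values and store them as name_1..name_N."""
    count = resolve_length(length, fields)
    values = struct.unpack_from(f"<{count}I", buf, offset)
    for i, value in enumerate(values, start=1):
        fields[f"{name}_{i}"] = value
    return offset + 4 * count  # offset of the first byte after the array


# Payload: a uint32 length (3) followed by three uint32 values.
payload = struct.pack("<I3I", 3, 10, 20, 30)
fields = {}
(fields["_array_len"],) = struct.unpack_from("<I", payload, 0)
end = parse_uint32_array(payload, 4, "arrvalue", "@_array_len", fields)
```

Because the length field is stored like any other field, the lookup needs no separate variable system in this sketch; whether that holds for the real parser is an open design question.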
One issue with using fields in other places to determine the length of compound arrays is that we would need to implement a variable system for the parser. Maybe, we could use a termination character or sequence and then read until the sequence is detected—however, I'm not sure how to best handle overflows or underflows for this.
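As a rough illustration of the terminator idea, the Python sketch below reads fixed-size records until an end marker is found. Everything in it is an assumption made for illustration: the record layout (two float32 values), the four-byte terminator, and the `max_records` guard. Underflow (buffer ends before the marker) and overflow (too many records) are both reported as errors.

```python
import struct

RECORD = struct.Struct("<ff")      # hypothetical record: temperature, humidity
TERMINATOR = b"\xff\xff\xff\xff"   # hypothetical end marker


def read_until_terminator(buf, offset=0, max_records=1024):
    """Read records until TERMINATOR appears at a record boundary."""
    records = []
    while True:
        if buf.startswith(TERMINATOR, offset):
            # Return the records and the offset just past the marker.
            return records, offset + len(TERMINATOR)
        if len(records) >= max_records:
            raise ValueError("overflow: terminator not seen within max_records")
        if offset + RECORD.size > len(buf):
            raise ValueError("underflow: buffer ended before terminator")
        records.append(RECORD.unpack_from(buf, offset))
        offset += RECORD.size


payload = RECORD.pack(21.5, 40.0) + RECORD.pack(22.0, 41.5) + TERMINATOR + b"rest"
records, end = read_until_terminator(payload)
```

One weakness of this approach is ambiguity: a record whose first bytes happen to equal the marker is read as the end of the data, which a length field avoids.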
I think one way we could implement a parser is to recurse through the array of entries until everything has been read.
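A recursive walk over the entries could look roughly like the Python sketch below. It is only an illustration under stated assumptions: entries are plain dicts mirroring the TOML example, only little-endian `uint32` and `float32` are supported, and `parse_entries` is a hypothetical name. Length references are resolved against the flat field map, so a length field declared inside a compound would have to be referenced by its expanded name.

```python
import struct

FORMATS = {"uint32": "<I", "float32": "<f"}  # little-endian subset only


def parse_entries(entries, buf, offset, fields, prefix=""):
    """Recursively parse entries, expanding repeated names with indices."""
    for entry in entries:
        name = prefix + entry["name"]
        length = entry.get("length")
        if length is None:
            names = [name]  # single value, no index suffix
        else:
            if isinstance(length, str) and length.startswith("@"):
                count = int(fields[length[1:]])  # dynamic length from a parsed field
            else:
                count = int(length)              # static length
            names = [f"{name}_{i}" for i in range(1, count + 1)]
        for item_name in names:
            if "entries" in entry:
                # Compound: recurse with the expanded name as prefix.
                offset = parse_entries(entry["entries"], buf, offset,
                                       fields, prefix=item_name + "_")
            else:
                fmt = FORMATS[entry["type"]]
                (fields[item_name],) = struct.unpack_from(fmt, buf, offset)
                offset += struct.calcsize(fmt)
    return offset


entries = [
    {"name": "_compound_len", "type": "uint32"},
    {"name": "entry", "length": "@_compound_len", "entries": [
        {"name": "temperature", "type": "float32"},
        {"name": "humidity", "type": "float32"},
    ]},
]
payload = struct.pack("<Iffff", 2, 21.5, 40.0, 22.0, 41.5)
fields = {}
end = parse_entries(entries, payload, 0, fields)
```

Nesting falls out of the recursion, but the sketch does no bounds checking beyond what `struct.unpack_from` enforces, and it does not address how nested compounds would map to separate metrics.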
I'm not sure I get your point @0xMihir. We store the length-field as a field and can thus access it when looping over the entries of a "compound"... You might be right regarding recursion.
Honestly speaking, this increase in complexity worries me a bit. I can foresee that the next request will be "can we make each of these compounds a new metric", with all kinds of intermixes between fields that are global for every metric and fields that are specific to only some... Don't get me wrong, I'm not completely against this, but keep those things in mind during your design!
Yeah, I agree. We shouldn't try to scope this to be a full-blown parser, as there would be far too many edge cases for where to define metrics. I'm going to continue to try and investigate the best method for implementing this.
Adding my vote for this functionality. I have a situation where a wireless controller is sending MQTT data in binary format, and the message contains some header information, a few payload fields, including the number of device records being sent, then repeating sets of data, about 5 fields for each device.
Use Case
Using a new assignment called compound, the binary parser could parse structs that have repeated parts. For example, the following struct contains a variable length of data readings from sensors to reduce the number of total packets sent.
Expected behavior
Using the binary parser, we can create an entry to parse the entire struct like so:
This is merely an example of what the final syntax could be. After reading the repeat count, the binary parser would write multiple entries at once.
Actual behavior
Currently, there's no way to parse different lengths of messages using the binary parser. The only possibility is parsing fixed-length structs.
Additional info
Some considerations that I haven't thoroughly thought about: