Closed emalzer closed 1 year ago
Hi,
When the plugin receives a message via gRPC, it passes that message to the upstream protobuf library to unmarshal the received data. The error about invalid UTF-8 data is created in this upstream library. At this point telegraf gets the failed to decode error and bails out of creating a metric, as there may not be anything received to create metrics from.
If additional messages were received without errors then those messages would continue to get parsed.
It is not clear to me if there is anything better for us to do here. We can have a look, but I think this is working as expected.
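The per-message behavior described above can be sketched in Go roughly as follows. This is only an illustration under stated assumptions, not the plugin's actual code: `decode` is a hypothetical stand-in for the protobuf unmarshal step, and `handleAll` is a made-up helper that shows one failed message being dropped as a whole while later messages are still parsed.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// decode is a hypothetical stand-in for the upstream unmarshal step.
// Assumption: it rejects any payload that is not valid UTF-8, similar in
// spirit to the error discussed in this thread.
func decode(payload []byte) (string, error) {
	if !utf8.Valid(payload) {
		return "", errors.New("string field contains invalid UTF-8")
	}
	return string(payload), nil
}

// handleAll mimics the flow described above: a failed decode drops that
// whole message and records an error, then processing continues with the
// next message.
func handleAll(payloads [][]byte) (decoded []string, errs []error) {
	for _, p := range payloads {
		msg, err := decode(p)
		if err != nil {
			errs = append(errs, fmt.Errorf("failed to decode: %w", err))
			continue
		}
		decoded = append(decoded, msg)
	}
	return decoded, errs
}

func main() {
	decoded, errs := handleAll([][]byte{
		[]byte("eth0"),
		[]byte("Gigabit\xfc"), // 0xFC on its own is not valid UTF-8
		[]byte("eth2"),
	})
	fmt.Println(decoded, len(errs))
}
```

Note that everything carried in the failing message is lost, not just the field with the bad character, which matches the missing metrics reported in this issue.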
Hi,
We do not use gRPC; we use plain TCP as transport and self-describing-gpb as encoding.
I would need to identify exactly where this UTF-8 decoding problem is located - meaning is it on the receiving side with telegraf due to a bug or is it already on the sending side from the Cisco device.
> we do not use gRPC, we use plain TCP as transport and self-describing-gpb as encoding.
The parsing of the metrics is the same; see: https://github.com/influxdata/telegraf/blob/master/plugins/inputs/cisco_telemetry_mdt/cisco_telemetry_mdt.go#L361-L366
> meaning is it on the receiving side with telegraf due to a bug or is it already on the sending side from the Cisco device.
As I mentioned above, this error happens while parsing a message we received, meaning the message was produced by your device.
So, is there a way to get more debug logs to identify which invalid UTF-8 char it's complaining about? I still need to pinpoint the cause.
You could do a packet capture at the time of the error and see if you can look at the packet data. The other option is to possibly build a custom telegraf and log out the messages you are getting via MarshalTextString.
Hm, the packet capture is too huge, as even this single sensor path exports data for a lot of interfaces. It is quite hard for me to find this needle in the haystack.
I will try to take a look into the custom plugin then.
Hi!
I was successful in identifying the issue. Thanks for your quick responses. The issue was a German special character which got wrongly encoded due to whatever... :)
Just as info if others encounter such an issue:
replace "google.golang.org/protobuf/proto"
with "github.com/golang/protobuf/proto"
as the google.golang.org package does not include MarshalTextString
add c.acc.AddError(fmt.Errorf("MarshalTextString: %s", proto.MarshalTextString(msg)))
in handleTelemetry
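To illustrate how a wrongly encoded special character ends up as invalid UTF-8, here is a minimal Go sketch using only the standard library. It assumes the character was sent in a single-byte encoding such as Latin-1; the thread does not say which encoding the device actually used. The interface description and the `firstInvalidUTF8` helper are hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstInvalidUTF8 returns the byte offset of the first byte that does not
// start a valid UTF-8 sequence, or -1 if the whole slice is valid UTF-8.
func firstInvalidUTF8(b []byte) int {
	for i := 0; i < len(b); {
		r, size := utf8.DecodeRune(b[i:])
		// (RuneError, 1) signals an invalid encoding; a correctly encoded
		// U+FFFD would be reported with size 3 instead.
		if r == utf8.RuneError && size == 1 {
			return i
		}
		i += size
	}
	return -1
}

func main() {
	// Assumption: "ü" sent as the single Latin-1 byte 0xFC instead of the
	// two-byte UTF-8 sequence 0xC3 0xBC.
	latin1 := []byte("Uplink K\xfcche")
	valid := []byte("Uplink Küche")

	fmt.Println(firstInvalidUTF8(latin1)) // offset of the 0xFC byte
	fmt.Println(firstInvalidUTF8(valid))  // -1, nothing to report
}
```

A check like this only makes sense on an individual string value, for example one taken from the logged message text. Scanning a whole binary protobuf payload this way would flag ordinary non-text bytes as well.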
> which got wrong encoded due to whatever... :)
Ugh, that is frustrating for both you as a user and me.
I am going to keep this open and see if we can improve that message as you have done here.
I put up #13963, which uses the msg.String() method. It should do the same thing as what you did. The github.com/golang/protobuf library was superseded by the current library we use, so this keeps us using that one.
Relevant telegraf.conf
Logs from Telegraf
System info
Telegraf 1.28.1-1, Ubuntu 20.04
Docker
No response
Steps to reproduce
Expected behavior
Decode all fields accordingly or skip only affected single metric.
Actual behavior
Metrics are decoded correctly until an invalid UTF-8 character is hit. All following metrics within that batch are lost, so we are missing metrics from certain interfaces and do not see some interfaces at all.
Additional info
I can provide tcpdumps with the telemetry traffic from the Cisco device to telegraf.