aristanetworks / telegraf-cloudvision

Apache License 2.0

telegraf-cloudvision is leaking open connections #2

Closed dethi closed 1 year ago

dethi commented 1 year ago

The screenshot shows the number of open connections steadily increasing. They were traced back to a single IP address that matches a service account used by a customer for Telegraf.

image
dethi commented 1 year ago

Not 100% sure, but it may be coming from the call to /api/resources/inventory/v1/Device/all. We keep creating a new HTTP transport and client every 15s instead of reusing the same one. And since the connections are set up with no timeouts (connect, read or idle), they are never properly closed and keep hanging until the server closes them (idle timeout).

Reusing the client/transport would allow open connections to be properly reused. I also recommend configuring idle and read timeouts.

https://github.com/aristanetworks/telegraf-cloudvision/blob/main/plugins/inputs/arista_cloudvision_telemtry/arista_cloudvision_telemetry.go#L207-L211
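
To illustrate the suggestion above, here is a minimal sketch of a single client that is built once and reused for every poll. It is not the plugin's actual code: the names newInventoryClient and inventoryClient are hypothetical, and every timeout value is an assumption to be tuned. The overall Timeout also assumes the inventory call returns a finite response rather than a long-lived stream.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// newInventoryClient builds an HTTP client that is meant to be created once
// and reused. The function name and all timeout values are hypothetical.
func newInventoryClient() *http.Client {
	transport := &http.Transport{
		DialContext: (&net.Dialer{
			Timeout:   10 * time.Second, // connect timeout
			KeepAlive: 30 * time.Second, // TCP keep-alive probes
		}).DialContext,
		TLSHandshakeTimeout:   10 * time.Second,
		ResponseHeaderTimeout: 30 * time.Second, // max wait for response headers
		IdleConnTimeout:       90 * time.Second, // close pooled connections left idle
		MaxIdleConnsPerHost:   2,
	}
	return &http.Client{
		Transport: transport,
		// Bounds the whole request, including reading the response body.
		Timeout: 60 * time.Second,
	}
}

// inventoryClient is shared by all polls so that connections are pooled
// and reused instead of being opened again every interval.
var inventoryClient = newInventoryClient()

func main() {
	fmt.Println("overall request timeout:", inventoryClient.Timeout)
}
```

With this shape, the 15s polling loop would call inventoryClient.Do (or Get) each time instead of constructing a new transport, so idle connections are either reused or closed by IdleConnTimeout.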

burnyd commented 1 year ago

So @dethi, here is what I am seeing. It may be causing the issue and might be Go related, possibly because the connection is not being closed properly. Or it might not be related at all, but it is something we need to fix anyway.

https://github.com/aristanetworks/telegraf-cloudvision/blob/main/plugins/inputs/arista_cloudvision_telemtry/arista_cloudvision_telemetry.go#L232

It seems the inventory API returns the same data structure as gRPC, but at times it gives us an extra newline. From some testing of my own, this is what I get:

{"result":{"value":{"key":{"deviceId":"SN-DC1-SPINE2"},"status":"PROVISIONING_STATUS_SUCCESS","ztpMode":false,"ipAddress":{"value":"192.168.0.12"},"provisioningGroupName":"container_0c81bcb4-7e32-4900-a548-1006d8848393"},"time":"2023-02-06T16:45:37.310276561Z","type":"INITIAL"}}

Cannot marshall HTTP Connection to CVP

We can see that there is an extra newline between the data above and Cannot marshall HTTP Connection to CVP. So we simply need to check that the iterator is greater than or equal to 1 here: https://github.com/aristanetworks/telegraf-cloudvision/blob/main/plugins/inputs/arista_cloudvision_telemtry/arista_cloudvision_telemetry.go#L228
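
One way to handle the stray newline is to skip empty lines before unmarshalling. The sketch below assumes the response is newline-delimited JSON shaped like the sample above; the type inventoryLine and the functions parseDeviceIDs and parseDeviceIDsFromString are hypothetical names, not the plugin's own.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// inventoryLine models only the fields used here, based on the sample
// response in this thread. The type name is hypothetical.
type inventoryLine struct {
	Result struct {
		Value struct {
			Key struct {
				DeviceID string `json:"deviceId"`
			} `json:"key"`
		} `json:"value"`
	} `json:"result"`
}

// parseDeviceIDs reads newline-delimited JSON and returns the device IDs,
// skipping blank lines instead of failing to unmarshal them.
func parseDeviceIDs(r io.Reader) ([]string, error) {
	var ids []string
	scanner := bufio.NewScanner(r)
	// Allow lines up to 1 MiB; the default limit is 64 KiB.
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for scanner.Scan() {
		line := bytes.TrimSpace(scanner.Bytes())
		if len(line) == 0 {
			continue // extra newline between objects
		}
		var msg inventoryLine
		if err := json.Unmarshal(line, &msg); err != nil {
			return ids, fmt.Errorf("decode inventory line: %w", err)
		}
		ids = append(ids, msg.Result.Value.Key.DeviceID)
	}
	return ids, scanner.Err()
}

// parseDeviceIDsFromString is a small convenience wrapper for examples.
func parseDeviceIDsFromString(body string) ([]string, error) {
	return parseDeviceIDs(strings.NewReader(body))
}

func main() {
	body := "{\"result\":{\"value\":{\"key\":{\"deviceId\":\"SN-DC1-SPINE2\"}}}}\n\n"
	ids, err := parseDeviceIDsFromString(body)
	fmt.Println(ids, err)
}
```

Skipping on an empty trimmed line keeps genuine decode errors visible while ignoring the blank separator.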

But within Go, the session is also closed by the defer resp.Body.Close(). @hamptonmoore, thoughts?
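
On the defer resp.Body.Close() point: the net/http documentation notes that a connection may not be reused unless the body is both read to EOF and closed. The sketch below is a self-contained experiment against a local httptest server, not the plugin's code. The names fetchAndDrain and countConnections are hypothetical, and it counts how many TCP connections the server accepts when one shared client polls several times.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// fetchAndDrain performs one GET, reads the body to EOF and closes it,
// so the transport can return the connection to its idle pool.
func fetchAndDrain(client *http.Client, url string) error {
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

// countConnections polls a local test server sequentially with one shared
// client and returns how many TCP connections the server accepted.
func countConnections(polls int) (int, error) {
	var accepted int32
	srv := httptest.NewUnstartedServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, `{"result":{}}`)
		}))
	// Count every newly accepted connection on the server side.
	srv.Config.ConnState = func(_ net.Conn, state http.ConnState) {
		if state == http.StateNew {
			atomic.AddInt32(&accepted, 1)
		}
	}
	srv.Start()
	defer srv.Close()

	client := &http.Client{Transport: &http.Transport{}, Timeout: 5 * time.Second}
	defer client.CloseIdleConnections()
	for i := 0; i < polls; i++ {
		if err := fetchAndDrain(client, srv.URL); err != nil {
			return 0, err
		}
	}
	return int(atomic.LoadInt32(&accepted)), nil
}

func main() {
	n, err := countConnections(3)
	fmt.Println("connections accepted:", n, "error:", err)
}
```

Creating the client inside the loop instead would be expected to open a new connection per poll, which matches the growth pattern described at the top of this issue.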