Closed pyke369 closed 1 year ago
Great find! if you can contribute a PR this would be awesome!
I'll give it a try. I was also thinking of exporting more stuff in the nf template (like at least the device-id, b/c right now you cannot associate a flow with the originating device that emitted it in the 1st place, that's a shame...). The info (and some other) is accessible in the received Cap'n'Proto kflow2 records.
That's a great idea to add in all the info there is into the exported flow. This data should be present in each flow record sent to the Format
function:
flows[0].CompanyId, flows[0].DeviceName, flows[0].DeviceId
I had a deeper look a the code and found there's already some way to limit the number of exported flows per batch/message, by simply setting the maxflowspermessage
configuration entry (default 10000, which is way too high in the netflow case).
The current netflow template yields 120 bytes data flowsets, and if you add the template flowset itself (92 bytes) and the netflow header (16 bytes), the maximum flows per exported message should be:
int((65535 - 16 - 92) / 120) = 545
So let's say 540
to be on the safe side. Setting a decent default value when the output format is netflow:
$ git diff config.go
diff --git a/config.go b/config.go
@@ -13,8 +13,10 @@ const (
// KentikAPITokenEnvVar is the environment variables used to get the Kentik API Token
KentikAPITokenEnvVar = "KENTIK_API_TOKEN"
+ // MaxNetflowsPerMessage is the maximum flowsets in a single netflow message
+ MaxNetflowsPerMessage = 540
)
@@ -494,6 +496,10 @@ func LoadConfig(configPath string) (*Config, error) {
return nil, err
}
+ if cfg.Format == "netflow" && (cfg.MaxFlowsPerMessage <= 0 || cfg.MaxFlowsPerMessage > MaxNetflowsPerMessage) {
+ cfg.MaxFlowsPerMessage = MaxNetflowsPerMessage
+ }
+
return cfg, nil
}
And voila, problem solved.
I will continue on adding the missing information you mentioned above (at least company-id and device-id, which may be fixed uint32 values).
I contributed this PR https://github.com/kentik/ktranslate/pull/553 for you to review (feel free to close this issue if needed). Thanks.
LGMT -- merged in, thanks for the deep dive!
Because Kentik is enriching input flows with additional fields (such as source/destination asnum), and also using an extended template to accomodate v4/v6 addresses, there may be situations where the total raw IPFIX packet size is > 64k (65536 bytes), leading to an invalid (overflowed/rolled-over) lengh value in the packet header, b/c of the cumulative size of the new data flowsets:
https://github.com/kentik/ktranslate/blob/main/pkg/formats/netflow/ipfix.go#L50
when using the
file
sink of thenet
sink in tcp mode, it produces invalid output from time to time (which can be circumvented by trying to read an extra 64k chunk of data and add it to the previous chunk when the next IPFIX message header seems "unsynchronized", a poor-man solution though).when using the
net
sink in udp mode, it yields some "Message is too large" errors (a udp datagram payload is also imited to 65536 bytes, the length being encoded as an uint16), leading to flows loss (the udp.Write() call just fails).I don't have a mitigation code at this point, but I could provide one if needed (the packer should be amended to generate multiple concatenated IPFIX messages, and the net/udp sink should loop to send a >64k buffer in multiple parts).