cloudflare / goflow

The high-scalability sFlow/NetFlow/IPFIX collector used internally at Cloudflare.
BSD 3-Clause "New" or "Revised" License
852 stars 171 forks source link

Missing Flow Duration or Flow End Time #22

Closed debugloop closed 5 years ago

debugloop commented 5 years ago

https://github.com/cloudflare/goflow/blob/21654cda9826302f2bb745fcc07bad8d746d0895/producer/producer_nf.go#L274-L279

The above case statement decodes the nfv9 LAST_SWITCHED field (flowEndSysUpTime, as IPFIX calls it) into TimeFlow. I am wondering why the FIRST_SWITCHED (flowStartSysUpTime) field is not being decoded. The switch statement in 281 right below does the TimeFlow for IPFIX, but also only considers flowEnd* fields.

I was about to implement some way of having more timing information, but I am stuck with some basic issues:

  1. Renaming TimeFlow in protobuf would be bad.
  2. Adding TimeFlowEnd without having TimeFlowStart would be very confusing.
  3. Adding FlowDuration might be a solution, but it does not seem as convenient as having an absolute time.

I am leaning toward implementing option 3, but I thought I should gather some opinions first.

There might also be another issue implementing this: I am not sure in what order records arrive (line 138), and thus computing the FlowDuration might be difficult, depending on wether the flowMessage object has already been populated with FlowTime. As far as I understand right now (after some superficial reading) I can not access other raw records by ID or name in a reliable way, and I would be decoding that field twice then anyways.

lspgn commented 5 years ago

I think this was to minimize the delay with the flow is accounted originally but we never used more than a single time (and sFlow does not have this information).

I'd rather add TimeFlowStart and TimeFlowEnd for compatibility reasons and eventually removing TimeFlow.

debugloop commented 5 years ago

I probably won't use both times either, but I want to give customers the option of reproducing plain Netflow from their Kafka topics I aggregate for them by their IP prefixes, so they can continue to use their old tools. Thus, I am in need for some value to put into the first switched field.

Thanks for your input, I'll tackle this tomorrow then.

lspgn commented 5 years ago

@debugloop Could you test the new v3? I integrated your changes. https://github.com/cloudflare/goflow/releases/tag/v3.0.0.0

debugloop commented 5 years ago

@lspgn I can not easily test this right now, as you changed the protobuf definition (link to diff) heavily, and the rest of my tooling uses it too. Specifically, I've taken the old format and added some fields in the back I use for enrichment (geolocations, AS Paths, Interface/Peer names, ...).

I can see why you're changing some of the fields, but some strike me as unnecessary (switching id 3 and 4, ...) and I do not understand the large id skips in between (it is just colliding with my custom fields, I've started at 90) . I'd appreciate the introduction of a "reserved" range for custom fields, just a note in the docs or a comment maybe. At first I thought that you switched over to using the officiall IANA IPFIX ids, but that is not the case.

If you confirm that this'll be the version making it to the final, stable release I'll get started on figuring out how to introduce this with the least impact on my experimental setup and my colleagues (which are the only "users", so it is not too much trouble at this stage).

debugloop commented 5 years ago

Something else I've just noticed: I'd appreciate if you added the NFv9 Direction. The reason I'm writing this here instead of just doing it in another PR is that it might fit here https://github.com/cloudflare/goflow/commit/569608b4d1e6f50c26964e81a22ffb2d5b26a751#diff-9258e25e261124ae551eeb1b6b5d1d55R278 (as each message contains whether its IPFIX or v9, so one can discern)

I am not clear on the sematics of biFlowDirection though, so this might be a bad idea after all.

lspgn commented 5 years ago

Thank you for coming back to me!

You are raising good points. I mostly wanted a cleanup (rename, order) as also our internal tools will also be colliding with new fields anyway. The idea behind the skips of IDs was to group fields and allow new ones to be added below (eg: layer 4, hardware...). I would not switch to IANA fields as it is IPFIX specific.

I will rebuild a protobuf with the fields in the correct order and will indicate the expected reserved field. It makes more sense. But that will still require re-ordering on most of people's implementation.

Regarding your last comment: biFlowDirection was another request. I can add flowDirection (id 61) which is present in both NetFlow and IPFIX. I have never seen it used though: do you know how many bytes is this field in NetFlow? Which router/switch are you using?

debugloop commented 5 years ago

I can see why some cleanup is desirable, and in the end it does not make to much of a difference whether it is some or most fields in the protobuf. I figure I'll have to drain all topics (enriched or not) and upgrade all tooling in the maximum retention period interval (only 30mins right now) to avoid any Consumers misinterpreting conflicting fields. However, I'll wait with adopting until I have your ok to go ahead, I'd like to avoid doing it twice :)

My issue with the Direction is that it is part of my enrichment process, as it was part of the lookup of static Flow Exporter/Peer interface information base. These days however, I've integrated SNMP and BPG, and the only real use for this lookup is the Direction, which could be obtained freely using Netflow itself. So yes, I'd appreciate having it in.

We're using either Cisco ASR9k or NCS540 routers, all running some version of IOS-XR (mostly 6.4 or 6.5). The Direction field template says it is 1 Byte long. I think with that goflow will support any meaningful field exported by IOS-XR, see the template straight from Wireshark:

User Datagram Protocol, Src Port: 10090, Dst Port: 2055
Cisco NetFlow/IPFIX
    Version: 9
    Count: 1
    SysUptime: 2542165.834000000 seconds
    Timestamp: Apr 24, 2019 14:00:56.000000000 CEST
        CurrentSecs: 1556107256
    FlowSequence: 1924740437
    SourceId: 2177
    FlowSet 1 [id=0] (Data Template): 260
        FlowSet Id: Data Template (V9) (0)
        FlowSet Length: 92
        Template (Id = 260, Count = 21)
            Template Id: 260
            Field Count: 21
            Field (1/21): PKTS
                Type: PKTS (2)
                Length: 4
            Field (2/21): BYTES
                Type: BYTES (1)
                Length: 4
            Field (3/21): IP_SRC_ADDR
                Type: IP_SRC_ADDR (8)
                Length: 4
            Field (4/21): IP_DST_ADDR
                Type: IP_DST_ADDR (12)
                Length: 4
            Field (5/21): INPUT_SNMP
                Type: INPUT_SNMP (10)
                Length: 4
            Field (6/21): OUTPUT_SNMP
                Type: OUTPUT_SNMP (14)
                Length: 4
            Field (7/21): LAST_SWITCHED
                Type: LAST_SWITCHED (21)
                Length: 4
            Field (8/21): FIRST_SWITCHED
                Type: FIRST_SWITCHED (22)
                Length: 4
            Field (9/21): L4_SRC_PORT
                Type: L4_SRC_PORT (7)
                Length: 2
            Field (10/21): L4_DST_PORT
                Type: L4_DST_PORT (11)
                Length: 2
            Field (11/21): SRC_AS
                Type: SRC_AS (16)
                Length: 4
            Field (12/21): DST_AS
                Type: DST_AS (17)
                Length: 4
            Field (13/21): BGP_NEXT_HOP
                Type: BGP_NEXT_HOP (18)
                Length: 4
            Field (14/21): SRC_MASK
                Type: SRC_MASK (9)
                Length: 1
            Field (15/21): DST_MASK
                Type: DST_MASK (13)
                Length: 1
            Field (16/21): PROTOCOL
                Type: PROTOCOL (4)
                Length: 1
            Field (17/21): TCP_FLAGS
                Type: TCP_FLAGS (6)
                Length: 1
            Field (18/21): IP_TOS
                Type: IP_TOS (5)
                Length: 1
            Field (19/21): DIRECTION
                Type: DIRECTION (61)
                Length: 1
            Field (20/21): FORWARDING_STATUS
                Type: FORWARDING_STATUS (89)
                Length: 1
            Field (21/21): FLOW_SAMPLER_ID
                Type: FLOW_SAMPLER_ID (48)
                Length: 2
lspgn commented 5 years ago

Update of the protobuf: https://github.com/cloudflare/goflow/commit/40bba56a87b3a756246cdc997145f5473ddc3d75 Flow direction: https://github.com/cloudflare/goflow/commit/22783e3c1432f4a97c1e12c85d36b0dbd2c6922e

lspgn commented 5 years ago

Just merged into v2 / master. Closing this ticket.