robcowart / elastiflow

Network flow analytics (Netflow, sFlow and IPFIX) with the Elastic Stack

TOP talkers #128

Closed venki522 closed 5 years ago

venki522 commented 6 years ago

Hi Rob,

In the Top-N dashboards I see the top client as 0.169.0.200, which is weird. The same thing happens in Top Servers.

Is there any way I can see the actual client IP and server IP?

Thanks, Venkatesh

robcowart commented 6 years ago

There could be a few different issues. Can you share the full document data for one of the flow records where you see this?

venki522 commented 6 years ago

[screenshot attachment 67e17c9c-4699-4d4d-8799-3e3356f334eb]

robcowart commented 6 years ago

0.0.0.0/8 (i.e. any IP from 0.0.0.0 to 0.255.255.255) - Specified host on this network. It MUST NOT be sent, except as a source address as part of an initialization procedure by which the host learns its full IP address. Another common use for 0.0.0.0/8 is to designate ifIndex numbers, as for the "Link Data" value in OSPF, and for some BSD-ish interfaces.

I can't really see everything with that picture. You should copy/paste the text into the issue.

Slarty70 commented 6 years ago

Same problem here. Src & dst addresses from the 0.0.0.0/8 range, extremely high packet counts, and so on.

{ "_index": "elastiflow-3.0.3-2018.06.21", "_type": "doc", "_id": "v6ZrImQBQOWWMteh8qZi", "_version": 1, "_score": null, "_source": { "@version": "3.0.0", "tags": [], "event": { "host": "10.xx.xx.xx", "type": "netflow" }, "@timestamp": "2018-06-21T12:57:36.000Z", "node": { "hostname": "10.xx.xx.xx", "ipaddr": "10.xx.xx.xx" }, "netflow": { "first_switched": "2018-06-02T01:32:21.999Z", "flow_seq_num": 144038130, "flowset_id": 256, "last_switched": "2018-07-05T09:07:49.999Z", "version": 9, "flow_sampler_id": 1 }, "flow": { "dst_port_name": "HOPOPT/20566", "src_mask_len": 0, "next_hop": "178.237.36.10", "src_port_name": "HOPOPT/101", "src_addr": "0.0.0.60", "packets": 1969737420, "server_autonomous_system": "public", "src_hostname": "0.0.0.60", "bytes": 574554112, "service_port": "101", "service_name": "HOPOPT/101", "server_hostname": "0.0.0.60", "client_autonomous_system": "public", "autonomous_system": [ "public", "public" ], "dst_autonomous_system": "public", "direction": "undetermined", "tos": 162, "server_addr": "0.0.0.60", "dst_port": 20566, "client_addr": "0.0.0.1", "src_autonomous_system": "public", "ip_protocol": "HOPOPT", "dst_addr": "0.0.0.1", "ip_version": "IPv4", "src_port": 101, "input_snmp": 30055, "client_hostname": "0.0.0.1", "traffic_locality": "public", "dst_hostname": "0.0.0.1", "dst_mask_len": 6, "output_snmp": 52940 } }, "fields": { "netflow.first_switched": [ "2018-06-02T01:32:21.999Z" ], "@timestamp": [ "2018-06-21T12:57:36.000Z" ], "netflow.last_switched": [ "2018-07-05T09:07:49.999Z" ] }, "sort": [ 1529585856000 ] }

robcowart commented 6 years ago

@Slarty70 it looks like the flows are not being properly decoded by the Netflow codec. Can you provide a PCAP? You will need to make sure it runs long enough to capture a template. 5 mins or so should be enough.

Slarty70 commented 6 years ago

Maybe it's related to this one: https://quickview.cloudapps.cisco.com/quickview/bug/CSCvi23924

I do see "Expert Info (Warning/Malformed): Data (1392 bytes), no template found][Data (1392 bytes), no template found]" in the pcap.

28580 530.159954 10.22.22.25 → 10.22.22.253 CFLOW 1454 total: 28 (v9) records Obs-Domain-ID= 0 [Data-Template:256] [Data:256]
28581 530.160160 10.22.22.26 → 10.22.22.253 CFLOW 1446 total: 28 (v9) records Obs-Domain-ID= 0 [Data-Template:256] [Data:257]
28584 530.161522 10.22.22.25 → 10.22.22.253 CFLOW 1446 total: 28 (v9) records Obs-Domain-ID= 0 [Data-Template:257] [Data:256]
28586 530.195633 10.22.22.26 → 10.22.22.253 CFLOW 1454 total: 28 (v9) records Obs-Domain-ID= 0 [Data-Template:257] [Data:257]

Per exporter, the IDs for Data-Template and Data should be identical, right? A quick check for the overlap is sketched below.
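
A hedged sketch of that check: list the exporter address, source/observation domain ID, and template ID side by side for every template flowset in the capture. netflow.pcap is a placeholder filename, and the cflow field names are assumptions based on Wireshark's NetFlow dissector, so they may differ between versions.

# Print exporter IP, source ID, and template ID for each packet that
# carries a template flowset (field names are assumptions and may vary
# by Wireshark version).
tshark -r netflow.pcap -Y "cflow.template_id" -T fields -e ip.src -e cflow.source_id -e cflow.template_id

If both 10.22.22.25 and 10.22.22.26 show up with source ID 0 and template IDs 256 and 257, the collector has no way to tell their templates apart.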

robcowart commented 6 years ago

I need a PCAP to troubleshoot this further. The capture must run long enough to include a template - usually 3-5 minutes. The command to generate the PCAP with tcpdump is:

sudo tcpdump -i <INTERFACE> udp port <PORTNUM> -w netflow.pcap -vvv
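
Before sending the file, it may be worth checking that a template flowset actually made it into the capture. A hedged sketch, again assuming Wireshark's cflow field naming:

# Show the first few packets containing a template flowset; no output
# suggests the capture did not run long enough.
tshark -r netflow.pcap -Y "cflow.template_id" | head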

robcowart commented 6 years ago

@venki522 or @Slarty70, are either of you able to provide a PCAP for me to troubleshoot? You can also email it to me at: rob (at) koiossian (dot) com

Slarty70 commented 6 years ago

You've got mail...

Slarty70 commented 5 years ago

As I already mentioned above, both of my Cisco exporters are obviously overwriting each other's netflow templates: template IDs 256 and 257 are used by both. As soon as I shut down one netflow export, everything works fine. With both exports active, one export is decoded correctly and the other only produces crap. I haven't found a way to configure the template IDs, so the only solution is to use separate Logstash instances, I guess.
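
A minimal sketch of that workaround, assuming each exporter is reconfigured to send to its own collector port. The ports, paths, and stripped-down configs below are illustrative assumptions, not ElastiFlow's shipped pipeline:

# Each Logstash instance gets its own UDP input, and therefore its own
# netflow codec with an independent template cache.
cat > /etc/logstash-a/netflow.conf <<'EOF'
input { udp { port => 2055 codec => netflow } }
output { elasticsearch { hosts => ["127.0.0.1:9200"] } }
EOF

cat > /etc/logstash-b/netflow.conf <<'EOF'
input { udp { port => 2056 codec => netflow } }
output { elasticsearch { hosts => ["127.0.0.1:9200"] } }
EOF

# Separate path.data (and HTTP API ports) are required to run two
# instances side by side on the same host.
bin/logstash -f /etc/logstash-a/netflow.conf --path.data /var/lib/logstash-a --http.port 9600 &
bin/logstash -f /etc/logstash-b/netflow.conf --path.data /var/lib/logstash-b --http.port 9601 &

With one router exporting to 2055 and the other to 2056, each codec only ever sees one exporter's templates, so IDs 256 and 257 can no longer collide.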

robcowart commented 5 years ago

Unfortunately, running multiple Logstash instances is the only workaround in this scenario.