neptune-networks / flow-exporter

Export network flows from Kafka to Prometheus
https://brooks.sh/2019/11/17/network-flow-analysis-with-prometheus/
MIT License

kafka Failed to produce to topic pmacct.acct partition -1: Local: Queue full #11

Closed vishnubraj closed 4 years ago

vishnubraj commented 4 years ago

I am receiving the "Local: Queue full" error on the Kafka side, and I don't see any flow data when scraping the metrics URL.

There are extra fields in the JSON output; does that matter?

{"event_type": "purge", "tag": 0, "label": "EDGE", "as_src": 25003, "as_dst": 0, "peer_ip_src": "10.0.15.252", "iface_in": 38, "iface_out": 29, "ip_src": "1.1.1.1", "ip_dst": "2.2.2.2", "port_src": 50902, "port_dst": 12476, "tcp_flags": "2", "ip_proto": "tcp", "stamp_inserted": "2020-04-06 17:08:00", "stamp_updated": "2020-04-06 17:10:07", "flows": 1, "packets": 1, "bytes": 40, "writer_id": "default_kafka/16196"}
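(As a general point, extra keys in a JSON record are harmless to any consumer that picks out only the fields it needs; whether flow-exporter does exactly this is not confirmed here, but the idea can be sketched in Python using a trimmed copy of the record above:)

```python
import json

# Trimmed copy of the pmacct record from the comment above; keys such as
# "writer_id" are extra relative to what a flow consumer typically needs.
record = json.loads(
    '{"event_type": "purge", "as_src": 25003, "as_dst": 0, '
    '"ip_src": "1.1.1.1", "ip_dst": "2.2.2.2", "bytes": 40, '
    '"writer_id": "default_kafka/16196"}'
)

# Select only the fields of interest; any other keys are simply ignored.
wanted = ("as_src", "as_dst", "ip_src", "ip_dst", "bytes")
flow = {k: record[k] for k in wanted}
print(flow)
```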
paololucente commented 4 years ago

Hi @vishnubraj ,

You are likely hitting the case of a librdkafka queue that is too small for the amount of data you are pushing through it (it could be just a spike, or you may need to throw more resources at the Kafka broker; you will have to see).

To cover the case of a spike, you could read the following: https://github.com/pmacct/pmacct/blob/1.7.4/QUICKSTART#L1064-L1082

This will allow you to set a larger output queue. If this does not work and you get stuck in the same situation again, then throw more resources at your Kafka broker.
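(For reference, the QUICKSTART section linked above describes pointing pmacct at a librdkafka configuration file via kafka_config_file; the path and the queue value below are illustrative examples, not recommendations:)

! in the pmacct config:
kafka_config_file: /etc/pmacct/librdkafka.conf

! /etc/pmacct/librdkafka.conf, one "<where>, <key>, <value>" per line:
global, queue.buffering.max.messages, 8000000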

Paolo

vishnubraj commented 4 years ago

Thanks for the quick response, @paololucente. Let me try this config and get back. Meanwhile, below is my current pmacct config file:

daemonize: true
debug: false
syslog: local4
logfile: /data/log/messages
nfacctd_port: 9996

plugins: kafka
kafka_broker_host: 10.0.1.139
kafka_broker_port: 9092
kafka_topic: pmacct.acct
kafka_refresh_time: 60
kafka_history: 1m
kafka_history_roundoff: m
plugin_pipe_kafka_retry: 30
#plugin_pipe_kafka: true
kafka_cache_entries: 1164110

aggregate: tag, label, peer_src_ip, src_host, dst_host, src_port, dst_port, proto, src_as, dst_as, in_iface, out_iface, tcpflags, flows
core_proc_name: nfacct_kafka

plugin_buffer_size: 10240000
plugin_pipe_size: 1024000000

pre_tag_map: /etc/pmacct/pretag.map
refresh_maps: true
pre_tag_map_entries: 38400
vishnubraj commented 4 years ago

@paololucente the Kafka error messages are gone, but I still don't see the flow entries when I do:

curl http://localhost:9590/metrics
paololucente commented 4 years ago

@vishnubraj, wonderful that the errors are gone. That other issue is beyond me, as I am the maker of pmacct; @bswinnerton may perhaps help. Paolo

bswinnerton commented 4 years ago

Hi @vishnubraj, glad to see you were able to get data into Kafka. And thank you @paololucente for weighing in on the pmacct side!

@vishnubraj, would you mind sharing your Prometheus config? You'll want something similar to the following to ensure that Prometheus scrapes the data out of flow exporter:

global:
  scrape_interval:     15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'flow-exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['flow-exporter.fqdn.com:9590']

More info can be found here.

bswinnerton commented 4 years ago

Ah, re-reading your comment I realize I misunderstood where you were at in the process. I thought you meant Prometheus wasn't returning the information, but you're saying flow exporter isn't.

Do you have any logs from running flow exporter? It should look something like this:

flow-exporter    | time="2020-04-07T02:39:37Z" level=info msg="Fetching up to date AS database"
flow-exporter    | time="2020-04-07T02:39:43Z" level=info msg="Starting Kafka consumer"
flow-exporter    | time="2020-04-07T02:39:43Z" level=info msg="Starting Prometheus web server, available at: http://localhost:9590/metrics"
vishnubraj commented 4 years ago

Yes, @bswinnerton, I do get these logs, but when I curl the URL I don't get any flow-related metrics:

time="2020-04-07T05:48:25Z" level=info msg="Fetching up to date AS database"
time="2020-04-07T05:48:31Z" level=info msg="Starting Kafka consumer"
time="2020-04-07T05:48:31Z" level=info msg="Starting Prometheus web server, available at: http://localhost:9590/metrics"
bswinnerton commented 4 years ago

It looks like the AS data you're getting into Kafka may not be complete; I noticed in the snippet from your original comment that you were getting a 0 for the dst_as:

"as_src": 25003, "as_dst": 0,

To test this theory, you should be able to start the flow exporter with --asn=0 to see if you start to get metrics. If you do, you'll likely need to modify your pmacct configuration to use networks_file as described in the README to associate a particular prefix with your ASN. As an example, if you were Google you might see something like this in that file:

15169,8.8.8.0/24

Which will associate any address like 8.8.8.8 with your ASN, 15169. More information can be found here under the "pmacctd base configuration" section.
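(Conceptually, that file maps prefixes to ASNs; here is a minimal Python sketch of the mapping, where the helper and the one-entry list are illustrative and not pmacct's actual implementation:)

```python
import ipaddress

# Hypothetical networks_file contents: one "asn,prefix" entry per line,
# matching the example above.
networks = ["15169,8.8.8.0/24"]

def lookup_asn(ip, entries):
    """Return the ASN whose prefix contains ip, or 0 if none match."""
    addr = ipaddress.ip_address(ip)
    for line in entries:
        asn, prefix = line.split(",")
        if addr in ipaddress.ip_network(prefix):
            return int(asn)
    return 0

print(lookup_asn("8.8.8.8", networks))  # 15169
print(lookup_asn("1.1.1.1", networks))  # 0
```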

vishnubraj commented 4 years ago

@bswinnerton it's not working with the --asn=0 parameter. Let me check the pmacct config for networks_file and update here:

[:vishnu:root@opstest1.sjc2 /usr/local/bin]# ./flow-exporter --brokers=10.0.1.139:9092 --topic=pmacct.acct --asn=0
  -asn int
        The ASN being monitored
  -brokers string
        A comma separated list of Kafka brokers to connect to
  -partitions string
        The partitions to consume, can be 'all' or comma-separated numbers (default "all")
  -topic string
        The Kafka topic to consume from
[:vishnu:root@opstest1.sjc2 /usr/local/bin]#
vishnubraj commented 4 years ago

Thanks a lot, @bswinnerton!
It worked after I updated the networks_file with the proper ASN numbers.

bswinnerton commented 4 years ago

Awesome, glad you were able to figure that out, @vishnubraj. I went ahead and added the ability to run the application with --asn=0 in #12 for debugging in the future.