ntop / ntopng

Web-based Traffic and Security Network Traffic Monitoring
http://www.ntop.org
GNU General Public License v3.0

Pro dashboard Top Application Graph "striping" issue... #268

Closed dboehlke closed 8 years ago

dboehlke commented 8 years ago

Today the Top Applications graph on my dashboard started "striping". I have included screenshots to show the behavior:

[Screenshots: 2015-11-17 at 2:13:48 PM and 2015-11-17 at 2:13:28 PM]

The applications graphs in the "report" also show the "striping":

ntopng-report.pdf

I am not sure what started this; before today the graphs were solid.

I am running ntopng Professional v.2.1.151117 with nProbe v.7.3.151117. I have two nProbes running each collecting data from a different data center. I am collecting sFlow exports from Juniper EX series switches in those data centers.

Here are my configurations:

ntopng.conf:

$ cat /etc/ntopng/ntopng.conf
-G=/var/tmp/ntopng.pid
-d=/var/tmp/ntopng
-p=/etc/ntopng/protos.txt 
-i=tcp://10.60.59.14:5556
-i=tcp://10.60.59.14:5557
--dump-flows="es;ntopng;ntopng-%Y.%m.%d;http://localhost:9200/_bulk;"
--dns-mode=1
--sticky-hosts=none
--disable-login=1
--local-networks="2606:4A80::/32,192.168.0.0/24,172.16.0.0/16,10.0.0.0/8,38.81.66.0/23,209.208.232.0/23,209.208.241.0/24,209.208.250.0/24,50.93.246.0/23,50.93.255.0/24,162.222.47.0/24,216.17.8.0/24,38.92.136.0/24,162.222.40.0/21,162.222.40.0/23,162.222.46.0/24,103.8.239.0/24,149.5.7.0/24"
-w=3000

nprobe-usi-sflow.conf:

$ cat nprobe-usi-sflow.conf 
# Code42 "Local networks"
--local-networks="2606:4A80::/32,192.168.0.0/24,172.16.0.0/16,10.0.0.0/8,38.81.66.0/23,209.208.232.0/23,209.208.241.0/24,209.208.250.0/24,50.93.246.0/23,50.93.255.0/24,162.222.47.0/24,216.17.8.0/24,38.92.136.0/24,162.222.40.0/21,162.222.40.0/23,162.222.46.0/24,103.8.239.0/24,149.5.7.0/24"

# Use NETFLOWv9 when exporting flows to another application, add record formats for IPv6:
-V=9
-T="%IPV4_SRC_ADDR %IPV4_DST_ADDR %IPV4_NEXT_HOP %INPUT_SNMP %OUTPUT_SNMP %IN_PKTS %IN_BYTES %OUT_PKTS %OUT_BYTES %FIRST_SWITCHED %LAST_SWITCHED %L4_SRC_PORT %L4_DST_PORT %TCP_FLAGS %PROTOCOL %SRC_TOS %SRC_AS %DST_AS %IN_SRC_MAC %OUT_DST_MAC %IPV4_SRC_MASK %IPV4_DST_MASK %IPV6_SRC_ADDR %IPV6_SRC_MASK %IPV6_DST_ADDR %IPV6_DST_MASK %IPV6_NEXT_HOP %IP_PROTOCOL_VERSION %TOTAL_BYTES_EXP %TOTAL_PKTS_EXP %TOTAL_FLOWS_EXP "

# UDP port to collect sFlow from switches:
--collector-port=6343

# pid file location
-g=/var/tmp/nprobe-usi-sflow.pid

# zmq host
--zmq="tcp://10.60.59.14:5556"

# disable packet capture from interface:
-i=none

# don't export to netflow collector:
-n=none

nprobe-sea-sflow.conf:

$ cat nprobe-sea-sflow.conf 
# Code42 "Local networks"
--local-networks="2606:4A80::/32,192.168.0.0/24,172.16.0.0/16,10.0.0.0/8,38.81.66.0/23,209.208.232.0/23,209.208.241.0/24,209.208.250.0/24,50.93.246.0/23,50.93.255.0/24,162.222.47.0/24,216.17.8.0/24,38.92.136.0/24,162.222.40.0/21,162.222.40.0/23,162.222.46.0/24,103.8.239.0/24,149.5.7.0/24"

# Use NETFLOWv9 when exporting flows to another application, add record formats for IPv6:
-V=9
-T="%IPV4_SRC_ADDR %IPV4_DST_ADDR %IPV4_NEXT_HOP %INPUT_SNMP %OUTPUT_SNMP %IN_PKTS %IN_BYTES %OUT_PKTS %OUT_BYTES %FIRST_SWITCHED %LAST_SWITCHED %L4_SRC_PORT %L4_DST_PORT %TCP_FLAGS %PROTOCOL %SRC_TOS %SRC_AS %DST_AS %IN_SRC_MAC %OUT_DST_MAC %IPV4_SRC_MASK %IPV4_DST_MASK %IPV6_SRC_ADDR %IPV6_SRC_MASK %IPV6_DST_ADDR %IPV6_DST_MASK %IPV6_NEXT_HOP %IP_PROTOCOL_VERSION %TOTAL_BYTES_EXP %TOTAL_PKTS_EXP %TOTAL_FLOWS_EXP "

# UDP port to collect sFlow from switches:
--collector-port=6344

# pid file location
-g=/var/tmp/nprobe-sea-sflow.pid

# zmq host
--zmq="tcp://10.60.59.14:5557"

# disable packet capture from interface:
-i=none

# don't export to netflow collector:
-n=none
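As a quick sanity check on the setup above, the following is a minimal sketch (not part of ntopng or nProbe) that compares the ZMQ endpoints ntopng subscribes to with the ones each nProbe publishes. The function names are hypothetical, and the config strings are abbreviated excerpts of the files shown above; it assumes the `-i=tcp://...` and `--zmq="tcp://..."` line formats used in those files.

```python
import re
from typing import Dict, Optional, Set, Tuple


def ntopng_zmq_interfaces(conf_text: str) -> Set[str]:
    """Collect the tcp:// endpoints listed as -i=... lines in an ntopng config."""
    return set(re.findall(r"^-i=(tcp://\S+)\s*$", conf_text, flags=re.M))


def nprobe_zmq_endpoint(conf_text: str) -> Optional[str]:
    """Return the --zmq endpoint of an nProbe config, or None if absent."""
    m = re.search(r'^--zmq="?(tcp://[^"\s]+)"?\s*$', conf_text, flags=re.M)
    return m.group(1) if m else None


def compare(ntopng_conf: str, nprobe_confs: Dict[str, str]) -> Tuple[Set[str], Set[str]]:
    """Return (endpoints ntopng expects but no probe offers,
    endpoints a probe offers but ntopng does not list)."""
    wanted = ntopng_zmq_interfaces(ntopng_conf)
    offered = {e for e in map(nprobe_zmq_endpoint, nprobe_confs.values()) if e}
    return wanted - offered, offered - wanted


# Abbreviated excerpts of the three files shown above.
NTOPNG_CONF = """\
-G=/var/tmp/ntopng.pid
-i=tcp://10.60.59.14:5556
-i=tcp://10.60.59.14:5557
-w=3000
"""
NPROBE_CONFS = {
    "nprobe-usi-sflow.conf": '--collector-port=6343\n--zmq="tcp://10.60.59.14:5556"\n-i=none\n',
    "nprobe-sea-sflow.conf": '--collector-port=6344\n--zmq="tcp://10.60.59.14:5557"\n-i=none\n',
}

if __name__ == "__main__":
    missing, extra = compare(NTOPNG_CONF, NPROBE_CONFS)
    print("expected by ntopng but not published:", sorted(missing))
    print("published but not expected by ntopng:", sorted(extra))
```

For the files shown here both sets come out empty, so the endpoints line up.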

Thanks for taking a look.

simonemainardi commented 8 years ago

hello @dboehlke, thank you for reporting this issue.

When exactly did you start experiencing this issue? How long have you been using Professional v.2.1.151117? Did the striping issue appear right after an ntopng update? I need to understand whether the issue is tied to a particular release or is something that occurs sporadically.

Could you also please inspect ntopng and nprobe logs, in particular around the empty stripes?

dboehlke commented 8 years ago

Hi,

I have been applying many of the daily builds. The issue appears to have started on 16-Nov-2015 at about 1:00pm CST and has continued since. I am now running ntopng Professional v.2.1.151119. I have been running ntopng and nprobe for about six weeks now. I am testing and figuring out how best to deal with the scale of my data center traffic.

The issue started after applying the update, but I have also been having trouble with the sFlow collectors on my Juniper EX switches backing off the sampling rate until they are no longer sampling enough. I believe I have now found reasonable sampling rates for my traffic levels.

Neither ntopng nor nprobe logs much after startup:

ntopng logs for 11/19 and 11/20:

root@okc-msp:/var/log/ntopng# zcat ntopng.log-20151119.gz
18/Nov/2015 12:30:53 [main.cpp:37] Shutting down...
18/Nov/2015 12:30:54 [main.cpp:34] Ok I am leaving now
18/Nov/2015 12:30:55 [Prefs.cpp:659] Using ElasticSearch for data dump [ntopng][ntopng-%Y.%m.%d][http://localhost:9200/_bulk]
18/Nov/2015 12:30:55 [Prefs.cpp:610] All HTTP user login disabled
18/Nov/2015 12:30:55 [Ntop.cpp:933] Setting local networks to 2606:4A80::/32,192.168.0.0/24,172.16.0.0/16,10.0.0.0/8,38.81.66.0/23,209.208.232.0/23,209.208.241.0/24,209.208.250.0/24,50.93.246.0/23,50.93.255.0/24,162.222.47.0/24,216.17.8.0/24,38.92.136.0/24,162.222.40.0/21,162.222.40.0/23,162.222.46.0/24,103.8.239.0/24,149.5.7.0/24
18/Nov/2015 12:30:55 [Redis.cpp:106] Successfully connected to redis 127.0.0.1:6379@0
18/Nov/2015 12:30:55 [NtopPro.cpp:120] [LICENSE] Reading license from /etc/ntopng.license
18/Nov/2015 12:30:55 [Ntop.cpp:1152] Registered interface tcp://10.60.59.14:5556 [id: 1]
18/Nov/2015 12:30:55 [Ntop.cpp:1152] Registered interface tcp://10.60.59.14:5557 [id: 11]
18/Nov/2015 12:30:55 [Ntop.cpp:1165] Registered interface view tcp://10.60.59.14:5556 [id: 1]
18/Nov/2015 12:30:55 [Ntop.cpp:1165] Registered interface view tcp://10.60.59.14:5557 [id: 11]
18/Nov/2015 12:30:55 [Utils.cpp:304] User changed to nobody
18/Nov/2015 12:30:55 [main.cpp:240] PID stored in file /var/tmp/ntopng.pid
18/Nov/2015 12:30:55 [HTTPserver.cpp:458] Please read https://github.com/ntop/ntopng/blob/dev/doc/README.SSL if you want to enable SSL.
18/Nov/2015 12:30:55 [HTTPserver.cpp:501] Web server dirs [/usr/share/ntopng/httpdocs][/usr/share/ntopng/scripts]
18/Nov/2015 12:30:55 [HTTPserver.cpp:504] HTTP server listening on port 3000
18/Nov/2015 12:30:55 [main.cpp:290] Working directory: /var/tmp/ntopng
18/Nov/2015 12:30:55 [main.cpp:292] Scripts/HTML pages directory: /usr/share/ntopng
18/Nov/2015 12:30:55 [Ntop.cpp:260] Welcome to ntopng x86_64 v.2.1.151118 - (C) 1998-15 ntop.org
18/Nov/2015 12:30:55 [Ntop.cpp:265] Built on Ubuntu 12.04.5 LTS
18/Nov/2015 12:30:55 [PeriodicActivities.cpp:53] Started periodic activities loop...
18/Nov/2015 12:30:55 [RuntimePrefs.cpp:32] Dumping alerts into syslog
18/Nov/2015 12:30:55 [NtopPro.cpp:234] [LICENSE] ntopng systemId: CDD6A54E9206AB23
18/Nov/2015 12:30:55 [NtopPro.cpp:245] [LICENSE] ntopng license: DCF4E7A8AFF4F27B49E12B8E19B2D51A1475685990201C735E
18/Nov/2015 12:30:55 [NtopPro.cpp:266] [LICENSE] Maintenance is available until Wed Oct  5 11:46:30 2016 [321 days left]
18/Nov/2015 12:30:55 [NetworkInterface.cpp:1427] Started packet polling on interface tcp://10.60.59.14:5556 [id: 1]...
18/Nov/2015 12:30:55 [NetworkInterface.cpp:1427] Started packet polling on interface tcp://10.60.59.14:5557 [id: 11]...
18/Nov/2015 12:30:56 [CollectorInterface.cpp:94] Collecting flows on tcp://10.60.59.14:5556
18/Nov/2015 12:30:56 [CollectorInterface.cpp:94] Collecting flows on tcp://10.60.59.14:5557
18/Nov/2015 12:43:42 [Flow.cpp:1325] WARNING: JSON Parse error: { "15": "0.0.0.0", "10": "583", "14": "597", "5": "0", "16": "62715", "17": "21554", "9": "0", "13": "0", "29": "0", "30": "0", "60": "4", "40": "0", "41": "0", "42": "184829" }
root@okc-msp:/var/log/ntopng# zcat ntopng.log-20151120.gz
19/Nov/2015 12:46:51 [main.cpp:37] Shutting down...
19/Nov/2015 12:46:52 [main.cpp:34] Ok I am leaving now
19/Nov/2015 12:46:53 [Prefs.cpp:659] Using ElasticSearch for data dump [ntopng][ntopng-%Y.%m.%d][http://localhost:9200/_bulk]
19/Nov/2015 12:46:53 [Prefs.cpp:610] All HTTP user login disabled
19/Nov/2015 12:46:53 [Ntop.cpp:933] Setting local networks to 2606:4A80::/32,192.168.0.0/24,172.16.0.0/16,10.0.0.0/8,38.81.66.0/23,209.208.232.0/23,209.208.241.0/24,209.208.250.0/24,50.93.246.0/23,50.93.255.0/24,162.222.47.0/24,216.17.8.0/24,38.92.136.0/24,162.222.40.0/21,162.222.40.0/23,162.222.46.0/24,103.8.239.0/24,149.5.7.0/24
19/Nov/2015 12:46:53 [Redis.cpp:106] Successfully connected to redis 127.0.0.1:6379@0
19/Nov/2015 12:46:53 [NtopPro.cpp:120] [LICENSE] Reading license from /etc/ntopng.license
19/Nov/2015 12:46:53 [Ntop.cpp:1152] Registered interface tcp://10.60.59.14:5556 [id: 1]
19/Nov/2015 12:46:53 [Ntop.cpp:1152] Registered interface tcp://10.60.59.14:5557 [id: 11]
19/Nov/2015 12:46:53 [Ntop.cpp:1165] Registered interface view tcp://10.60.59.14:5556 [id: 1]
19/Nov/2015 12:46:53 [Ntop.cpp:1165] Registered interface view tcp://10.60.59.14:5557 [id: 11]
19/Nov/2015 12:46:53 [Utils.cpp:304] User changed to nobody
19/Nov/2015 12:46:53 [main.cpp:240] PID stored in file /var/tmp/ntopng.pid
19/Nov/2015 12:46:53 [HTTPserver.cpp:458] Please read https://github.com/ntop/ntopng/blob/dev/doc/README.SSL if you want to enable SSL.
19/Nov/2015 12:46:53 [HTTPserver.cpp:501] Web server dirs [/usr/share/ntopng/httpdocs][/usr/share/ntopng/scripts]
19/Nov/2015 12:46:53 [HTTPserver.cpp:504] HTTP server listening on port 3000
19/Nov/2015 12:46:53 [main.cpp:290] Working directory: /var/tmp/ntopng
19/Nov/2015 12:46:53 [main.cpp:292] Scripts/HTML pages directory: /usr/share/ntopng
19/Nov/2015 12:46:53 [Ntop.cpp:260] Welcome to ntopng x86_64 v.2.1.151119 - (C) 1998-15 ntop.org
19/Nov/2015 12:46:53 [Ntop.cpp:265] Built on Ubuntu 12.04.5 LTS
19/Nov/2015 12:46:53 [PeriodicActivities.cpp:53] Started periodic activities loop...
19/Nov/2015 12:46:53 [RuntimePrefs.cpp:32] Dumping alerts into syslog
19/Nov/2015 12:46:53 [NtopPro.cpp:234] [LICENSE] ntopng systemId: CDD6A54E9206AB23
19/Nov/2015 12:46:53 [NtopPro.cpp:245] [LICENSE] ntopng license: DCF4E7A8AFF4F27B49E12B8E19B2D51A1475685990201C735E
19/Nov/2015 12:46:53 [NtopPro.cpp:266] [LICENSE] Maintenance is available until Wed Oct  5 11:46:30 2016 [320 days left]
19/Nov/2015 12:46:53 [NetworkInterface.cpp:1427] Started packet polling on interface tcp://10.60.59.14:5556 [id: 1]...
19/Nov/2015 12:46:53 [NetworkInterface.cpp:1427] Started packet polling on interface tcp://10.60.59.14:5557 [id: 11]...
19/Nov/2015 12:46:54 [CollectorInterface.cpp:94] Collecting flows on tcp://10.60.59.14:5556
19/Nov/2015 12:46:54 [CollectorInterface.cpp:94] Collecting flows on tcp://10.60.59.14:5557
root@okc-msp:/var/log/ntopng# cat ntopng.log
root@okc-msp:/var/log/ntopng#

SEA nprobe logs for 11/19 and 11/20:

root@nkc-msp:/var/log/nprobe# zcat nprobe-sea-sflow@0.log-20151119.gz
18/Nov/2015 12:31:31 [nprobe.c:3176] Valid nProbe license found
18/Nov/2015 12:31:31 [nprobe.c:4570] WARNING: The output interfaceId is set to 0: did you forget to use -Q perhaps ?
18/Nov/2015 12:31:31 [nprobe.c:4573] WARNING: The input interfaceId is set to 0: did you forget to use -u perhaps ?
18/Nov/2015 12:31:31 [nprobe.c:4651] Welcome to nProbe v.7.3.151118 ($Revision: 4688 $) for x86_64-unknown-linux-gnu with native PF_RING acceleration
18/Nov/2015 12:31:31 [nprobe.c:4661] Running on Ubuntu 12.04.5 LTS
18/Nov/2015 12:31:31 [nprobe.c:4672] [LICENSE] nProbe SystemId: CDD6CC6E9206AB23
18/Nov/2015 12:31:31 [nprobe.c:4683] [LICENSE] nProbe License:  80E1A5B213F8BB14907751DCAAC399A91475685586616C0457
18/Nov/2015 12:31:31 [nprobe.c:4686] [LICENSE] nProbe Edition:  Standard [without PF_RING Acceleration]
18/Nov/2015 12:31:31 [nprobe.c:4716] [LICENSE] Maintenance is available until Wed Oct  5 11:39:46 2016 [321 days left]
18/Nov/2015 12:31:31 [nprobe.c:6680] Welcome to nProbe v.7.3.151118 for x86_64-unknown-linux-gnu
18/Nov/2015 12:31:31 [nprobe.c:5938] Using NetFlow Packet Payload Len: 1472
18/Nov/2015 12:31:31 [plugin.c:1005] 0 plugin(s) enabled
18/Nov/2015 12:31:31 [nprobe.c:6335] Each flow is 118 bytes long
18/Nov/2015 12:31:31 [nprobe.c:6336] The # packets per flow has been set to 11
18/Nov/2015 12:31:31 [util.c:431] GeoIP: loaded AS config file /usr/share/ntopng/httpdocs/geoip/GeoIPASNum.dat
18/Nov/2015 12:31:31 [util.c:441] GeoIP: loaded AS IPv6 config file /usr/share/ntopng/httpdocs/geoip/GeoIPASNumv6.dat
18/Nov/2015 12:31:31 [nprobe.c:5223] Using packet capture length 128
18/Nov/2015 12:31:31 [nprobe.c:6982] Not capturing packet from interface (collector mode)
18/Nov/2015 12:31:31 [util.c:4011] Succesfully created ZMQ endpoint tcp://10.60.59.14:5557
18/Nov/2015 12:31:31 [collect.c:145] Flow collector listening on port 6344 (IPv4/v6)
18/Nov/2015 12:31:31 [nprobe.c:7194] nProbe started successfully
root@nkc-msp:/var/log/nprobe# zcat nprobe-sea-sflow@0.log-20151120.gz
19/Nov/2015 12:47:57 [nprobe.c:3176] Valid nProbe license found
19/Nov/2015 12:47:57 [nprobe.c:4570] WARNING: The output interfaceId is set to 0: did you forget to use -Q perhaps ?
19/Nov/2015 12:47:57 [nprobe.c:4573] WARNING: The input interfaceId is set to 0: did you forget to use -u perhaps ?
19/Nov/2015 12:47:57 [nprobe.c:4651] Welcome to nProbe v.7.3.151119 ($Revision: 4698 $) for x86_64-unknown-linux-gnu with native PF_RING acceleration
19/Nov/2015 12:47:57 [nprobe.c:4661] Running on Ubuntu 12.04.5 LTS
19/Nov/2015 12:47:57 [nprobe.c:4672] [LICENSE] nProbe SystemId: CDD6CC6E9206AB23
19/Nov/2015 12:47:57 [nprobe.c:4683] [LICENSE] nProbe License:  80E1A5B213F8BB14907751DCAAC399A91475685586616C0457
19/Nov/2015 12:47:57 [nprobe.c:4686] [LICENSE] nProbe Edition:  Standard [without PF_RING Acceleration]
19/Nov/2015 12:47:57 [nprobe.c:4716] [LICENSE] Maintenance is available until Wed Oct  5 11:39:46 2016 [320 days left]
19/Nov/2015 12:47:57 [nprobe.c:6679] Welcome to nProbe v.7.3.151119 for x86_64-unknown-linux-gnu
19/Nov/2015 12:47:57 [nprobe.c:5937] Using NetFlow Packet Payload Len: 1472
19/Nov/2015 12:47:57 [plugin.c:1005] 0 plugin(s) enabled
19/Nov/2015 12:47:57 [nprobe.c:6334] Each flow is 118 bytes long
19/Nov/2015 12:47:57 [nprobe.c:6335] The # packets per flow has been set to 11
19/Nov/2015 12:47:57 [util.c:431] GeoIP: loaded AS config file /usr/share/ntopng/httpdocs/geoip/GeoIPASNum.dat
19/Nov/2015 12:47:57 [util.c:441] GeoIP: loaded AS IPv6 config file /usr/share/ntopng/httpdocs/geoip/GeoIPASNumv6.dat
19/Nov/2015 12:47:57 [nprobe.c:5223] Using packet capture length 128
19/Nov/2015 12:47:57 [nprobe.c:6981] Not capturing packet from interface (collector mode)
19/Nov/2015 12:47:57 [util.c:4011] Succesfully created ZMQ endpoint tcp://10.60.59.14:5557
19/Nov/2015 12:47:57 [collect.c:145] Flow collector listening on port 6344 (IPv4/v6)
19/Nov/2015 12:47:57 [nprobe.c:7193] nProbe started successfully
19/Nov/2015 18:40:39 [nprobe.c:1003] WARNING: Unknown TCP option 14 received: malformed packet ? [135.84.215.70:41697 -> 149.40.222.162:1016][i: 0]
root@nkc-msp:/var/log/nprobe# cat nprobe-sea-sflow@0.log
root@nkc-msp:/var/log/nprobe#

MSP/USI nprobe logs for 11/19 and 11/20:

root@nkc-msp:/var/log/nprobe# zcat nprobe-usi-sflow@0.log-20151119.gz
18/Nov/2015 12:31:31 [nprobe.c:3176] Valid nProbe license found
18/Nov/2015 12:31:31 [nprobe.c:4570] WARNING: The output interfaceId is set to 0: did you forget to use -Q perhaps ?
18/Nov/2015 12:31:31 [nprobe.c:4573] WARNING: The input interfaceId is set to 0: did you forget to use -u perhaps ?
18/Nov/2015 12:31:31 [nprobe.c:4651] Welcome to nProbe v.7.3.151118 ($Revision: 4688 $) for x86_64-unknown-linux-gnu with native PF_RING acceleration
18/Nov/2015 12:31:31 [nprobe.c:4661] Running on Ubuntu 12.04.5 LTS
18/Nov/2015 12:31:31 [nprobe.c:4672] [LICENSE] nProbe SystemId: CDD6CC6E9206AB23
18/Nov/2015 12:31:31 [nprobe.c:4683] [LICENSE] nProbe License:  80E1A5B213F8BB14907751DCAAC399A91475685586616C0457
18/Nov/2015 12:31:31 [nprobe.c:4686] [LICENSE] nProbe Edition:  Standard [without PF_RING Acceleration]
18/Nov/2015 12:31:31 [nprobe.c:4716] [LICENSE] Maintenance is available until Wed Oct  5 11:39:46 2016 [321 days left]
18/Nov/2015 12:31:31 [nprobe.c:6680] Welcome to nProbe v.7.3.151118 for x86_64-unknown-linux-gnu
18/Nov/2015 12:31:31 [nprobe.c:5938] Using NetFlow Packet Payload Len: 1472
18/Nov/2015 12:31:31 [plugin.c:1005] 0 plugin(s) enabled
18/Nov/2015 12:31:31 [nprobe.c:6335] Each flow is 118 bytes long
18/Nov/2015 12:31:31 [nprobe.c:6336] The # packets per flow has been set to 11
18/Nov/2015 12:31:31 [util.c:431] GeoIP: loaded AS config file /usr/share/ntopng/httpdocs/geoip/GeoIPASNum.dat
18/Nov/2015 12:31:31 [util.c:441] GeoIP: loaded AS IPv6 config file /usr/share/ntopng/httpdocs/geoip/GeoIPASNumv6.dat
18/Nov/2015 12:31:31 [nprobe.c:5223] Using packet capture length 128
18/Nov/2015 12:31:31 [nprobe.c:6982] Not capturing packet from interface (collector mode)
18/Nov/2015 12:31:31 [util.c:4011] Succesfully created ZMQ endpoint tcp://10.60.59.14:5556
18/Nov/2015 12:31:31 [collect.c:145] Flow collector listening on port 6343 (IPv4/v6)
18/Nov/2015 12:31:31 [nprobe.c:7194] nProbe started successfully
root@nkc-msp:/var/log/nprobe# zcat nprobe-usi-sflow@0.log-20151120.gz
19/Nov/2015 12:47:57 [nprobe.c:3176] Valid nProbe license found
19/Nov/2015 12:47:57 [nprobe.c:4570] WARNING: The output interfaceId is set to 0: did you forget to use -Q perhaps ?
19/Nov/2015 12:47:57 [nprobe.c:4573] WARNING: The input interfaceId is set to 0: did you forget to use -u perhaps ?
19/Nov/2015 12:47:57 [nprobe.c:4651] Welcome to nProbe v.7.3.151119 ($Revision: 4698 $) for x86_64-unknown-linux-gnu with native PF_RING acceleration
19/Nov/2015 12:47:57 [nprobe.c:4661] Running on Ubuntu 12.04.5 LTS
19/Nov/2015 12:47:57 [nprobe.c:4672] [LICENSE] nProbe SystemId: CDD6CC6E9206AB23
19/Nov/2015 12:47:57 [nprobe.c:4683] [LICENSE] nProbe License:  80E1A5B213F8BB14907751DCAAC399A91475685586616C0457
19/Nov/2015 12:47:57 [nprobe.c:4686] [LICENSE] nProbe Edition:  Standard [without PF_RING Acceleration]
19/Nov/2015 12:47:57 [nprobe.c:4716] [LICENSE] Maintenance is available until Wed Oct  5 11:39:46 2016 [320 days left]
19/Nov/2015 12:47:57 [nprobe.c:6679] Welcome to nProbe v.7.3.151119 for x86_64-unknown-linux-gnu
19/Nov/2015 12:47:57 [nprobe.c:5937] Using NetFlow Packet Payload Len: 1472
19/Nov/2015 12:47:57 [plugin.c:1005] 0 plugin(s) enabled
19/Nov/2015 12:47:57 [nprobe.c:6334] Each flow is 118 bytes long
19/Nov/2015 12:47:57 [nprobe.c:6335] The # packets per flow has been set to 11
19/Nov/2015 12:47:57 [util.c:431] GeoIP: loaded AS config file /usr/share/ntopng/httpdocs/geoip/GeoIPASNum.dat
19/Nov/2015 12:47:57 [util.c:441] GeoIP: loaded AS IPv6 config file /usr/share/ntopng/httpdocs/geoip/GeoIPASNumv6.dat
19/Nov/2015 12:47:57 [nprobe.c:5223] Using packet capture length 128
19/Nov/2015 12:47:57 [nprobe.c:6981] Not capturing packet from interface (collector mode)
19/Nov/2015 12:47:57 [util.c:4011] Succesfully created ZMQ endpoint tcp://10.60.59.14:5556
19/Nov/2015 12:47:57 [collect.c:145] Flow collector listening on port 6343 (IPv4/v6)
19/Nov/2015 12:47:57 [nprobe.c:7193] nProbe started successfully
root@nkc-msp:/var/log/nprobe# cat nprobe-usi-sflow@0.log
root@nkc-msp:/var/log/nprobe#

My one thought is to clear the counters on the Juniper switches. Perhaps my port stats have grown to the point where the math used to build the graph produces results that exceed the maximum for the data type. I had an earlier problem with numbers in exponential notation being passed to rrdtool, but that produced errors in the log. I have held off on clearing the counters until I hear back from you.

The previous problem affected the traffic graph as well as the application graph; this time only the application graph is affected.

You can see when it started in this report. Something happened over the weekend before Nov-16 that caused a gap, and then the striping started mid-day, which is probably when I installed the update:

Welcome to ntopng.pdf

Let me know if there is any additional debugging I should turn on, and whether you think clearing the counters would be a useful data point. I would hate to clear them and have the problem go away, only to return when the numbers get large again.

Thanks.

dboehlke commented 8 years ago

We had a reason to clear the counters on the switches running sflow today to diagnose a problem with errors on one of the interfaces.

Clearing the counters on the switches had no effect on the "striping" issue.

simonemainardi commented 8 years ago

@dboehlke thanks for reporting, and for providing such detailed information.

I went through all git changes we made from Nov 16 until now. They should not be the cause of your issue.

It may be something related to either data collection or data visualization. The fact that the stripes occur at fairly regular intervals may be evidence of a data collection issue -- e.g., due to periodic sampling or overflows.
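To make the overflow hypothesis concrete, here is a minimal sketch of how a wrapping counter can yield a nonsensical rate if the difference between two readings is computed naively. The 32-bit width and the sample readings are assumptions chosen for illustration; nothing in this thread confirms that such a wrap is happening in ntopng, nProbe, or the switches.

```python
def naive_delta(prev: int, curr: int) -> int:
    """Plain subtraction: goes negative when the counter wrapped between reads."""
    return curr - prev


def wrap_aware_delta(prev: int, curr: int, bits: int = 32) -> int:
    """Difference between two readings of a counter that wraps at 2**bits.

    Assumes at most one wrap happened between the two readings.
    """
    return (curr - prev) % (1 << bits)


if __name__ == "__main__":
    # Hypothetical readings taken just before and just after a 32-bit wrap.
    prev, curr = 4294967000, 500
    print(naive_delta(prev, curr))       # -4294966500
    print(wrap_aware_delta(prev, curr))  # 796
```

A consumer that rejects negative or out-of-range deltas would record such an interval as unknown, which would show up as a hole in the graph.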

To try to figure it out, I suggest manually inspecting an RRD database to see whether the same 'holes' appear there. RRDs are just files, and you can find them under /var/tmp/ntopng/<interface_id>/rrd. To determine the integer <interface_id>, hover the mouse over an interface name in the web GUI menu 'Interfaces' and look at the hyperlink. Then navigate to the corresponding rrd folder, choose any suspicious RRD (e.g., SSL.rrd), and inspect it with the rrdtool dump utility, e.g.,

rrdtool dump /var/tmp/ntopng/4/rrd/SSL.rrd | less

Example contents:

                        <!-- 2015-11-23 18:25:00 CET / 1448299500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 18:30:00 CET / 1448299800 --> <row><v>6.0762952110e+02</v></row>
                        <!-- 2015-11-23 18:35:00 CET / 1448300100 --> <row><v>6.0762952110e+02</v></row>
                        <!-- 2015-11-23 18:40:00 CET / 1448300400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 18:45:00 CET / 1448300700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 18:50:00 CET / 1448301000 --> <row><v>1.7198899297e+03</v></row>

Once you can browse RRD contents, you should compare them with the web gui to see if holes match or not.
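Scanning a long dump by eye is tedious, so here is a minimal sketch that groups consecutive NaN rows from `rrdtool dump` output into gaps that can be compared against the stripes in the web GUI. It assumes rows in the format shown above and reads only the first `<v>` of each row; the function names are hypothetical, and the sample is the excerpt above.

```python
import re
from typing import List, Tuple

# Matches rows such as:
#   <!-- 2015-11-23 18:25:00 CET / 1448299500 --> <row><v>NaN</v></row>
ROW_RE = re.compile(r"<!--\s*(.+?)\s*/\s*(\d+)\s*-->\s*<row><v>\s*([^<\s]+)\s*</v>")


def parse_rows(dump_text: str) -> List[Tuple[int, str, bool]]:
    """Return (epoch, human-readable label, is_nan) for each data row."""
    rows = []
    for line in dump_text.splitlines():
        m = ROW_RE.search(line)
        if m:
            label, epoch, value = m.groups()
            rows.append((int(epoch), label, value.lower() == "nan"))
    return rows


def nan_gaps(rows: List[Tuple[int, str, bool]]) -> List[Tuple[int, int, int]]:
    """Group consecutive NaN rows into (first_epoch, last_epoch, row_count)."""
    gaps = []
    start = last = None
    count = 0
    for epoch, _label, is_nan in rows:
        if is_nan:
            if start is None:
                start, count = epoch, 0
            last = epoch
            count += 1
        elif start is not None:
            gaps.append((start, last, count))
            start = None
    if start is not None:  # dump ended inside a gap
        gaps.append((start, last, count))
    return gaps


SAMPLE = """\
    <!-- 2015-11-23 18:25:00 CET / 1448299500 --> <row><v>NaN</v></row>
    <!-- 2015-11-23 18:30:00 CET / 1448299800 --> <row><v>6.0762952110e+02</v></row>
    <!-- 2015-11-23 18:35:00 CET / 1448300100 --> <row><v>6.0762952110e+02</v></row>
    <!-- 2015-11-23 18:40:00 CET / 1448300400 --> <row><v>NaN</v></row>
    <!-- 2015-11-23 18:45:00 CET / 1448300700 --> <row><v>NaN</v></row>
    <!-- 2015-11-23 18:50:00 CET / 1448301000 --> <row><v>1.7198899297e+03</v></row>
"""

if __name__ == "__main__":
    for first, last, count in nan_gaps(parse_rows(SAMPLE)):
        print(f"gap from {first} to {last}: {count} row(s)")
```

To run it against a real file, the dump text could be captured first, for example by redirecting `rrdtool dump /var/tmp/ntopng/4/rrd/SSL.rrd` to a file and reading that file in place of `SAMPLE`.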

thank you.

Please let me know.

Simone

dboehlke commented 8 years ago

Simone,

Thank you for going over the changes and for the testing instructions. I looked at the SSL.rrd file for interface 1 and compared it to the graph of the last day's application traffic; they appear to match.

[Screenshot: msp-usi app graph]

                        <!-- 2015-11-22 14:20:00 CST / 1448223600 --> <row><v>7.7297340009e+08</v></row>
                        <!-- 2015-11-22 14:25:00 CST / 1448223900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 14:30:00 CST / 1448224200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 14:35:00 CST / 1448224500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 14:40:00 CST / 1448224800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 14:45:00 CST / 1448225100 --> <row><v>6.9137478355e+08</v></row>
                        <!-- 2015-11-22 14:50:00 CST / 1448225400 --> <row><v>6.9137478355e+08</v></row>
                        <!-- 2015-11-22 14:55:00 CST / 1448225700 --> <row><v>6.9130475651e+08</v></row>
                        <!-- 2015-11-22 15:00:00 CST / 1448226000 --> <row><v>6.9130475651e+08</v></row>
                        <!-- 2015-11-22 15:05:00 CST / 1448226300 --> <row><v>6.9433222121e+08</v></row>
                        <!-- 2015-11-22 15:10:00 CST / 1448226600 --> <row><v>6.9433222121e+08</v></row>
                        <!-- 2015-11-22 15:15:00 CST / 1448226900 --> <row><v>6.6790992338e+08</v></row>
                        <!-- 2015-11-22 15:20:00 CST / 1448227200 --> <row><v>6.6790992338e+08</v></row>
                        <!-- 2015-11-22 15:25:00 CST / 1448227500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 15:30:00 CST / 1448227800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 15:35:00 CST / 1448228100 --> <row><v>6.7246585439e+08</v></row>
                        <!-- 2015-11-22 15:40:00 CST / 1448228400 --> <row><v>6.7246585439e+08</v></row>
                        <!-- 2015-11-22 15:45:00 CST / 1448228700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 15:50:00 CST / 1448229000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 15:55:00 CST / 1448229300 --> <row><v>6.7003152653e+08</v></row>
                        <!-- 2015-11-22 16:00:00 CST / 1448229600 --> <row><v>6.7003152653e+08</v></row>
                        <!-- 2015-11-22 16:05:00 CST / 1448229900 --> <row><v>6.7005782398e+08</v></row>
                        <!-- 2015-11-22 16:10:00 CST / 1448230200 --> <row><v>6.7005782398e+08</v></row>
                        <!-- 2015-11-22 16:15:00 CST / 1448230500 --> <row><v>6.4372297832e+08</v></row>
                        <!-- 2015-11-22 16:20:00 CST / 1448230800 --> <row><v>6.4372297832e+08</v></row>
                        <!-- 2015-11-22 16:25:00 CST / 1448231100 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 16:30:00 CST / 1448231400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 16:35:00 CST / 1448231700 --> <row><v>6.3393175138e+08</v></row>
                        <!-- 2015-11-22 16:40:00 CST / 1448232000 --> <row><v>6.3393175138e+08</v></row>
                        <!-- 2015-11-22 16:45:00 CST / 1448232300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 16:50:00 CST / 1448232600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 16:55:00 CST / 1448232900 --> <row><v>6.5664919636e+08</v></row>
                        <!-- 2015-11-22 17:00:00 CST / 1448233200 --> <row><v>6.5664919636e+08</v></row>
                        <!-- 2015-11-22 17:05:00 CST / 1448233500 --> <row><v>6.6112467513e+08</v></row>
                        <!-- 2015-11-22 17:10:00 CST / 1448233800 --> <row><v>6.6112467513e+08</v></row>
                        <!-- 2015-11-22 17:15:00 CST / 1448234100 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 17:20:00 CST / 1448234400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 17:25:00 CST / 1448234700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 17:30:00 CST / 1448235000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 17:35:00 CST / 1448235300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 17:40:00 CST / 1448235600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 17:45:00 CST / 1448235900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 17:50:00 CST / 1448236200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 17:55:00 CST / 1448236500 --> <row><v>6.4616070969e+08</v></row>
                        <!-- 2015-11-22 18:00:00 CST / 1448236800 --> <row><v>6.4616070969e+08</v></row>
                        <!-- 2015-11-22 18:05:00 CST / 1448237100 --> <row><v>6.4673190341e+08</v></row>
                        <!-- 2015-11-22 18:10:00 CST / 1448237400 --> <row><v>6.4673190341e+08</v></row>
                        <!-- 2015-11-22 18:15:00 CST / 1448237700 --> <row><v>6.6103056491e+08</v></row>
                        <!-- 2015-11-22 18:20:00 CST / 1448238000 --> <row><v>6.6103056491e+08</v></row>
                        <!-- 2015-11-22 18:25:00 CST / 1448238300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 18:30:00 CST / 1448238600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 18:35:00 CST / 1448238900 --> <row><v>6.7495008384e+08</v></row>
                        <!-- 2015-11-22 18:40:00 CST / 1448239200 --> <row><v>6.7495008384e+08</v></row>
                        <!-- 2015-11-22 18:45:00 CST / 1448239500 --> <row><v>6.9613794082e+08</v></row>
                        <!-- 2015-11-22 18:50:00 CST / 1448239800 --> <row><v>6.9613794082e+08</v></row>
                        <!-- 2015-11-22 18:55:00 CST / 1448240100 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 19:00:00 CST / 1448240400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 19:05:00 CST / 1448240700 --> <row><v>6.8382675924e+08</v></row>
                        <!-- 2015-11-22 19:10:00 CST / 1448241000 --> <row><v>6.8382675924e+08</v></row>
                        <!-- 2015-11-22 19:15:00 CST / 1448241300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 19:20:00 CST / 1448241600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 19:25:00 CST / 1448241900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 19:30:00 CST / 1448242200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 19:35:00 CST / 1448242500 --> <row><v>6.5720004578e+08</v></row>
                        <!-- 2015-11-22 19:40:00 CST / 1448242800 --> <row><v>6.5720004578e+08</v></row>
                        <!-- 2015-11-22 19:45:00 CST / 1448243100 --> <row><v>6.6967304765e+08</v></row>
                        <!-- 2015-11-22 19:50:00 CST / 1448243400 --> <row><v>6.6967304765e+08</v></row>
                        <!-- 2015-11-22 19:55:00 CST / 1448243700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 20:00:00 CST / 1448244000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 20:05:00 CST / 1448244300 --> <row><v>6.7181950186e+08</v></row>
                        <!-- 2015-11-22 20:10:00 CST / 1448244600 --> <row><v>6.7181950186e+08</v></row>
                        <!-- 2015-11-22 20:15:00 CST / 1448244900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 20:20:00 CST / 1448245200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 20:25:00 CST / 1448245500 --> <row><v>6.8101330761e+08</v></row>
                        <!-- 2015-11-22 20:30:00 CST / 1448245800 --> <row><v>6.8101330761e+08</v></row>
                        <!-- 2015-11-22 20:35:00 CST / 1448246100 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 20:40:00 CST / 1448246400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 20:45:00 CST / 1448246700 --> <row><v>6.6537381533e+08</v></row>
                        <!-- 2015-11-22 20:50:00 CST / 1448247000 --> <row><v>6.6537381533e+08</v></row>
                        <!-- 2015-11-22 20:55:00 CST / 1448247300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 21:00:00 CST / 1448247600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 21:05:00 CST / 1448247900 --> <row><v>6.5973095039e+08</v></row>
                        <!-- 2015-11-22 21:10:00 CST / 1448248200 --> <row><v>6.5973095039e+08</v></row>
                        <!-- 2015-11-22 21:15:00 CST / 1448248500 --> <row><v>6.7034096597e+08</v></row>
                        <!-- 2015-11-22 21:20:00 CST / 1448248800 --> <row><v>6.7034096597e+08</v></row>
                        <!-- 2015-11-22 21:25:00 CST / 1448249100 --> <row><v>6.4563801277e+08</v></row>
                        <!-- 2015-11-22 21:30:00 CST / 1448249400 --> <row><v>6.4563801277e+08</v></row>
                        <!-- 2015-11-22 21:35:00 CST / 1448249700 --> <row><v>6.6445986065e+08</v></row>
                        <!-- 2015-11-22 21:40:00 CST / 1448250000 --> <row><v>6.6445986065e+08</v></row>
                        <!-- 2015-11-22 21:45:00 CST / 1448250300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 21:50:00 CST / 1448250600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 21:55:00 CST / 1448250900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 22:00:00 CST / 1448251200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 22:05:00 CST / 1448251500 --> <row><v>6.7367661040e+08</v></row>
                        <!-- 2015-11-22 22:10:00 CST / 1448251800 --> <row><v>6.7367661040e+08</v></row>
                        <!-- 2015-11-22 22:15:00 CST / 1448252100 --> <row><v>6.7856703655e+08</v></row>
                        <!-- 2015-11-22 22:20:00 CST / 1448252400 --> <row><v>6.7856703655e+08</v></row>
                        <!-- 2015-11-22 22:25:00 CST / 1448252700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 22:30:00 CST / 1448253000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 22:35:00 CST / 1448253300 --> <row><v>6.6519015902e+08</v></row>
                        <!-- 2015-11-22 22:40:00 CST / 1448253600 --> <row><v>6.6519015902e+08</v></row>
                        <!-- 2015-11-22 22:45:00 CST / 1448253900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 22:50:00 CST / 1448254200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 22:55:00 CST / 1448254500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 23:00:00 CST / 1448254800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 23:05:00 CST / 1448255100 --> <row><v>6.6057397165e+08</v></row>
                        <!-- 2015-11-22 23:10:00 CST / 1448255400 --> <row><v>6.6057397165e+08</v></row>
                        <!-- 2015-11-22 23:15:00 CST / 1448255700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 23:20:00 CST / 1448256000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 23:25:00 CST / 1448256300 --> <row><v>6.4566173629e+08</v></row>
                        <!-- 2015-11-22 23:30:00 CST / 1448256600 --> <row><v>6.4566173629e+08</v></row>
                        <!-- 2015-11-22 23:35:00 CST / 1448256900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 23:40:00 CST / 1448257200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-22 23:45:00 CST / 1448257500 --> <row><v>6.3134383547e+08</v></row>
                        <!-- 2015-11-22 23:50:00 CST / 1448257800 --> <row><v>6.3134383547e+08</v></row>
                        <!-- 2015-11-22 23:55:00 CST / 1448258100 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 00:00:00 CST / 1448258400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 00:05:00 CST / 1448258700 --> <row><v>6.3679419415e+08</v></row>
                        <!-- 2015-11-23 00:10:00 CST / 1448259000 --> <row><v>6.3679419415e+08</v></row>
                        <!-- 2015-11-23 00:15:00 CST / 1448259300 --> <row><v>6.2841163784e+08</v></row>
                        <!-- 2015-11-23 00:20:00 CST / 1448259600 --> <row><v>6.2841163784e+08</v></row>
                        <!-- 2015-11-23 00:25:00 CST / 1448259900 --> <row><v>6.3715231016e+08</v></row>
                        <!-- 2015-11-23 00:30:00 CST / 1448260200 --> <row><v>6.3715231016e+08</v></row>
                        <!-- 2015-11-23 00:35:00 CST / 1448260500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 00:40:00 CST / 1448260800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 00:45:00 CST / 1448261100 --> <row><v>6.2886815201e+08</v></row>
                        <!-- 2015-11-23 00:50:00 CST / 1448261400 --> <row><v>6.2886815201e+08</v></row>
                        <!-- 2015-11-23 00:55:00 CST / 1448261700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 01:00:00 CST / 1448262000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 01:05:00 CST / 1448262300 --> <row><v>6.3725693088e+08</v></row>
                        <!-- 2015-11-23 01:10:00 CST / 1448262600 --> <row><v>6.3725693088e+08</v></row>
                        <!-- 2015-11-23 01:15:00 CST / 1448262900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 01:20:00 CST / 1448263200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 01:25:00 CST / 1448263500 --> <row><v>6.5940667380e+08</v></row>
                        <!-- 2015-11-23 01:30:00 CST / 1448263800 --> <row><v>6.5940667380e+08</v></row>
                        <!-- 2015-11-23 01:35:00 CST / 1448264100 --> <row><v>6.4603115243e+08</v></row>
                        <!-- 2015-11-23 01:40:00 CST / 1448264400 --> <row><v>6.4603115243e+08</v></row>
                        <!-- 2015-11-23 01:45:00 CST / 1448264700 --> <row><v>6.4858750788e+08</v></row>
                        <!-- 2015-11-23 01:50:00 CST / 1448265000 --> <row><v>6.4858750788e+08</v></row>
                        <!-- 2015-11-23 01:55:00 CST / 1448265300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 02:00:00 CST / 1448265600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 02:05:00 CST / 1448265900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 02:10:00 CST / 1448266200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 02:15:00 CST / 1448266500 --> <row><v>6.6468494905e+08</v></row>
                        <!-- 2015-11-23 02:20:00 CST / 1448266800 --> <row><v>6.6468494905e+08</v></row>
                        <!-- 2015-11-23 02:25:00 CST / 1448267100 --> <row><v>6.6584712883e+08</v></row>
                        <!-- 2015-11-23 02:30:00 CST / 1448267400 --> <row><v>6.6584712883e+08</v></row>
                        <!-- 2015-11-23 02:35:00 CST / 1448267700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 02:40:00 CST / 1448268000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 02:45:00 CST / 1448268300 --> <row><v>6.4627111715e+08</v></row>
                        <!-- 2015-11-23 02:50:00 CST / 1448268600 --> <row><v>6.4627111715e+08</v></row>
                        <!-- 2015-11-23 02:55:00 CST / 1448268900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 03:00:00 CST / 1448269200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 03:05:00 CST / 1448269500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 03:10:00 CST / 1448269800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 03:15:00 CST / 1448270100 --> <row><v>6.6895885428e+08</v></row>
                        <!-- 2015-11-23 03:20:00 CST / 1448270400 --> <row><v>6.6895885428e+08</v></row>
                        <!-- 2015-11-23 03:25:00 CST / 1448270700 --> <row><v>6.9844242568e+08</v></row>
                        <!-- 2015-11-23 03:30:00 CST / 1448271000 --> <row><v>6.9844242568e+08</v></row>
                        <!-- 2015-11-23 03:35:00 CST / 1448271300 --> <row><v>6.6985802573e+08</v></row>
                        <!-- 2015-11-23 03:40:00 CST / 1448271600 --> <row><v>6.6985802573e+08</v></row>
                        <!-- 2015-11-23 03:45:00 CST / 1448271900 --> <row><v>6.4003213344e+08</v></row>
                        <!-- 2015-11-23 03:50:00 CST / 1448272200 --> <row><v>6.4003213344e+08</v></row>
                        <!-- 2015-11-23 03:55:00 CST / 1448272500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 04:00:00 CST / 1448272800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 04:05:00 CST / 1448273100 --> <row><v>6.1924025743e+08</v></row>
                        <!-- 2015-11-23 04:10:00 CST / 1448273400 --> <row><v>6.1924025743e+08</v></row>
                        <!-- 2015-11-23 04:15:00 CST / 1448273700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 04:20:00 CST / 1448274000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 04:25:00 CST / 1448274300 --> <row><v>6.2878116300e+08</v></row>
                        <!-- 2015-11-23 04:30:00 CST / 1448274600 --> <row><v>6.2878116300e+08</v></row>
                        <!-- 2015-11-23 04:35:00 CST / 1448274900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 04:40:00 CST / 1448275200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 04:45:00 CST / 1448275500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 04:50:00 CST / 1448275800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 04:55:00 CST / 1448276100 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 05:00:00 CST / 1448276400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 05:05:00 CST / 1448276700 --> <row><v>6.5874005187e+08</v></row>
                        <!-- 2015-11-23 05:10:00 CST / 1448277000 --> <row><v>6.5874005187e+08</v></row>
                        <!-- 2015-11-23 05:15:00 CST / 1448277300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 05:20:00 CST / 1448277600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 05:25:00 CST / 1448277900 --> <row><v>6.4051823982e+08</v></row>
                        <!-- 2015-11-23 05:30:00 CST / 1448278200 --> <row><v>6.4051823982e+08</v></row>
                        <!-- 2015-11-23 05:35:00 CST / 1448278500 --> <row><v>6.3794207999e+08</v></row>
                        <!-- 2015-11-23 05:40:00 CST / 1448278800 --> <row><v>6.3794207999e+08</v></row>
                        <!-- 2015-11-23 05:45:00 CST / 1448279100 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 05:50:00 CST / 1448279400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 05:55:00 CST / 1448279700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 06:00:00 CST / 1448280000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 06:05:00 CST / 1448280300 --> <row><v>6.3918926459e+08</v></row>
                        <!-- 2015-11-23 06:10:00 CST / 1448280600 --> <row><v>6.3918926459e+08</v></row>
                        <!-- 2015-11-23 06:15:00 CST / 1448280900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 06:20:00 CST / 1448281200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 06:25:00 CST / 1448281500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 06:30:00 CST / 1448281800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 06:35:00 CST / 1448282100 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 06:40:00 CST / 1448282400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 06:45:00 CST / 1448282700 --> <row><v>6.6942217670e+08</v></row>
                        <!-- 2015-11-23 06:50:00 CST / 1448283000 --> <row><v>6.6942217670e+08</v></row>
                        <!-- 2015-11-23 06:55:00 CST / 1448283300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 07:00:00 CST / 1448283600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 07:05:00 CST / 1448283900 --> <row><v>7.0371367837e+08</v></row>
                        <!-- 2015-11-23 07:10:00 CST / 1448284200 --> <row><v>7.0371367837e+08</v></row>
                        <!-- 2015-11-23 07:15:00 CST / 1448284500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 07:20:00 CST / 1448284800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 07:25:00 CST / 1448285100 --> <row><v>7.9030265691e+08</v></row>
                        <!-- 2015-11-23 07:30:00 CST / 1448285400 --> <row><v>7.9030265691e+08</v></row>
                        <!-- 2015-11-23 07:35:00 CST / 1448285700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 07:40:00 CST / 1448286000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 07:45:00 CST / 1448286300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 07:50:00 CST / 1448286600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 07:55:00 CST / 1448286900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 08:00:00 CST / 1448287200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 08:05:00 CST / 1448287500 --> <row><v>9.1698685988e+08</v></row>
                        <!-- 2015-11-23 08:10:00 CST / 1448287800 --> <row><v>9.1698685988e+08</v></row>
                        <!-- 2015-11-23 08:15:00 CST / 1448288100 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 08:20:00 CST / 1448288400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 08:25:00 CST / 1448288700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 08:30:00 CST / 1448289000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 08:35:00 CST / 1448289300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 08:40:00 CST / 1448289600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 08:45:00 CST / 1448289900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 08:50:00 CST / 1448290200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 08:55:00 CST / 1448290500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 09:00:00 CST / 1448290800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 09:05:00 CST / 1448291100 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 09:10:00 CST / 1448291400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 09:15:00 CST / 1448291700 --> <row><v>1.1233309863e+09</v></row>
                        <!-- 2015-11-23 09:20:00 CST / 1448292000 --> <row><v>1.1233309863e+09</v></row>
                        <!-- 2015-11-23 09:25:00 CST / 1448292300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 09:30:00 CST / 1448292600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 09:35:00 CST / 1448292900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 09:40:00 CST / 1448293200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 09:45:00 CST / 1448293500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 09:50:00 CST / 1448293800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 09:55:00 CST / 1448294100 --> <row><v>1.1899182083e+09</v></row>
                        <!-- 2015-11-23 10:00:00 CST / 1448294400 --> <row><v>1.1899182083e+09</v></row>
                        <!-- 2015-11-23 10:05:00 CST / 1448294700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 10:10:00 CST / 1448295000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 10:15:00 CST / 1448295300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 10:20:00 CST / 1448295600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 10:25:00 CST / 1448295900 --> <row><v>1.2130411013e+09</v></row>
                        <!-- 2015-11-23 10:30:00 CST / 1448296200 --> <row><v>1.2130411013e+09</v></row>
                        <!-- 2015-11-23 10:35:00 CST / 1448296500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 10:40:00 CST / 1448296800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 10:45:00 CST / 1448297100 --> <row><v>1.2133243807e+09</v></row>
                        <!-- 2015-11-23 10:50:00 CST / 1448297400 --> <row><v>1.2133243807e+09</v></row>
                        <!-- 2015-11-23 10:55:00 CST / 1448297700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 11:00:00 CST / 1448298000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 11:05:00 CST / 1448298300 --> <row><v>1.2428926358e+09</v></row>
                        <!-- 2015-11-23 11:10:00 CST / 1448298600 --> <row><v>1.2428926358e+09</v></row>
                        <!-- 2015-11-23 11:15:00 CST / 1448298900 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 11:20:00 CST / 1448299200 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 11:25:00 CST / 1448299500 --> <row><v>1.2792250385e+09</v></row>
                        <!-- 2015-11-23 11:30:00 CST / 1448299800 --> <row><v>1.2792250385e+09</v></row>
                        <!-- 2015-11-23 11:35:00 CST / 1448300100 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 11:40:00 CST / 1448300400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 11:45:00 CST / 1448300700 --> <row><v>1.2212509833e+09</v></row>
                        <!-- 2015-11-23 11:50:00 CST / 1448301000 --> <row><v>1.2212509833e+09</v></row>
                        <!-- 2015-11-23 11:55:00 CST / 1448301300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 12:00:00 CST / 1448301600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 12:05:00 CST / 1448301900 --> <row><v>1.5508321072e+09</v></row>
                        <!-- 2015-11-23 12:10:00 CST / 1448302200 --> <row><v>1.5508321072e+09</v></row>
                        <!-- 2015-11-23 12:15:00 CST / 1448302500 --> <row><v>1.5597536453e+09</v></row>
                        <!-- 2015-11-23 12:20:00 CST / 1448302800 --> <row><v>1.5597536453e+09</v></row>
                        <!-- 2015-11-23 12:25:00 CST / 1448303100 --> <row><v>1.5817318788e+09</v></row>
                        <!-- 2015-11-23 12:30:00 CST / 1448303400 --> <row><v>1.5817318788e+09</v></row>
                        <!-- 2015-11-23 12:35:00 CST / 1448303700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 12:40:00 CST / 1448304000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 12:45:00 CST / 1448304300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 12:50:00 CST / 1448304600 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 12:55:00 CST / 1448304900 --> <row><v>1.6016599606e+09</v></row>
                        <!-- 2015-11-23 13:00:00 CST / 1448305200 --> <row><v>1.6016599606e+09</v></row>
                        <!-- 2015-11-23 13:05:00 CST / 1448305500 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 13:10:00 CST / 1448305800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 13:15:00 CST / 1448306100 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 13:20:00 CST / 1448306400 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 13:25:00 CST / 1448306700 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 13:30:00 CST / 1448307000 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 13:35:00 CST / 1448307300 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 13:40:00 CST / 1448307600 --> <row><v>1.8461540040e+09</v></row>
                        <!-- 2015-11-23 13:45:00 CST / 1448307900 --> <row><v>1.8461540040e+09</v></row>
                        <!-- 2015-11-23 13:50:00 CST / 1448308200 --> <row><v>1.6305250284e+09</v></row>
                        <!-- 2015-11-23 13:55:00 CST / 1448308500 --> <row><v>1.6305250284e+09</v></row>
                        <!-- 2015-11-23 14:00:00 CST / 1448308800 --> <row><v>NaN</v></row>
                        <!-- 2015-11-23 14:05:00 CST / 1448309100 --> <row><v>NaN</v></row>

Is there a difference between how the realtime tables collect and display data and how the last day view does? Oddly, the last day view for traffic is solid, as is the Realtime Top Application Traffic. It seems like there is a constant flow of information coming into ntopng.
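To put a number on the gaps in a dump like the one above, a short script can help. This is only a sketch: it assumes each row has the shape shown in the dump (a comment ending in the epoch timestamp, then a single `<v>` value), and the function names `parse_rows` and `gap_runs` are made up for illustration.

```python
import math
import re

# One row of the dump: "<!-- <date> / <epoch> --> <row><v><value></v></row>".
# Assumption: a single data source per row, as in the dump above.
ROW_RE = re.compile(r"<!--.*?/\s*(\d+)\s*-->\s*<row><v>([^<]+)</v>")


def parse_rows(dump_text):
    """Return (epoch, value) tuples; NaN rows are parsed as float('nan')."""
    rows = []
    for line in dump_text.splitlines():
        match = ROW_RE.search(line)
        if match:
            rows.append((int(match.group(1)), float(match.group(2))))
    return rows


def gap_runs(rows):
    """Return (start_epoch, length) for every run of consecutive NaN rows."""
    runs = []
    start, length = None, 0
    for epoch, value in rows:
        if math.isnan(value):
            if start is None:
                start = epoch
            length += 1
        else:
            if start is not None:
                runs.append((start, length))
            start, length = None, 0
    if start is not None:
        runs.append((start, length))
    return runs


# Six rows copied from the dump above.
SAMPLE = """
<!-- 2015-11-22 20:15:00 CST / 1448244900 --> <row><v>NaN</v></row>
<!-- 2015-11-22 20:20:00 CST / 1448245200 --> <row><v>NaN</v></row>
<!-- 2015-11-22 20:25:00 CST / 1448245500 --> <row><v>6.8101330761e+08</v></row>
<!-- 2015-11-22 20:30:00 CST / 1448245800 --> <row><v>6.8101330761e+08</v></row>
<!-- 2015-11-22 20:35:00 CST / 1448246100 --> <row><v>NaN</v></row>
<!-- 2015-11-22 20:40:00 CST / 1448246400 --> <row><v>NaN</v></row>
"""

rows = parse_rows(SAMPLE)
nan_count = sum(1 for _, value in rows if math.isnan(value))
print(f"{nan_count} of {len(rows)} rows are NaN")
print(gap_runs(rows))
```

Running it over the full dump would show how long each dropout lasts and whether the gaps line up with the 5-minute steps.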

ntopng dashboard

Thanks.

Dan.

dboehlke commented 8 years ago

I had been "tuning" sflow polling and sampling on my Juniper switches. My router config backups say that I went to 30 seconds for polling the interface statistics. I am going to try every 20 seconds. When I first set up ntopng I was polling every second, but the sampling auto-adjustment would increase the number of packets between samples to astronomical numbers, so that it was exporting a statistic once in a blue moon or so on the busy interfaces. :-(

simonemainardi commented 8 years ago

Hi @dboehlke. Polling on your Juniper EX switches only regulates how often the sFlow data is sent to the collector.

The same switches implement a binary backoff algorithm that adapts the sampling for each interface. Basically, the algorithm periodically ranks interfaces by the number of samples they produce, then halves the sampling load of the top interfaces and allocates it to interfaces that have a lower sampling rate.

That said, I am reasonably confident that your Juniper interfaces end up with a sampling interval so large that samples are not guaranteed during every polling cycle.
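A toy model of that kind of backoff may make the effect easier to see. It is a sketch under stated assumptions and not Juniper's actual implementation: the sample budget, the interface names, and the packet rates are all invented, and the real algorithm also redistributes load to quieter interfaces, which this model skips.

```python
def backoff(rates, pps, budget):
    """Double the 1-in-N sampling rate of the busiest interface until the
    total number of samples per second fits within the budget.

    rates:  interface -> N (one packet sampled out of every N); hypothetical
    pps:    interface -> packets per second; hypothetical
    budget: maximum samples per second the switch will export; hypothetical
    """
    rates = dict(rates)

    def samples_per_second(name):
        return pps[name] / rates[name]

    while sum(samples_per_second(name) for name in rates) > budget:
        busiest = max(rates, key=samples_per_second)
        rates[busiest] *= 2  # halves the samples taken on that interface
    return rates


def expected_samples(packets_per_second, rate, interval_s):
    """Expected number of samples in one interval for 1-in-`rate` sampling."""
    return packets_per_second * interval_s / rate


# Invented numbers: a busy uplink and a quiet access port, both starting at 1-in-100.
configured = {"uplink": 100, "access": 100}
traffic = {"uplink": 200_000, "access": 500}
adapted = backoff(configured, traffic, budget=300)
print(adapted)

# A low-volume application sending 2 packets/s across the uplink:
print(expected_samples(2, adapted["uplink"], interval_s=20))
```

With these made-up numbers the uplink backs off from 1-in-100 to 1-in-800, and a low-volume application crossing it is expected to produce well under one sample per 20-second interval, so some intervals have nothing to report.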

It is not possible to manually change the adaptive sampling. Only a reboot should reset the backoff algorithm.

Do you think we can consider the issue solved -- that is, not related to ntopng?

dboehlke commented 8 years ago

I still don't understand why the "Network Interfaces: Last Day View" is not striped, but "Top Application Traffic Last Day View" is. If the switches are not sending samples during every polling cycle, then should not both be striped?

Where is the rrd file for the "Last Day View" kept?

I agree with you on the sflow sampling. The binary backoff has been a pain: the switch will run for many hours with the sampling rate I set and then suddenly back off. I did discover that I can reset the backoff algorithm using the CLI command "restart sflow-service all-members".

I hope to move the point where I am collecting the sflow samples to switches closer to the servers. I am collecting on the core switch for the data center now, and there may be too much traffic there. I can't move the sflow collection now because the top-of-rack switches are layer 2 devices and have only their out-of-band management connection to the network. I will need to deploy additional network connections and an nProbe on that network in order to collect flows there.

My only other thought would be to deploy nProbe on the individual servers or use taps between layers in the data center.

Dan B.

dboehlke commented 8 years ago

It looks like I need to take this up with Juniper. Something about my data center traffic seems to cause JunOS' Adaptive Sampling to back off the sample rates until they are not sampling at a useful level.

To verify this, I set up IPT-netflow on one of the servers in the data center and exported NetFlow V9 stats from that server to another instance of nProbe, which supplied the data to ntopng by 0MQ. The NetFlow V9 data from the host is unsampled, and it does not display the striping issues shown by the sFlow-fed nProbes.

It is also clear that the sampled sFlow data is reporting traffic levels much lower than NetFlow from the server. The sFlow data missed a possible scan attack reported by the NetFlow data.

I would still like to know where the rrd file is kept for the Last Day View and why it is not striped too.

It looks like it is safe to close this, if you can answer the two questions above for me.

Many Thanks.

simonemainardi commented 8 years ago

Hi @dboehlke. Thanks for giving all these details. I think you are facing very interesting problems in your datacenter. I agree with you, it looks like adaptive sampling is failing to provide meaningful data. The verification via IPT-netflow should suffice to confirm. You should talk to Juniper. Please write back to us when you have any updates. The rest of the ntop team and I would be glad to help.

The following is to answer your questions.

There is just one RRD for every metric; there is no special RRD for the "Last Day View". Each RRD periodically squashes/deletes older data to make room for newer data. You can configure RRDs via ntopng preferences.

The fact that last day statistics are striped for applications but not for interfaces depends on the ntopng sampling rate. Indeed, applications are sampled every 5 minutes, whereas interfaces are sampled every second. When we chart last day statistics, in order to reduce the number of data points shown, we make some aggregations that average out multiple points into a single one. I guess that since the number of data points available for interfaces is much greater, its aggregation is able to produce a continuous plot with non-zero points. However, this does not hold true for applications.
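The aggregation effect can be illustrated with a small sketch. It is not ntopng's charting code: `consolidate` is a hypothetical helper that averages the known values in each bucket and yields NaN only when a bucket has no known value at all, and the series below are invented.

```python
import math

NAN = math.nan


def consolidate(points, bucket):
    """Average each group of `bucket` points into one, ignoring NaN values.

    A bucket is NaN only if every point in it is NaN (an assumption made
    for this sketch).
    """
    out = []
    for i in range(0, len(points), bucket):
        known = [p for p in points[i:i + bucket] if not math.isnan(p)]
        out.append(sum(known) / len(known) if known else NAN)
    return out


# Dense series (many points per charted bucket, as for interfaces):
# scattered gaps disappear once four points are averaged into one.
dense = [2.0, NAN, 2.0, 2.0, NAN, NAN, 2.0, 2.0, NAN, 2.0, NAN, 2.0]
print(consolidate(dense, bucket=4))

# Sparse series (one point per charted bucket, as for applications):
# there is nothing to average with, so every gap stays visible.
sparse = [NAN, 5.0, NAN]
print(consolidate(sparse, bucket=1))
```

The dense series comes out continuous even though nearly half of its raw points are missing, while the sparse series keeps every gap, which matches the solid interface chart next to the striped application chart.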

Simone

simonemainardi commented 8 years ago

@dboehlke I think it is safe to close this now.

dboehlke commented 8 years ago

After much frustration, and even re-installation not fixing this, it appears to have been fixed in ntopng Professional v.2.3.151210. Still testing to be sure. Thanks to whoever is responsible.

Dan B.

dboehlke commented 8 years ago

I was wrong. There was an extended period without dropouts, but it has been back to not working since then. Sorry.

lucaderi commented 8 years ago

@dboehlke Can you please check if the patch we committed worked? (packages are being rebuilt)