Juniper / open-nti

Open Network Telemetry Collector built with open source tools
Apache License 2.0

Fluentd error for data from QFX5100. Not inserted into InfluxDB #237

Closed by cbutera-sqsp 6 years ago

cbutera-sqsp commented 6 years ago

On the newest version of open-nti, data from QFX5100 and EX4600 switches is no longer making it to InfluxDB. This was previously working.

I see there have been some changes to the database entries for the MX routers. I have since updated the queries, and they now display data in Grafana successfully. With the switches, however, the data is not making it to the database, and I see the following errors on the container running fluentd.

    [root@server001 ~]# docker logs 6f98b2b04575
    2018-08-20 19:05:46 +0000 fluent.error: {"error":"#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x0000564060023a18>>","error_class":"NoMethodError","host":"192.168.1.1","message":"\"{\\"record-type\\":\\"traffic-stats\\",\\"time\\":1534791945994800,\\"router-id\\":\\"LEAF_SWITCH\\",\\"port\\":\\"et-0/0/27\\",\\"rxpkt\\":102744004887,\\"rxucpkt\\":102743748439,\\"rxmcpkt\\":251802,\\"rxbcpkt\\":4646,\\"rxpps\\":47761,\\"rxbyte\\":121955618750866,\\"rxbps\\":3628676224,\\"rxdroppkt\\":0,\\"rxcrcerr\\":0,\\"txpkt\\":63145169874,\\"txucpkt\\":63144913173,\\"txmcpkt\\":251742,\\"txbcpkt\\":4959,\\"txpps\\":25040,\\"txbyte\\":65755973673282,\\"txbps\\":1571957504,\\"txdroppkt\\":0,\\"txcrcerr\\":0}\n\" error=#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x0000564060023a18>> error_class=NoMethodError host=\"192.168.1.1\""}
    2018-08-20 19:05:46 +0000 [error]: "{\"record-type\":\"traffic-stats\",\"time\":1534791946230995,\"router-id\":\"HOSTNAME\",\"port\":\"et-1/0/27\",\"rxpkt\":103713156408,\"rxucpkt\":103712899925,\"rxmcpkt\":251740,\"rxbcpkt\":4743,\"rxpps\":43501,\"rxbyte\":123336416026615,\"rxbps\":3268278464,\"rxdroppkt\":0,\"rxcrcerr\":0,\"txpkt\":65497272558,\"txucpkt\":65497015859,\"txmcpkt\":251776,\"txbcpkt\":4923,\"txpps\":14437,\"txbyte\":68285164820513,\"txbps\":647824640,\"txdroppkt\":0,\"txcrcerr\":0}\n" error=#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x0000564060023a18>> error_class=NoMethodError host="192.168.1.1"
    2018-08-20 19:05:46 +0000 [error]: suppressed same stacktrace
    2018-08-20 19:05:46 +0000 fluent.error: {"error":"#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x0000564060023a18>>","error_class":"NoMethodError","host":"192.168.1.1","message":"\"{\\\"record-type\\\":\\\"traffic-stats\\\",\\\"time\\\":1534791946230995,\\\"router-id\\\":\\\"HOSTNAME\\\",\\\"port\\\":\\\"et-1/0/27\\\",\\\"rxpkt\\\":103713156408,\\\"rxucpkt\\\":103712899925,\\\"rxmcpkt\\\":251740,\\\"rxbcpkt\\\":4743,\\\"rxpps\\\":43501,\\\"rxbyte\\\":123336416026615,\\\"rxbps\\\":3268278464,\\\"rxdroppkt\\\":0,\\\"rxcrcerr\\\":0,\\\"txpkt\\\":65497272558,\\\"txucpkt\\\":65497015859,\\\"txmcpkt\\\":251776,\\\"txbcpkt\\\":4923,\\\"txpps\\\":14437,\\\"txbyte\\\":68285164820513,\\\"txbps\\\":647824640,\\\"txdroppkt\\\":0,\\\"txcrcerr\\\":0}\\n\" error=#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x0000564060023a18>> error_class=NoMethodError host=\"192.168.1.1\""}

3fr61n commented 6 years ago

Hi,

Which Junos version are you using on both switches? Could you try another (older or newer) version?

Regards

cbutera-sqsp commented 6 years ago

I am running the Juniper-recommended version, Junos 14.1X53-D47: https://kb.juniper.net/InfoCenter/index?page=content&id=KB21476&actp=METADATA

I cannot change versions, as my organization always runs the recommended versions of code. Please note this was previously working.

Are there any troubleshooting tips you could give me as to why fluentd would not be inserting this information into the database properly?

3fr61n commented 6 years ago

Hi

1. Could you please attach the switch configuration?
2. Isolation tests: which switch is the one with the problem? Try repeating your tests with only one switch sending telemetry data at a time.
3. Which sensors are you enabling? Try enabling one sensor at a time to see which one generates the error.

Regards

cbutera-sqsp commented 6 years ago

  1. Switch configuration (.set):

    set services analytics export-profiles default_export_profile stream-format json
    set services analytics export-profiles default_export_profile interface information
    set services analytics export-profiles default_export_profile interface statistics traffic
    set services analytics export-profiles default_export_profile interface statistics queue
    set services analytics export-profiles default_export_profile interface status link
    set services analytics export-profiles default_export_profile system information
    set services analytics export-profiles default_export_profile system status traffic
    set services analytics export-profiles default_export_profile system status queue
    set services analytics resource-profiles default_resource_profile queue-monitoring
    set services analytics resource-profiles default_resource_profile traffic-monitoring
    set services analytics resource-profiles default_resource_profile latency-threshold high 2300
    set services analytics resource-profiles default_resource_profile latency-threshold low 20
    set services analytics resource system polling-interval traffic-monitoring 2
    set services analytics resource system polling-interval queue-monitoring 1000
    set services analytics resource interfaces et-0/0/24 resource-profile default_resource_profile
    set services analytics resource interfaces et-1/0/24 resource-profile default_resource_profile
    set services analytics collector address 10.119.30.72 port 50020 transport udp export-profile default_export_profile

  2. I sent telemetry data from two switches, one at a time (one QFX5100 and one EX4600). Both had the same result: fluentd errors, and no data placed in the database.

  3. The switch configuration is a bit different from the routers, where you enable specific sensors. Any suggestions, based on the configuration, on what I should be enabling or disabling?

Any thoughts, @3fr61n?

It seems to me that a change made in fluentd is causing this to fail.
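For the isolation tests discussed above, one way to check what a single switch is exporting, independently of fluentd, is to read the raw UDP datagrams directly. The following is a minimal Python sketch, not part of open-nti: the helper name `receive_records` is hypothetical, and it assumes each datagram carries one JSON object with a `record-type` key, as in the error messages earlier in this thread.

```python
import json
import socket


def receive_records(sock: socket.socket, count: int, timeout: float = 5.0):
    """Read `count` datagrams from a bound UDP socket.

    Returns a list of (sender_ip, record_type) pairs. record_type is None
    when the payload is not a JSON object (hypothetical helper, for
    troubleshooting only).
    """
    sock.settimeout(timeout)
    seen = []
    for _ in range(count):
        data, (host, _port) = sock.recvfrom(65535)
        try:
            record = json.loads(data.decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            record = None
        record_type = record.get("record-type") if isinstance(record, dict) else None
        seen.append((host, record_type))
    return seen
```

To use it, bind a `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` to the collector port from the configuration (50020 in this thread) and pass it in. The fluentd container already listens on that port, so it would need to be stopped first, or the switch pointed at a different port for the duration of the test.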

psagrera commented 6 years ago

I've created a new branch called analytics. Could you please clone this new branch (git clone https://github.com/Juniper/open-nti.git -b analytics), run "make build" and "make start", and check whether you still see the above errors?

Regards

cbutera-sqsp commented 6 years ago

@psagrera Looks like I am still receiving the same error when using your branch. Please let me know if I can provide any info to assist in troubleshooting this. I will leave both servers running.

    2018-09-13 15:56:51 +0000 [error]: "{\"record-type\":\"traffic-stats\",\"time\":1536854210919407,\"router-id\":\"spine-switch001\",\"port\":\"et-0/0/19\",\"rxpkt\":516564922242,\"rxucpkt\":516564559177,\"rxmcpkt\":354723,\"rxbcpkt\":8342,\"rxpps\":74534,\"rxbyte\":365209380317972,\"rxbps\":3982606400,\"rxdroppkt\":0,\"rxcrcerr\":0,\"txpkt\":464290902342,\"txucpkt\":464290539018,\"txmcpkt\":354725,\"txbcpkt\":8599,\"txpps\":50151,\"txbyte\":318839240571217,\"txbps\":2154658432,\"txdroppkt\":0,\"txcrcerr\":0}\n" error=#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x000055fb7cb9ca30>> error_class=NoMethodError host="192.168.0.1"
    2018-09-13 15:56:51 +0000 [error]: suppressed same stacktrace
    2018-09-13 15:56:51 +0000 [error]: "{\"record-type\":\"traffic-stats\",\"time\":1536854210919407,\"router-id\":\"spine-switch001\",\"port\":\"et-0/0/22\",\"rxpkt\":1044380897564,\"rxucpkt\":1044371890127,\"rxmcpkt\":8995467,\"rxbcpkt\":11970,\"rxpps\":94717,\"rxbyte\":919624790958648,\"rxbps\":4218498304,\"rxdroppkt\":0,\"rxcrcerr\":0,\"txpkt\":1126462409795,\"txucpkt\":1126453429081,\"txmcpkt\":8968941,\"txbcpkt\":11773,\"txpps\":137769,\"txbyte\":1015154816429291,\"txbps\":7865434048,\"txdroppkt\":0,\"txcrcerr\":0}\n" error=#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x000055fb7cb9ca30>> error_class=NoMethodError host="192.168.0.1"
    2018-09-13 15:56:51 +0000 [error]: suppressed same stacktrace
    2018-09-13 15:56:51 +0000 [error]: "{\"record-type\":\"traffic-stats\",\"time\":1536854210919407,\"router-id\":\"spine-switch001\",\"port\":\"et-0/0/23\",\"rxpkt\":996808207117,\"rxucpkt\":996799193019,\"rxmcpkt\":9002210,\"rxbcpkt\":11888,\"rxpps\":117816,\"rxbyte\":863897511463712,\"rxbps\":6546907840,\"rxdroppkt\":0,\"rxcrcerr\":0,\"txpkt\":1183681161242,\"txucpkt\":1183672179314,\"txmcpkt\":8970092,\"txbcpkt\":11836,\"txpps\":228656,\"txbyte\":1034971286004046,\"txbps\":12829294016,\"txdroppkt\":0,\"txcrcerr\":0}\n" error=#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x000055fb7cb9ca30>> error_class=NoMethodError host="192.168.0.1"
    2018-09-13 15:56:51 +0000 [error]: suppressed same stacktrace
    2018-09-13 15:56:51 +0000 fluent.error: {"error":"#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x000055fb7cb9ca30>>","error_class":"NoMethodError","host":"192.168.0.1","message":"\"{\\\"record-type\\\":\\\"traffic-stats\\\",\\\"time\\\":1536854210919407,\\\"router-id\\\":\\\"spine-switch001\\\",\\\"port\\\":\\\"et-0/0/18\\\",\\\"rxpkt\\\":600196618536,\\\"rxucpkt\\\":600196258957,\\\"rxmcpkt\\\":355090,\\\"rxbcpkt\\\":4489,\\\"rxpps\\\":39802,\\\"rxbyte\\\":528828092485797,\\\"rxbps\\\":1452790720,\\\"rxdroppkt\\\":0,\\\"rxcrcerr\\\":10,\\\"txpkt\\\":669779912758,\\\"txucpkt\\\":669779553203,\\\"txmcpkt\\\":355084,\\\"txbcpkt\\\":4471,\\\"txpps\\\":61403,\\\"txbyte\\\":632170445990833,\\\"txbps\\\":3746685888,\\\"txdroppkt\\\":0,\\\"txcrcerr\\\":0}\\n\" error=#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x000055fb7cb9ca30>> error_class=NoMethodError host=\"192.168.0.1\""}
    2018-09-13 15:56:51 +0000 fluent.error: {"error":"#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x000055fb7cb9ca30>>","error_class":"NoMethodError","host":"192.168.0.1","message":"\"{\\\"record-type\\\":\\\"traffic-stats\\\",\\\"time\\\":1536854210919407,\\\"router-id\\\":\\\"spine-switch001\\\",\\\"port\\\":\\\"et-0/0/19\\\",\\\"rxpkt\\\":516564922242,\\\"rxucpkt\\\":516564559177,\\\"rxmcpkt\\\":354723,\\\"rxbcpkt\\\":8342,\\\"rxpps\\\":74534,\\\"rxbyte\\\":365209380317972,\\\"rxbps\\\":3982606400,\\\"rxdroppkt\\\":0,\\\"rxcrcerr\\\":0,\\\"txpkt\\\":464290902342,\\\"txucpkt\\\":464290539018,\\\"txmcpkt\\\":354725,\\\"txbcpkt\\\":8599,\\\"txpps\\\":50151,\\\"txbyte\\\":318839240571217,\\\"txbps\\\":2154658432,\\\"txdroppkt\\\":0,\\\"txcrcerr\\\":0}\\n\" error=#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x000055fb7cb9ca30>> error_class=NoMethodError host=\"192.168.0.1\""}
    2018-09-13 15:56:51 +0000 fluent.error: {"error":"#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x000055fb7cb9ca30>>","error_class":"NoMethodError","host":"192.168.0.1","message":"\"{\\\"record-type\\\":\\\"traffic-stats\\\",\\\"time\\\":1536854210919407,\\\"router-id\\\":\\\"spine-switch001\\\",\\\"port\\\":\\\"et-0/0/22\\\",\\\"rxpkt\\\":1044380897564,\\\"rxucpkt\\\":1044371890127,\\\"rxmcpkt\\\":8995467,\\\"rxbcpkt\\\":11970,\\\"rxpps\\\":94717,\\\"rxbyte\\\":919624790958648,\\\"rxbps\\\":4218498304,\\\"rxdroppkt\\\":0,\\\"rxcrcerr\\\":0,\\\"txpkt\\\":1126462409795,\\\"txucpkt\\\":1126453429081,\\\"txmcpkt\\\":8968941,\\\"txbcpkt\\\":11773,\\\"txpps\\\":137769,\\\"txbyte\\\":1015154816429291,\\\"txbps\\\":7865434048,\\\"txdroppkt\\\":0,\\\"txcrcerr\\\":0}\\n\" error=#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x000055fb7cb9ca30>> error_class=NoMethodError host=\"192.168.0.1\""}
    2018-09-13 15:56:51 +0000 fluent.error: {"error":"#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x000055fb7cb9ca30>>","error_class":"NoMethodError","host":"192.168.0.1","message":"\"{\\\"record-type\\\":\\\"traffic-stats\\\",\\\"time\\\":1536854210919407,\\\"router-id\\\":\\\"spine-switch001\\\",\\\"port\\\":\\\"et-0/0/23\\\",\\\"rxpkt\\\":996808207117,\\\"rxucpkt\\\":996799193019,\\\"rxmcpkt\\\":9002210,\\\"rxbcpkt\\\":11888,\\\"rxpps\\\":117816,\\\"rxbyte\\\":863897511463712,\\\"rxbps\\\":6546907840,\\\"rxdroppkt\\\":0,\\\"rxcrcerr\\\":0,\\\"txpkt\\\":1183681161242,\\\"txucpkt\\\":1183672179314,\\\"txmcpkt\\\":8970092,\\\"txbcpkt\\\":11836,\\\"txpps\\\":228656,\\\"txbyte\\\":1034971286004046,\\\"txbps\\\":12829294016,\\\"txdroppkt\\\":0,\\\"txcrcerr\\\":0}\\n\" error=#<NoMethodError: undefined method `build_record' for #<Fluent::TextParser::JuniperAnalyticsdParser:0x000055fb7cb9ca30>> error_class=NoMethodError host=\"192.168.0.1\""}

psagrera commented 6 years ago

Hi @nastyfast

Could you please pull the latest version of the analytics branch and try?

https://github.com/Juniper/open-nti/commit/de718680b82ab5075c4005c6d7746826c212616e

git clone https://github.com/Juniper/open-nti.git -b analytics + make build + make restart

(Remove old psagrera/fluent-jti:analytics-1.0 for saving space :) )

Pablo

cbutera-sqsp commented 6 years ago

@psagrera Still not working, but I am now receiving a different error message.

    2018-09-14 15:03:33 +0000 [info]: Connecting to database: juniper, host: opennti, port: 8086, username: juniper, use_ssl = false, verify_ssl = true
    2018-09-14 15:03:33 +0000 [info]: Connecting to database: juniper, host: opennti, port: 8086, username: juniper, use_ssl = false, verify_ssl = true
    2018-09-14 15:03:33 +0000 [info]: Connecting to database: juniper, host: opennti, port: 8086, username: juniper, use_ssl = false, verify_ssl = true
    2018-09-14 15:03:33 +0000 [info]: Connecting to database: juniper, host: opennti, port: 8086, username: juniper, use_ssl = false, verify_ssl = true
    2018-09-14 15:03:33 +0000 [info]: listening fluent socket on 0.0.0.0:24224
    2018-09-14 15:03:33 +0000 [info]: listening udp socket on 0.0.0.0:50000
    2018-09-14 15:03:33 +0000 [info]: listening udp socket on 0.0.0.0:50020
    2018-09-14 15:03:33 +0000 [info]: listening dRuby uri="druby://127.0.0.1:24230" object="Engine"
    2018-09-14 15:05:09 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:09 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:09 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:09 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:09 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:09 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:09 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:09 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:09 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:09 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:09 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:09 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:09 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:09 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:09 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:09 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:11 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:11 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:13 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:13 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:19 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:19 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:23 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:23 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:29 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:29 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:33 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:33 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
    2018-09-14 15:05:39 +0000 [warn]: no patterns matched tag="jnpr.analyticsd"
    2018-09-14 15:05:39 +0000 fluent.warn: {"tag":"jnpr.analyticsd","message":"no patterns matched tag=\"jnpr.analyticsd\""}
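The "no patterns matched" warning means fluentd received an event whose tag was not matched by any `<match>` pattern in the loaded configuration. As a rough aid for checking a pattern against the `jnpr.analyticsd` tag, here is a simplified Python model of the two wildcard forms (`*` for exactly one tag part, `**` for zero or more tag parts). It is a sketch only: the function names are hypothetical, and other pattern forms that fluentd supports, such as `{X,Y}` alternation, are deliberately left out.

```python
def tag_matches(pattern: str, tag: str) -> bool:
    """Simplified model of fluentd <match> patterns.

    Supports only literal parts, `*` (exactly one tag part) and
    `**` (zero or more tag parts). Hypothetical helper, not fluentd code.
    """
    return _match(pattern.split("."), tag.split("."))


def _match(pattern_parts, tag_parts) -> bool:
    if not pattern_parts:
        # Pattern exhausted: it matches only if the tag is exhausted too.
        return not tag_parts
    head, rest = pattern_parts[0], pattern_parts[1:]
    if head == "**":
        # Try consuming zero, one, two, ... tag parts.
        return any(_match(rest, tag_parts[i:]) for i in range(len(tag_parts) + 1))
    if not tag_parts:
        return False
    if head == "*" or head == tag_parts[0]:
        return _match(rest, tag_parts[1:])
    return False
```

Under this model, `tag_matches("jnpr.**", "jnpr.analyticsd")` is true, while a configuration that only contains patterns starting with `juniperNetworks` would leave the `jnpr.analyticsd` tag unmatched and produce the warning shown above.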

psagrera commented 6 years ago

Hi,

Could you pull the latest version of the analytics branch and try?

https://github.com/Juniper/open-nti/commit/01d5eca40014e0e2a5f1c65412202940c66d00d2

Stop everything (make stop), make build + make start

Regards

cbutera-sqsp commented 6 years ago

OK, making progress! @psagrera I am no longer receiving errors on the opennti_input_jti container. Unfortunately, it looks like this data is still not being inserted into the database.

    [root@blackhole-test001 open-nti]# docker ps
    CONTAINER ID   IMAGE                                     COMMAND                  CREATED          STATUS          PORTS   NAMES
    d68992643389   quay.io/influxdb/chronograf:1.5.0.1       "/usr/bin/chronograf…"   25 minutes ago   Up 25 minutes   0.0.0.0:8888->8888/tcp   chronograf_con
    c80e9fc2acd6   opennti_input-oc                          "/entrypoint.sh /sou…"   25 minutes ago   Up 25 minutes   8092/udp, 8125/udp, 8094/tcp, 0.0.0.0:50051->50051/udp   opennti_input_oc
    2bde0fc8cedf   opennti_input-snmp                        "/source/start-input…"   25 minutes ago   Up 25 minutes   0.0.0.0:162->162/udp   opennti_input_snmp
    d4fd0cb7dede   juniper/open-nti-input-syslog:analytics   "/bin/sh -c /home/fl…"   25 minutes ago   Up 25 minutes   5140/tcp, 24220/tcp, 24224/tcp, 0.0.0.0:6000->6000/udp   opennti_input_syslog
    b4c441da13f6   juniper/open-nti-input-jti:analytics      "/bin/sh -c /fluentd…"   25 minutes ago   Up 25 minutes   0.0.0.0:50000->50000/udp, 5000/udp, 24284/tcp, 0.0.0.0:50020->50020/udp   opennti_input_jti
    c6f4b4a9f44c   kapacitor:1.5.0                           "/entrypoint.sh kapa…"   25 minutes ago   Up 25 minutes   0.0.0.0:9092->9092/tcp   kapacitor
    c6327c564c44   juniper/open-nti:analytics                "/sbin/my_init"          25 minutes ago   Up 25 minutes   0.0.0.0:80->80/tcp, 0.0.0.0:3000->3000/tcp, 0.0.0.0:8083->8083/tcp, 0.0.0.0:8086->8086/tcp, 0.0.0.0:8125->8125/udp   opennti_con
    [root@blackhole-test001 open-nti]# docker logs opennti_input_jti
    2018-09-18 20:00:58 +0000 jnpr.analyticsd: {"device":"spine-switch001","interface":"et-0/0/13","type":"traffic-stats.rxbyte","value":543184462743513}
    2018-09-18 20:00:58 +0000 jnpr.analyticsd: {"device":"spine-switch001","interface":"et-0/0/13","type":"traffic-stats.rxbps","value":2771900480}
    2018-09-18 20:00:58 +0000 jnpr.analyticsd: {"device":"spine-switch001","interface":"et-0/0/13","type":"traffic-stats.rxdroppkt","value":0}
    2018-09-18 20:00:58 +0000 jnpr.analyticsd: {"device":"spine-switch001","interface":"et-0/0/13","type":"traffic-stats.rxcrcerr","value":0}
    2018-09-18 20:00:58 +0000 jnpr.analyticsd: {"device":"spine-switch001","interface":"et-0/0/13","type":"traffic-stats.txpkt","value":453551665892}
    2018-09-18 20:00:58 +0000 jnpr.analyticsd: {"device":"spine-switch001","interface":"et-0/0/13","type":"traffic-stats.txucpkt","value":453551290202}
    2018-09-18 20:00:58 +0000 jnpr.analyticsd: {"device":"spine-switch001","interface":"et-0/0/13","type":"traffic-stats.txmcpkt","value":370918}
    2018-09-18 20:00:58 +0000 jnpr.analyticsd: {"device":"spine-switch001","interface":"et-0/0/13","type":"traffic-stats.txbcpkt","value":4772}
    2018-09-18 20:00:58 +0000 jnpr.analyticsd: {"device":"spine-switch001","interface":"et-0/0/13","type":"traffic-stats.txpps","value":44105}
    2018-09-18 20:00:58 +0000 jnpr.analyticsd: {"device":"spine-switch001","interface":"et-0/0/13","type":"traffic-stats.txbyte","value":328120307611265}
    2018-09-18 20:00:58 +0000 jnpr.analyticsd: {"device":"spine-switch001","interface":"et-0/0/13","type":"traffic-stats.txbps","value":2453193472}
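The jti log output above shows one record per counter, each with `device`, `interface`, `type` and `value` keys, whereas the switch sends a single JSON object per port. The transformation appears to be roughly the following. This is a Python sketch for illustration, not the actual Ruby parser from open-nti: the function name `flatten_analyticsd` is hypothetical, and treating `time` as microseconds since the epoch is an assumption (it is consistent with the fluentd timestamps next to the payloads in this thread).

```python
import json
from datetime import datetime, timezone

# Keys that describe the record rather than carry a counter value.
META_KEYS = {"record-type", "time", "router-id", "port"}


def flatten_analyticsd(payload: str):
    """Split one analyticsd JSON payload into one record per counter.

    Returns (timestamp, records). Hypothetical helper; assumes `time`
    is expressed in microseconds since the Unix epoch.
    """
    data = json.loads(payload)
    seconds, micros = divmod(data["time"], 1_000_000)
    timestamp = datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=micros)
    records = [
        {
            "device": data["router-id"],
            "interface": data["port"],
            "type": f'{data["record-type"]}.{key}',
            "value": value,
        }
        for key, value in data.items()
        if key not in META_KEYS
    ]
    return timestamp, records
```

Feeding it a traffic-stats payload like the ones in the earlier error messages yields records such as `{"device": ..., "interface": ..., "type": "traffic-stats.rxpkt", "value": ...}`, which is the shape visible in the log lines above.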

Database troubleshooting:

    [root@blackhole-test001 open-nti]# docker exec -it opennti_con /bin/bash
    root@c6327c564c44:/# tail -f /var/log/influxdb.log
    ts=2018-09-18T20:28:14.239330Z lvl=info msg="Executing query" log_id=0Ac_1_EG000 service=query query="SHOW RETENTION POLICIES ON snmp"
    [httpd] 172.17.0.3 - - [18/Sep/2018:20:28:14 +0000] "POST /query?db=&q=SHOW+RETENTION+POLICIES+ON+snmp HTTP/1.1" 200 149 "-" "KapacitorInfluxDBClient" 5e51c2b1-bb81-11e8-8155-000000000000 720
    ts=2018-09-18T20:28:14.240445Z lvl=info msg="Executing query" log_id=0Ac_1_EG000 service=query query="SHOW RETENTION POLICIES ON juniper"
    [httpd] 172.17.0.3 - - [18/Sep/2018:20:28:14 +0000] "POST /query?db=&q=SHOW+RETENTION+POLICIES+ON+juniper HTTP/1.1" 200 173 "-" "KapacitorInfluxDBClient" 5e51ebed-bb81-11e8-8156-000000000000 309
    ts=2018-09-18T20:28:14.241027Z lvl=info msg="Executing query" log_id=0Ac_1_EG000 service=query query="SHOW RETENTION POLICIES ON _internal"
    [httpd] 172.17.0.3 - - [18/Sep/2018:20:28:14 +0000] "POST /query?db=&q=SHOW+RETENTION+POLICIES+ON+_internal HTTP/1.1" 200 153 "-" "KapacitorInfluxDBClient" 5e52045e-bb81-11e8-8157-000000000000 237
    ts=2018-09-18T20:28:14.241450Z lvl=info msg="Executing query" log_id=0Ac_1_EG000 service=query query="SHOW SUBSCRIPTIONS"
    [httpd] 172.17.0.3 - - [18/Sep/2018:20:28:14 +0000] "POST /query?db=&q=SHOW+SUBSCRIPTIONS HTTP/1.1" 200 241 "-" "KapacitorInfluxDBClient" 5e521620-bb81-11e8-8158-000000000000 227
    ts=2018-09-18T20:28:20.003004Z lvl=info msg="Post http://kapacitor_con:9092/write?consistency=&db=_internal&precision=ns&rp=monitor: dial tcp: lookup kapacitor_con on 10.194.1.253:53: no such host" log_id=0Ac_1_EG000 service=subscriber
    ts=2018-09-18T20:28:30.002529Z lvl=info msg="Post http://kapacitor_con:9092/write?consistency=&db=_internal&precision=ns&rp=monitor: dial tcp: lookup kapacitor_con on 10.194.3.254:53: no such host" log_id=0Ac_1_EG000 service=subscriber
    ts=2018-09-18T20:28:40.003153Z lvl=info msg="Post http://kapacitor_con:9092/write?consistency=&db=_internal&precision=ns&rp=monitor: dial tcp: lookup kapacitor_con on 10.194.3.254:53: no such host" log_id=0Ac_1_EG000 service=subscriber
    root@c6327c564c44:/# influx
    Connected to http://localhost:8086 version 1.5.1
    InfluxDB shell version: 1.5.1
    > show databases
    name: databases
    name
    ----
    snmp
    juniper
    _internal
    > use juniper
    Using database juniper
    > show measurements
    > show field keys

psagrera commented 6 years ago

I forgot to include the store section in the fluent.conf file.

Try again (same procedure: clone / build / start):

https://github.com/Juniper/open-nti/commit/e508c633992d21e40b5b4e1a793e7adc8a2fdaea

cbutera-sqsp commented 6 years ago

Data is now being inserted into the database successfully. The issue I am running into now is that, for some reason, the only two types of data in the database are traffic-stats.txcrcerr and queue-stats.queue-depth.

    root@53b75ea8d9ae:/# influx
    InfluxDB shell version: 1.5.1
    > show databases
    name: databases
    name
    ----
    snmp
    juniper
    _internal
    > use juniper
    Using database juniper
    > SELECT * FROM "juniper"."four_weeks"."jnpr.analyticsd" WHERE "device" = 'spine-sw001'
    name: jnpr.analyticsd
    time                device      interface type                    value
    ----                ------      --------- ----                    -----
    1537377634000000000 spine-sw001 et-0/0/1  queue-stats.queue-depth 416
    1537377635000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377637000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377639000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377641000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377642000000000 spine-sw001 et-0/0/13 queue-stats.queue-depth 624
    1537377643000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377645000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377647000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377649000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377651000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377653000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377655000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377657000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377659000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377661000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377663000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377664000000000 spine-sw001 et-0/0/22 queue-stats.queue-depth 624
    1537377665000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377666000000000 spine-sw001 et-0/0/12 queue-stats.queue-depth 208
    1537377668000000000 spine-sw001 et-0/0/19 queue-stats.queue-depth 208
    1537377670000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377672000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377673000000000 spine-sw001 et-0/0/18 queue-stats.queue-depth 208
    1537377674000000000 spine-sw001 et-0/0/6  queue-stats.queue-depth 208
    1537377676000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0
    1537377678000000000 spine-sw001 et-0/0/23 queue-stats.queue-depth 208
    1537377680000000000 spine-sw001 et-0/0/23 traffic-stats.txcrcerr  0

All other types are missing; these queries return nothing:

    > SELECT * FROM "juniper"."four_weeks"."jnpr.analyticsd" WHERE "type" = 'traffic-stats.rxbps'
    > SELECT * FROM "juniper"."four_weeks"."jnpr.analyticsd" WHERE "type" = 'traffic-stats.txbps'

@psagrera
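One possible explanation for seeing only a single `type` per timestamp, which this thread does not confirm, is InfluxDB's point identity rule: a point is identified by measurement, tag set and timestamp, and a later write with the same identity overwrites the field values of the earlier one. If `device`, `interface` and `type` were stored as fields rather than tags, every counter written in the same second would collapse into one point, leaving only the last one written. The Python sketch below models that rule with a plain dictionary; `write_points` is a hypothetical helper and does not talk to InfluxDB.

```python
def write_points(points, tag_keys):
    """Model InfluxDB's overwrite rule with a dictionary.

    A stored point is keyed by (measurement, tag set, timestamp). Writing
    another point with the same key merges its fields into the existing
    one, with the newer values winning. Hypothetical helper for illustration.
    """
    stored = {}
    for point in points:
        tags = tuple(sorted((key, point[key]) for key in tag_keys))
        identity = (point["measurement"], tags, point["time"])
        fields = {
            key: value
            for key, value in point.items()
            if key not in tag_keys and key not in ("measurement", "time")
        }
        stored.setdefault(identity, {}).update(fields)
    return stored
```

With no tag keys, three counters sharing one timestamp end up as a single stored point whose `type` is the last one written. With `device`, `interface` and `type` as tag keys, all three survive, which matches the shape of a result set where many types share the same timestamp.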

psagrera commented 6 years ago

Could you please share the jti container logs?

Regards

Pablo

psagrera commented 6 years ago

Could you give it a try with this new commit? https://github.com/Juniper/open-nti/commit/98a5ca0a4c25981a8cd68d4190617e8d820951c5

Clone / build / start (remove old images first, just in case). Use "psagrera/fluent-jti analytics-1.2".

Regards

cbutera-sqsp commented 6 years ago

Just tested with "psagrera/fluent-jti analytics-1.2" and I am getting the same results: only the queue-stats.queue-depth and traffic-stats.txcrcerr types are received. @psagrera

    > SELECT * FROM "juniper"."four_weeks"."jnpr.analyticsd"
    name: jnpr.analyticsd
    time                device      interface type                    value
    ----                ------      --------- ----                    -----
    1537803896000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803898000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803900000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803902000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803904000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803905000000000 spine-sw001 et-0/0/23 queue-stats.queue-depth 1248
    1537803906000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803908000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803910000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803912000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803913000000000 spine-sw001 et-0/0/6  queue-stats.queue-depth 1248
    1537803914000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803916000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803918000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803920000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803921000000000 spine-sw001 et-0/0/1  queue-stats.queue-depth 208
    1537803922000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803924000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803926000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803927000000000 spine-sw001 et-0/0/9  queue-stats.queue-depth 832
    1537803928000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803930000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803932000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803934000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803936000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803938000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803940000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803941000000000 spine-sw001 et-0/0/22 queue-stats.queue-depth 1664
    1537803942000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803944000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803946000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803948000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803950000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803953000000000 spine-sw001 et-0/0/12 traffic-stats.txcrcerr  0
    1537803955000000000 spine-sw001 et-0/0/5  queue-stats.queue-depth 1040

Below are the logs from the jti container; there are no errors.

    [root@blackhole-test001 open-nti]# docker logs opennti_input_jti
    2018-09-24 15:43:50 +0000 [info]: reading config file path="/tmp/fluent.conf"
    2018-09-24 15:43:50 +0000 [info]: starting fluentd-0.12.43
    2018-09-24 15:43:50 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.6.0'
    2018-09-24 15:43:50 +0000 [info]: gem 'fluent-plugin-udp-native-sensors' version '0.0.1'
    2018-09-24 15:43:50 +0000 [info]: gem 'fluentd' version '0.12.43'
    2018-09-24 15:43:50 +0000 [info]: gem 'fluentd' version '0.12.42'
    2018-09-24 15:43:50 +0000 [info]: adding match pattern="juniperNetworks" type="rewrite_tag_filter"
    2018-09-24 15:43:50 +0000 [warn]: rewrite_tag_filter: [DEPRECATED] Use <rule> section instead of rewriterule1
    2018-09-24 15:43:50 +0000 [info]: adding rewrite_tag_filter rule: rewriterule1 ["sensor_name", /(.+)/, "", "${tag}.$1"]
    2018-09-24 15:43:50 +0000 [info]: adding match pattern="jnpr.**" type="copy"
    2018-09-24 15:43:50 +0000 [info]: adding match pattern="juniperNetworks.cpu_memory_util_ext" type="copy"
    2018-09-24 15:43:50 +0000 [info]: adding match pattern="juniperNetworks.jnpr_packet_statistics_ext" type="copy"
    2018-09-24 15:43:50 +0000 [info]: adding match pattern="juniperNetworks.jnpr_lsp_statistics_ext" type="copy"
    2018-09-24 15:43:50 +0000 [info]: adding match pattern="juniperNetworks.**" type="copy"
    2018-09-24 15:43:50 +0000 [info]: adding match pattern="debug.**" type="stdout"
    2018-09-24 15:43:50 +0000 [info]: adding match pattern="fluent.**" type="stdout"
    2018-09-24 15:43:50 +0000 [info]: adding source type="forward"
    2018-09-24 15:43:50 +0000 [info]: adding source type="udp"
    2018-09-24 15:43:50 +0000 [warn]: 'body_size_limit' parameter is deprecated: use message_length_limit instead.
    2018-09-24 15:43:51 +0000 [info]: adding source type="udp"
    2018-09-24 15:43:51 +0000 [info]: adding source type="monitor_agent"
    2018-09-24 15:43:51 +0000 [info]: adding source type="debug_agent"
    2018-09-24 15:43:51 +0000 [info]: using configuration file:
    <source>
      @type forward
      @id forward_input
    </source>
    <source>
      @type udp
      tag juniperNetworks
      format juniper_udp_native
      port 50000
      bind 0.0.0.0
      body_size_limit 5000
    </source>
    <source>
      @type udp
      tag jnpr.analyticsd
      format juniper_analyticsd
      message_length_limit 5000
      remove_newline false
      port 50020
      bind 0.0.0.0
    </source>
    <match juniperNetworks>
      @type rewrite_tag_filter
      rewriterule1 sensor_name (.+) ${tag}.$1
    </match>
    <match jnpr.**>
      type copy
      <store>
        type influxdb
        host opennti
        port 8086
        dbname juniper
        user juniper
        password xxxxxx
        value_keys ["value"]
        buffer_type memory
        flush_interval 2
      </store>
    </match>
    <match juniperNetworks.cpu_memory_util_ext>
      type copy
      <store>
        type influxdb
        host opennti
        port 8086
        dbname juniper
        user juniper
        password xxxxxx
        time_precision ms
        tag_keys ["device","utilization.application_utilization.name","utilization.name"]
        tag_keys_field key_fields
        buffer_type memory
        flush_interval 2
      </store>
    </match>
    <match juniperNetworks.jnpr_packet_statistics_ext>
      type copy
      <store>
        type influxdb
        host opennti
        port 8086
        dbname juniper
        user juniper
        password xxxxxx
        time_precision ms
        tag_keys ["device","packet_stats.name"]
        tag_keys_field key_fields
        buffer_type memory
        flush_interval 2
      </store>
    </match>
    <match juniperNetworks.jnpr_lsp_statistics_ext>
      type copy
      <store>
        type influxdb
        host opennti
        port 8086
        dbname juniper
        user juniper
        password xxxxxx
        time_precision ms
        tag_keys ["device","lsp_stats_records.name"]
        tag_keys_field key_fields
        buffer_type memory
        flush_interval 2
      </store>
    </match>
    <match juniperNetworks.**>
      type copy
      <store>
        type influxdb
        host opennti
        port 8086
        dbname juniper
        user juniper
        password xxxxxx
        time_precision ms
        tag_keys ["device"]
        tag_keys_field key_fields
        buffer_type memory
        flush_interval 2
      </store>
    </match>
    <source>
      @type monitor_agent
      @id monitor_agent_input
      port 24220
    </source>
    <source>
      @type debug_agent
      @id debug_agent_input
      bind 127.0.0.1
      port 24230
    </source>
    <match debug.**>
      @type stdout
      @id stdout_output
    </match>
    <match fluent.**>
      @type stdout
    </match>
    2018-09-24 15:43:51 +0000 [warn]: parameter 'value_keys' in <store> type influxdb host opennti port 8086 dbname juniper user juniper password xxxxxx value_keys ["value"] buffer_type memory flush_interval 2 </store> is not used.
    2018-09-24 15:43:51 +0000 [info]: Connecting to database: juniper, host: opennti, port: 8086, username: juniper, use_ssl = false, verify_ssl = true
    2018-09-24 15:43:51 +0000 [info]: Connecting to database: juniper, host: opennti, port: 8086, username: juniper, use_ssl = false, verify_ssl = true
    2018-09-24 15:43:51 +0000 [info]: Connecting to database: juniper, host: opennti, port: 8086, username: juniper, use_ssl = false, verify_ssl = true
    2018-09-24 15:43:51 +0000 [info]: Connecting to database: juniper, host: opennti, port: 8086, username: juniper, use_ssl = false, verify_ssl = true
    2018-09-24 15:43:51 +0000 [info]: Connecting to database: juniper, host: opennti, port: 8086, username: juniper, use_ssl = false, verify_ssl = true
    2018-09-24 15:43:51 +0000 [info]: listening fluent socket on 0.0.0.0:24224
    2018-09-24 15:43:51 +0000 [info]: listening udp socket on 0.0.0.0:50000
    2018-09-24 15:43:51 +0000 [info]: listening udp socket on 0.0.0.0:50020
    2018-09-24 15:43:51 +0000 [info]: listening dRuby uri="druby://127.0.0.1:24230" object="Engine"

psagrera commented 6 years ago

I've reproduced the issue locally using tcpreplay to simulate QFX packets. Please use the latest version:

https://github.com/Juniper/open-nti/commit/5a3adf9b123efc1a22a7098e906315fb16a77457

    > show MEASUREMENTS
    name: measurements
    name
    ----
    /network-instances/network-instance/protocols/protocol/bgp/
    jnpr.analyticsd
    > select * from "jnpr.analyticsd"
    name: jnpr.analyticsd
    time                device                                interface type                    value
    ----                ------                                --------- ----                    -----
    1448780496000000000 s3bu-tme-qfx5100-6.englab.juniper.net et-0/0/48 traffic-stats.rxbcpkt   238206
    1448780496000000000 s3bu-tme-qfx5100-6.englab.juniper.net et-0/0/48 traffic-stats.rxbps     992
    1448780496000000000 s3bu-tme-qfx5100-6.englab.juniper.net et-0/0/48 traffic-stats.rxbyte    70348801
    1448780496000000000 s3bu-tme-qfx5100-6.englab.juniper.net et-0/0/48 traffic-stats.rxcrcerr  0
    1448780496000000000 s3bu-tme-qfx5100-6.englab.juniper.net et-0/0/48 traffic-stats.rxdroppkt 0
    1448780496000000000 s3bu-tme-qfx5100-6.englab.juniper.net et-0/0/48 traffic-stats.rxmcpkt   260497
    1448780496000000000 s3bu-tme-qfx5100-6.englab.juniper.net et-0/0/48 traffic-stats.rxpkt     498703
    1448780496000000000 s3bu-tme-qfx5100-6.englab.juniper.net et-0/0/48 traffic-stats.rxpps     0
    1448780496000000000 s3bu-tme-qfx5100-6.englab.juniper.net et-0/0/48 traffic-stats.rxucpkt   0
    1448780496000000000 s3bu-tme-qfx5100-6.englab.juniper.net et-0/0/48 traffic-stats.txbcpkt   0
    1448780496000000000 s3bu-tme-qfx5100-6.englab.juniper.net et-0/0/48 traffic-stats.txbps     992
    1448780496000000000 s3bu-tme-qfx5100-6.englab.juniper.net et-0/0/48 traffic-stats.txbyte    56770451

cbutera-sqsp commented 6 years ago

That did the trick, all set. Thank you very much for your help, @psagrera. Do you plan on merging this?

psagrera commented 6 years ago

Great! I've merged that into the master branch:

https://github.com/Juniper/open-nti/commit/590b95bfb1c16dd3529acb7ff9fed37a0c3bf9e6

https://github.com/Juniper/open-nti/commit/d6356390fa717cf979a01e330f4ce8d434991bcd

Regards,