Juniper / open-nti

Open Network Telemetry Collector built with open source tools
Apache License 2.0

Using input-snmp plugin #224

Closed dccarpenter closed 6 years ago

dccarpenter commented 6 years ago

Similar to #179: I modify the open-nti/plugins/input-snmp/templates/telegraf.tmpl file and enter my system IP addresses in the "agents" field, then rebuild and start the container. Once the container is running, the addresses have reverted to the default/example IP addresses from the repository.

Can someone shed light on what I'm missing?

I modify the telegraf.tmpl file:

agents = [ "10.13.111.106:161","10.13.111.108:161" ]

Rebuild and restart the container:

dcarpenter@DO-Grafana:~/open-nti$ sudo make build-snmp

Build Docker image - juniper/open-nti-input-snmp:latest

docker build -f plugins/input-snmp/Dockerfile -t juniper/open-nti-input-snmp:latest plugins/input-snmp
Sending build context to Docker daemon 6.656kB
Step 1/13 : FROM juniper/pyez:2.0.1
 ---> e61c159ee89f
Step 2/13 : WORKDIR /source
 ---> Using cache
 ---> bd97f415b1af
Step 3/13 : USER root
 ---> Using cache
 ---> d88dd576c716
 ---> Using cache
 ---> 80dfe48c765f
Step 8/13 : COPY start-input-snmp.sh /source/start-input-snmp.sh
 ---> Using cache
 ---> 2c24b4f62f10
Step 9/13 : RUN chmod +x /source/start-input-snmp.sh
 ---> Using cache
 ---> c9dbee7da546
Step 10/13 : RUN mkdir /data
 ---> Using cache
 ---> 8818c15ac439
Step 11/13 : ADD templates /data/templates/
 ---> Using cache
 ---> d4417ab74165
Step 12/13 : WORKDIR /data
 ---> Using cache
 ---> 5f4cb3b195e0
Step 13/13 : CMD ["/source/start-input-snmp.sh"]
 ---> Using cache
 ---> f0107c2cd1be
Successfully built f0107c2cd1be
Successfully tagged juniper/open-nti-input-snmp:latest
dcarpenter@DO-Grafana:~/open-nti$ sudo make restart-snmp
IMAGE_TAG=latest docker-compose -f docker-compose.yml restart input-snmp
Restarting opennti_input_snmp ...

Check the logs:

dcarpenter@DO-Grafana:~/open-nti$ sudo docker logs 4ceee854c30f

<--clipped-->

2018-07-03T18:24:00Z D! Attempting connection to output: influxdb
2018-07-03T18:24:00Z D! Successfully connected to output: influxdb
2018-07-03T18:24:00Z I! Starting Telegraf (version 1.2.1)
2018-07-03T18:24:00Z I! Loaded outputs: influxdb
2018-07-03T18:24:00Z I! Loaded inputs: inputs.snmp
2018-07-03T18:24:00Z I! Tags enabled: host=open-nti-input-snmp
2018-07-03T18:24:00Z I! Agent Config: Interval:5m0s, Quiet:false, Hostname:"open-nti-input-snmp", Flush Interval:30s
2018-07-03T18:25:10Z E! Error in plugin [inputs.snmp]: agent 172.30.137.90:161: performing get on field hostname: Request timeout (after 3 retries)
2018-07-03T18:25:20Z E! Error in plugin [inputs.snmp]: agent 172.30.137.90:161: gathering table interface_statistics: performing bulk walk for field ifName: Request timeout (after 3 retries)
2018-07-03T18:25:30Z E! Error in plugin [inputs.snmp]: agent 172.30.137.93:161: performing get on field hostname: Request timeout (after 3 retries)
2018-07-03T18:25:30Z D! Output [influxdb] buffer fullness: 0 / 10000 metrics.
2018-07-03T18:25:40Z E! Error in plugin [inputs.snmp]: agent 172.30.137.93:161: gathering table interface_statistics: performing bulk walk for field ifName: Request timeout (after 3 retries)
2018-07-03T18:26:00Z D! Output [influxdb] buffer fullness: 0 / 10000 metrics.

Please let me know what other information you may need.
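The log excerpt above shows Telegraf polling 172.30.137.90 and 172.30.137.93 rather than the addresses entered in telegraf.tmpl. A minimal sketch for pulling the polled agent addresses out of such logs follows; the function name `polled_agents` is hypothetical, and the pattern assumes the `agent <ip>:<port>:` error format seen in this thread.

```shell
#!/bin/sh
# Print the unique agent addresses that the inputs.snmp error lines mention.
# Reads Telegraf log text on stdin, one log entry per line.
polled_agents() {
    # Capture "<ip>:<port>" from "... agent <ip>:<port>: ..." and drop duplicates.
    sed -n 's/.*agent \([0-9.]*:[0-9]*\):.*/\1/p' | sort -u
}

# Example (container name taken from the restart output above):
#   sudo docker logs opennti_input_snmp 2>&1 | polled_agents
```

If the addresses printed here differ from the ones in the edited template, the running container is most likely still using an older copy of the configuration.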

psagrera commented 6 years ago

Could you please verify whether this commit solves your issue? https://github.com/Juniper/open-nti/commit/2b1ec93143d29f2690ca3682a63c50848316254a

You will have to pull the latest version and stop/start the containers.

Regards

sohamdshah commented 6 years ago

This still doesn't work. Despite removing agent 10.102.186.0:161 completely, the logs continue to show errors. Could you please help?

psagrera commented 6 years ago

Could you please share the logs and the Telegraf config?

sohamdshah commented 6 years ago

----docker ps -----

** (master *) open-nti $ docker ps
CONTAINER ID   IMAGE                                  COMMAND                  CREATED          STATUS          PORTS                                                                                                                NAMES
3973003d7626   quay.io/influxdb/chronograf:1.5.0.1    "/usr/bin/chronograf…"   33 seconds ago   Up 30 seconds   0.0.0.0:8888->8888/tcp                                                                                               chronograf_con
4daf7aec569f   open-nti_input-snmp                    "/source/start-input…"   33 seconds ago   Up 30 seconds   0.0.0.0:162->162/udp                                                                                                 opennti_input_snmp
05ab6bd85730   juniper/open-nti-input-jti:latest      "/bin/sh -c /fluentd…"   33 seconds ago   Up 31 seconds   0.0.0.0:50000->50000/udp, 24284/tcp, 0.0.0.0:50020->50020/udp                                                        opennti_input_jti
646a876956f4   open-nti_input-oc                      "/entrypoint.sh /sou…"   33 seconds ago   Up 30 seconds   8092/udp, 8125/udp, 8094/tcp, 0.0.0.0:50051->50051/udp                                                               opennti_input_oc
d96531a09ed7   kapacitor:1.5.0                        "/entrypoint.sh kapa…"   33 seconds ago   Up 31 seconds   0.0.0.0:9092->9092/tcp                                                                                               kapacitor
fc9e20185bb2   juniper/open-nti-input-syslog:latest   "/bin/sh -c /home/fl…"   33 seconds ago   Up 31 seconds   5140/tcp, 24220/tcp, 24224/tcp, 0.0.0.0:6000->6000/udp                                                               opennti_input_syslog
977b6ae946bf   juniper/open-nti:latest                "/sbin/my_init"          33 seconds ago   Up 32 seconds   0.0.0.0:80->80/tcp, 0.0.0.0:3000->3000/tcp, 0.0.0.0:8083->8083/tcp, 0.0.0.0:8086->8086/tcp, 0.0.0.0:8125->8125/udp   opennti_con

---- Logs -----

** (master *) open-nti $ docker logs 4daf7aec569f
2018-08-20T17:34:07Z D! Attempting connection to output: influxdb
2018-08-20T17:34:07Z D! Successfully connected to output: influxdb
2018-08-20T17:34:07Z I! Starting Telegraf v1.7.0
2018-08-20T17:34:07Z I! Loaded inputs: inputs.snmp
2018-08-20T17:34:07Z I! Loaded aggregators:
2018-08-20T17:34:07Z I! Loaded processors:
2018-08-20T17:34:07Z I! Loaded outputs: influxdb
2018-08-20T17:34:07Z I! Tags enabled: host=open-nti-input-snmp
2018-08-20T17:34:07Z I! Agent Config: Interval:5m0s, Quiet:false, Hostname:"open-nti-input-snmp", Flush Interval:30s
2018-08-20T17:35:09Z E! Error in plugin [inputs.snmp]: agent 10.102.186.0:161: performing get on field hostname: Request timeout (after 3 retries)
2018-08-20T17:35:19Z E! Error in plugin [inputs.snmp]: agent 10.102.186.0:161: gathering table interface_statistics: performing bulk walk for field ifName: Request timeout (after 3 retries)
2018-08-20T17:35:30Z D! Output [influxdb] buffer fullness: 0 / 10000 metrics.
2018-08-20T17:36:00Z D! Output [influxdb] buffer fullness: 0 / 10000 metrics.
** (master *) open-nti $

---telegraph----

[tags]

dc = "open-nti"

[agent]
  interval = "5m"
  round_interval = true
  flush_interval = "30s"
  flush_jitter = "0s"
  debug = true
  hostname = "open-nti-input-snmp"

[[outputs.influxdb]]
  urls = ["http://opennti:8086"]
  database = "snmp"
  precision = "s"
  retention_policy = ""
  timeout = "5s"

[[inputs.snmp]]
  agents = [ "10.13X.X5.4X:161" ]
  version = 2
  community = "public"
  name = "system"

  [[inputs.snmp.field]]
    name = "hostname"
    oid = ".1.3.6.1.2.1.1.5.0"
    is_tag = true
  [[inputs.snmp.field]]
    name = "uptime"
    oid = ".1.3.6.1.2.1.1.3.0"
  [[inputs.snmp.field]]
    name = "sysObjectID"
    oid = ".1.3.6.1.2.1.1.2.0"

  [[inputs.snmp.table]]
    name = "interface_statistics"
    inherit_tags = [ "hostname" ]
    [[inputs.snmp.table.field]]
      name = "ifName"
      oid = ".1.3.6.1.2.1.31.1.1.1.1"
      is_tag = true
    [[inputs.snmp.table.field]]
      name = "ifHCInOctets"
      oid = ".1.3.6.1.2.1.31.1.1.1.6"
    [[inputs.snmp.table.field]]
      name = "ifHCOutOctets"
      oid = ".1.3.6.1.2.1.31.1.1.1.10"
    [[inputs.snmp.table.field]]
      name = "ifHCInMulticastPkts"
      oid = ".1.3.6.1.2.1.31.1.1.1.8"
    [[inputs.snmp.table.field]]
      name = "ifHCOutMulticastPkts"
      oid = ".1.3.6.1.2.1.31.1.1.1.12"
    [[inputs.snmp.table.field]]
      name = "ifHCInBroadcastPkts"
      oid = ".1.3.6.1.2.1.31.1.1.1.9"
    [[inputs.snmp.table.field]]
      name = "ifHCOutBroadcastPkts"
      oid = ".1.3.6.1.2.1.31.1.1.1.13"
    [[inputs.snmp.table.field]]
      name = "ifHCOutUcastPkts"
      oid = ".1.3.6.1.2.1.31.1.1.1.11"
    [[inputs.snmp.table.field]]
      name = "ifHCInUcastPkts"
      oid = ".1.3.6.1.2.1.31.1.1.1.7"

  [inputs.snmp.tagpass]
    ifName = [ "[g|x]e-", "ae" ]

I have also attached a file containing all of the outputs. Note that 10.102.186.0:161 is not the IP address of the target device I am trying to poll.

Attachment: open-nti logs.docx
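Since the logs report an agent that does not appear in the posted config, it can help to list the agents a given template file actually contains and compare them with the addresses in `docker logs`. The sketch below is only illustrative: `configured_agents` is a hypothetical helper, and it assumes the `agents = [ ... ]` list sits on a single uncommented line, as in the config above.

```shell
#!/bin/sh
# Print the agent addresses listed on uncommented "agents = [ ... ]" lines
# of a Telegraf config or template file.
configured_agents() {
    # $1: path to the config or template file
    grep '^[[:space:]]*agents[[:space:]]*=' "$1" \
        | tr ',' '\n' \
        | sed -n 's/.*"\([^"]*\)".*/\1/p' \
        | sort -u
}

# Example on the host copy of the template:
#   configured_agents plugins/input-snmp/templates/telegraf.tmpl
# To inspect the copy baked into the image, assuming it lives under
# /data/templates/ as the Dockerfile's ADD step suggests:
#   docker exec opennti_input_snmp cat /data/templates/telegraf.tmpl > /tmp/in-container.tmpl
#   configured_agents /tmp/in-container.tmpl
```

A mismatch between the host copy and the in-container copy would point to a stale image rather than a Telegraf problem.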

3fr61n commented 6 years ago

Hi,

It seems that the image you're running contains the default config file instead of the one you're showing here.

Usually, building a new image overrides the old one.

If the problem persists after building the image, I suggest first deleting the old image and then building a new one.

sohamdshah commented 6 years ago

I have deleted the previous images, rebuilt and restarted the containers, and verified my output through snmpwalk, but open-nti just doesn't seem to work. I have included "snmp" in telegraf.tmpl as per https://github.com/influxdata/telegraf/issues/2628, but nothing works. Wireshark shows SNMP get requests going from my host machine to 10.102.186.0 instead of to the switch. Can you tell me what I am missing?

psagrera commented 6 years ago

Hi,

I pushed a couple of commits a few minutes ago. Could you please pull the latest version and give it a try? Delete the old snmp images.

https://github.com/Juniper/open-nti/commit/120941475b59929eccaa96d5538d1c0c90de2268

https://github.com/Juniper/open-nti/commit/ceb183aa65094e0822167a425830d85536702a55

https://github.com/Juniper/open-nti/commit/b49c3567b28f5e1d1e00cf417e84f88f286cca25

Regards

dccarpenter commented 6 years ago

Thanks for the update. Sorry for the newbie question, but what is the correct way to delete the old snmp image?

psagrera commented 6 years ago

Hi,

1) make stop
2) docker images

 e.g.
 root@ubuntu:~/open-nti# docker images
 REPOSITORY                        TAG                 IMAGE ID            CREATED             SIZE
 juniper/open-nti-input-snmp       latest              855857a349be        34 hours ago        332MB
 open-nti_input-snmp               latest              c33b51f26fdd        34 hours ago        332MB
 open-nti_input-oc                 latest              4192ff73b7e2        2 days ago          2.13GB
 juniper/open-nti-input-oc         latest              0758e8a13521        2 days ago          2.13GB
 [.....]

3) docker rmi "IMAGE ID" (in my example, 855857a349be and c33b51f26fdd). If you get a message saying something like "image cannot be deleted because it is being used by stopped container XXXXXXXX", execute docker rm XXXXXXXX and then docker rmi "IMAGE ID" again.
4) make build-snmp
5) Modify the telegraf.tmpl file under ~/plugins/input-snmp/templates/ with the proper agents info
6) make start

Regards
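The stop, remove, and rebuild steps above can be sketched as a small shell function. This is an illustration under stated assumptions, not a script from the repository: `run` and `rebuild_snmp_image` are hypothetical names, the image IDs must be supplied by the caller from their own `docker images` output, and the function is assumed to run from the open-nti checkout so the `make` targets resolve.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop the stack, remove the given snmp images, and rebuild the snmp plugin.
# Arguments: the IMAGE IDs of the old snmp images, as listed by `docker images`.
rebuild_snmp_image() {
    run make stop || return 1
    for image_id in "$@"; do
        # If docker reports that a stopped container still uses the image,
        # remove that container with `docker rm` and run this again.
        run docker rmi "$image_id" || return 1
    done
    run make build-snmp
}

# Example, previewing the commands without executing them:
#   DRY_RUN=1; rebuild_snmp_image 855857a349be c33b51f26fdd
```

After it finishes, edit telegraf.tmpl with the proper agents and run make start, as in the last two steps of the list.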

dccarpenter commented 6 years ago

This is working well now. Thanks for the instructions and updates.