Juniper / open-nti

Open Network Telemetry Collector build with open source tools
Apache License 2.0

commands.yaml not being correctly inserted/parsed into influxdb #173

Open vascojcm opened 7 years ago

vascojcm commented 7 years ago

Hello,

First of all, thank you so much for developing this really useful tool. I have been working with and configuring open-nti since last week, and I'm facing a really strange issue. I have created some new parsers and commands adapted to my network environment, and what I see is that some data from the Juniper routers is collected and other data is not.

For instance, in my commands.yaml file I have the following two commands:

    show interfaces queue ge-0/0/0 | display xml
    show interfaces queue ge-0/0/1 | display xml

I have two parser files, one for each command (show-interface-queue-ge000.yaml and show-interface-queue-ge001.yaml):

    parser:
        regex-command: show\s+interfaces\s+queue\s+ge-0\/0\/0\s+|\s+display\s+xml
        matches:
        -
            type: multi-value
            method: xpath
            xpath: //queue
            loop:
                key: ./forwarding-class-name
                sub-matches:
                -
                    xpath: ./queue-counters-red-packets
                    variable-name:  $host.interface.ge-0/0/0.$key.scheduler-drops-pkts

(For the ge-0/0/1 interface I just replaced the last 0 with 1.) What happens is that I only get data collected for interface ge-0/0/1.

Even stranger: if I remove "show interfaces queue ge-0/0/1 | display xml" and leave only "show interfaces queue ge-0/0/0" in commands.yaml, it still populates the ge-0/0/1 data!

If I run "make cron-debug TAG=myTag", I get:

    [host2]: Parser found and processed, going to next comand.
    [host1]: Parser found and processed, going to next comand.
    [host2]: Executing command: show interfaces queue ge-0/0/0 | display xml
    [host1]: Executing command: show interfaces queue ge-0/0/0 | display xml
    [host2]: Parser found and processed, going to next comand.
    [host1]: Parser found and processed, going to next comand.
    (...)
    {'fields': {'delta': 0, 'value': 0}, 'measurement': 'host1.interface.ge-0/0/1.best-effort.scheduler-drops-pkts', 'tags': {'device': 'host1', 'key': 'best-effort', 'kpi': 'host1.interface.ge-0/0/1.best-effort.scheduler-drops-pkts', 'product-model': 'mx104', 'version': '16.1R3-S3.1'}},
    {'fields': {'delta': 0, 'value': 0}, 'measurement': 'host1.interface.ge-0/0/1.expedited-forwarding.scheduler-drops-pkts', 'tags': {'device': 'host1', 'key': 'expedited-forwarding', 'kpi': 'host1.interface.ge-0/0/1.expedited-forwarding.scheduler-drops-pkts', 'product-model': 'mx104', 'version': '16.1R3-S3.1'}},
    {'fields': {'delta': 0, 'value': 0}, 'measurement': 'host1.interface.ge-0/0/1.assured-forwarding.scheduler-drops-pkts', 'tags': {'device': 'host1', 'key': 'assured-forwarding', 'kpi': 'host1.interface.ge-0/0/1.assured-forwarding.scheduler-drops-pkts', 'product-model': 'mx104', 'version': '16.1R3-S3.1'}},
    {'fields': {'delta': 0, 'value': 0},

Also, when I create new graphs and add new commands to the commands.yaml file, some data that was being collected before simply stops being collected and inserted into InfluxDB. I'm talking about some legacy commands that already came with the default open-nti container (show bgp summary or show pfe statistics traffic, for instance).

I've already changed these parameters in open-nti.variables.yaml, but the result is the same, so apparently this is not a problem of excessive data gathering:

    max_collector_threads: 100
    delay_between_commands: 5
    max_connection_retries: 5

Could you please help me figure out what is happening? I've already spent a lot of time on this and I simply can't understand why it isn't working properly.

Please find attached both parser files, so you can see that they are an exact match apart from the interface, together with the open-nti.variables file.

Thank you so much. Best Regards

show-interfaces-queue.gezero.parser.txt show-interfaces-queue.geone.parser.txt show-interfaces-queue.ge012.parser.txt open-nti.variables.txt

vascojcm commented 7 years ago

Hi again,

I have now been able to understand what is happening, and it clearly seems to be a bug.

If I put several similar commands in commands.yaml, data is collected for only one of them. For example, I have something like this in my commands.yaml file:

    generic_router_commands:
        commands: |
            show bgp summary | display xml
            show chassis routing-engine | display xml
            show pfe statistics traffic | display xml
            show system processes extensive
            show system buffers
            show system statistics icmp | display xml
            show route summary table inet.0 | display xml
            show route summary table bgp.l2vpn.0 | display xml
            show interfaces media | display xml
            show interfaces queue ge-0/0/0 | display xml
            show interfaces queue ge-0/0/1 | display xml
            show interfaces queue ge-0/1/2 | display xml
        tags: all

As you can see, the last three commands are similar but target different interfaces. Although I have a parser file for each of the interfaces, data is inserted into the DB for only one of the commands. If I leave only one command, that command is correctly parsed and the data is written into InfluxDB.

I've shared the parser files in the post above so you can see how it is being done. If there is a way to solve this, please let me know.

Thanks

3fr61n commented 7 years ago

Hi

Could you please try to change the regex?

I was doing a quick test with the regex and I found this

[Screenshot: regex test, 2017-08-23 09:42:38]

But escaping the / perhaps could do the job

[Screenshot: regex test, 2017-08-23 09:41:34]
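One thing that may be worth checking while testing the regex: in Python's `re` syntax an unescaped `|` means alternation, so a pattern written exactly like the regex-command posted above is read as "`show ... ge-0/0/0` OR `display xml`". The sketch below illustrates this. It assumes the pattern is stored exactly as it appears in this thread (the backslash before the pipe may simply have been lost when the post was rendered) and that the collector tests it with something like `re.search`. Neither assumption is confirmed here, and `matching_commands` is a hypothetical helper, not an open-nti function.

```python
import re

# Pattern exactly as it appears in the post: the pipe is NOT escaped,
# so the regex engine treats it as "left side OR right side".
UNESCAPED = r"show\s+interfaces\s+queue\s+ge-0\/0\/0\s+|\s+display\s+xml"

# Same pattern with the pipe escaped so that it matches a literal "|".
ESCAPED = r"show\s+interfaces\s+queue\s+ge-0\/0\/0\s+\|\s+display\s+xml"

COMMANDS = [
    "show interfaces queue ge-0/0/0 | display xml",
    "show interfaces queue ge-0/0/1 | display xml",
    "show bgp summary | display xml",
]


def matching_commands(pattern, commands):
    """Return the commands in which re.search() finds the pattern.

    Hypothetical helper: using re.search here is an assumption about
    how the collector picks a parser for a command.
    """
    return [cmd for cmd in commands if re.search(pattern, cmd)]


unescaped_hits = matching_commands(UNESCAPED, COMMANDS)
escaped_hits = matching_commands(ESCAPED, COMMANDS)

print("unescaped pipe matches:", unescaped_hits)
print("escaped pipe matches:  ", escaped_hits)
```

If the parser file really contains the unescaped form, every command ending in `| display xml` would match every such parser, which would be consistent with the symptom described above. This is a hypothesis to test rather than a confirmed diagnosis.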

Let me know if this works for you.

Regards

3fr61n commented 7 years ago

Does it work?

Regards

vascojcm commented 7 years ago

Sorry for not giving you any feedback yet.

Unfortunately, I'm not able to test this properly at the moment.

I will let you know once I can, because this is definitely something I really need to have in place on my system.

thanks

vascojcm commented 7 years ago

hi @3fr61n,

Sorry for not giving you feedback on this issue earlier.

This is not working yet.

In my commands.yaml file I have:

    generic_commands:
        commands: |
            show bgp summary | display xml
            (...)
            show interfaces queue ge-0/0/0 | display xml
            show interfaces queue ge-0/0/1 | display xml
            (...)

I also have two different files in the junos_parsers folder, one for each command.

What happens is that only the last "show interfaces queue*" command is populated in InfluxDB (in this example, show interfaces queue ge-0/0/1).

Only if I remove that command does "show interfaces queue ge-0/0/0" start to be populated.

When I run "make cron-debug TAG='MY_TAG'", no error is shown, which means that the parser is well written. The problem is that only one "show interfaces queue" command is populated at a time (usually the last one).

This means that I can have a commands.yaml file like this:

    generic_commands:
        commands: |
            show bgp summary | display xml
            (...)
            show interfaces queue ge-0/0/0 | display xml
            show interfaces queue ge-0/0/1 | display xml
            show interfaces queue ge-0/0/2 | display xml
            show interfaces queue ge-0/0/3 | display xml
            show interfaces queue ge-0/0/(N) | display xml
            (...)

Only the Nth command will be populated in InfluxDB, even though parsers are correctly defined for all the commands. If I remove the Nth command, only "show interfaces queue ge-0/0/3" will be populated. This only happens in this situation, where the show commands are very similar. All the other show commands are populated correctly.

Hope you understand the issue. Best Regards

vascojcm commented 6 years ago

Hi @3fr61n,

I don't know whether you are still working on this project. I'm running into another issue related to commands.yaml.

I'm using the "show snmp statistics" parser to populate the DB and, consequently, the Grafana dashboards. However, no metrics are being pushed to InfluxDB! The debug output also shows that no metric is being sent, even though my router returns values for that query.

The commands.yml, hosts.yml and credentials.yml files are correctly configured, since plenty of other metrics are being collected correctly.

    docker exec -i -t opennti_con /usr/bin/python /opt/open-nti/open-nti.py -s -c --tag SNMP
    Collector Thread-1 scheduled with following hosts: ['my_router']
    Connecting to host: my_router
    [my_router]: Executing command: show version | display xml
    [my_router]: Host will now be referenced as : my_router
    [my_router]: Executing command: show snmp statistics | display xml
    [my_router]: Parser found and processed, going to next comand.
    [my_router]: timestamp_tracking - CLI collection 4
    Inserting into database the following datapoints:
    [{'fields': {'delta_str': 'N/A', 'value_str': 'mx104'}, 'measurement': 'base-info', 'tags': {'device': 'my_router', 'kpi': 'base-info', 'product-model': 'mx104', 'version': '16.1R3-S3.1'}},
     {'fields': {'delta': 7, 'value': 7}, 'measurement': 'open-nti-stats', 'tags': {'device': 'my_router', 'kpi': 'open-nti-stats', 'product-model': 'mx104', 'stats': 'collection-time', 'version': '16.1R3-S3.1'}},
     {'fields': {'delta': 1, 'value': 1}, 'measurement': 'open-nti-stats', 'tags': {'device': 'my_router', 'kpi': 'open-nti-stats', 'product-model': 'mx104', 'stats': 'collection-successful', 'version': '16.1R3-S3.1'}}]
    [my_router]: timestamp_tracking - total collection 7

    $ ssh my_router "show snmp statistics | display xml " | grep packets
            <packets>23995215</packets>
            <packets>24113243</packets>

    $ ssh my_router "show snmp statistics | display xml " | grep get-next
            <get-nexts>519394</get-nexts>
            <get-nexts>0</get-nexts>

Do you have any idea what is going on here?

Thanks

3fr61n commented 6 years ago

Hi @vascojcm

What's happening to you is odd, because I'm using the same parser on my setup and it works.

I can see you're running 16.1R3-S3.1 on an mx104. Could you please attach the output of "show snmp statistics | display xml" so I can check whether the xpaths are still valid?

Regards,
Efrain

vascojcm commented 6 years ago

Thanks for your prompt answer!

From my last two hours of troubleshooting, I believe the xpaths are still the same. :)

The output is now attached in a file:

tshoot_snmp.txt

vascojcm commented 6 years ago

Please let me know if you are able to reproduce this issue

3fr61n commented 6 years ago

The "show snmp statistics | display xml" output does not seem to be complete. Could you please attach it again, from [...] to [...]?

Regards

vascojcm commented 6 years ago

tshoot_snmp.txt

vascojcm commented 6 years ago

Now I believe it is complete. Best regards

3fr61n commented 6 years ago

I'll try to test it this week and let you know. Meanwhile, could you test this xpath to rule out namespace issues?

Replace this xpath

//snmp-input-statistics/packets

with this one

//*[local-name() = 'snmp-input-statistics']/*[local-name() = 'packets']
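To illustrate why a namespace could make the first xpath return nothing, here is a small sketch using only Python's standard library. It is an analogue rather than the exact expression above: `xml.etree.ElementTree` does not support the `local-name()` function, so the sketch uses its `{*}` wildcard (Python 3.8+), which likewise ignores the namespace. The XML sample and its namespace URI are made up for the example; only the element names and the packet value are taken from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical reply: the namespace URI below is invented for this sketch
# and is not copied from a real device.
SAMPLE = """<rpc-reply>
  <snmp-statistics xmlns="http://example.invalid/junos-snmp">
    <snmp-input-statistics>
      <packets>24113243</packets>
      <get-nexts>0</get-nexts>
    </snmp-input-statistics>
  </snmp-statistics>
</rpc-reply>"""

root = ET.fromstring(SAMPLE)

# Plain element names only match elements that are in NO namespace,
# so this lookup finds nothing once a default namespace is declared.
plain = root.find(".//snmp-input-statistics/packets")

# "{*}" matches the element name in any namespace (or none), which is
# the same idea as //*[local-name() = '...'] in full XPath.
any_ns = root.find(".//{*}snmp-input-statistics/{*}packets")

print("plain path:     ", plain)
print("namespace-blind:", any_ns.text)
```

If the namespace-blind form finds the value while the plain path does not, a namespace in the device output is the likely cause. If both fail, the problem is probably elsewhere.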

Regards

vascojcm commented 6 years ago

Thanks @3fr61n ,

I already tried that, unfortunately with the same result.

    # cat data/junos_parsers/show-snmp-statistics.parser.yaml
    parser:
        regex-command: show\s+snmp\s+statistics\s+|\s+display\s+xml
        matches:
        -
            type: single-value
            method: xpath
            # xpath: //snmp-input-statistics/packets
            xpath: //[local-name() = 'snmp-input-statistics']/[local-name() = 'packets']
            variable-name:  $host.snmp-input-packets
        -
            type: single-value
            method: xpath
            # xpath: //snmp-input-statistics/get-nexts
            xpath: //[local-name() = 'snmp-input-statistics']/[local-name() = 'get-nexts']
            variable-name:  $host.snmp-input-get-nexts
        -
            type: single-value
            method: xpath
            # xpath: //snmp-output-statistics/packets
            xpath: //[local-name() = 'snmp-output-statistics']/[local-name() = 'packets']
            variable-name:  $host.snmp-output-packets

3fr61n commented 6 years ago

You are missing all the * characters in the xpath. Could you please retry? :)

vascojcm commented 6 years ago

Tried again,

still no luck

    cat data/junos_parsers/show-snmp-statistics.parser.yaml
    parser:
        regex-command: show\s+snmp\s+statistics\s+|\s+display\s+xml
        matches:
        -
            type: single-value
            method: xpath
            xpath: //\*[local-name() = 'snmp-input-statistics']/\*[local-name() = 'packets']
            variable-name:  $host.snmp-input-packets
        -
            type: single-value
            method: xpath
            xpath: //\*[local-name() = 'snmp-input-statistics']/\*[local-name() = 'get-nexts']
            variable-name:  $host.snmp-input-get-nexts
        -
            type: single-value
            method: xpath
            xpath: //\*[local-name() = 'snmp-output-statistics']/\*[local-name() = 'packets']
            variable-name:  $host.snmp-output-packets

    docker exec -i -t opennti_con /usr/bin/python /opt/open-nti/open-nti.py -s -c --tag SNMP
    Collector Thread-1 scheduled with following hosts: ['my_router']
    Connecting to host: my_router
    [my_router]: Executing command: show version | display xml
    [my_router]: Host will now be referenced as : my_router
    [my_router]: Executing command: show snmp statistics | display xml
    [my_router]: Parser found and processed, going to next comand.
    [my_router]: timestamp_tracking - CLI collection 4
    Inserting into database the following datapoints:
    [{'fields': {'delta_str': 'N/A', 'value_str': 'mx104'}, 'measurement': 'base-info', 'tags': {'device': 'my_router', 'kpi': 'base-info', 'product-model': 'mx104', 'version': '16.1R3-S3.1'}},
     {'fields': {'delta': 7, 'value': 7}, 'measurement': 'open-nti-stats', 'tags': {'device': 'my_router', 'kpi': 'open-nti-stats', 'product-model': 'mx104', 'stats': 'collection-time', 'version': '16.1R3-S3.1'}},
     {'fields': {'delta': 1, 'value': 1}, 'measurement': 'open-nti-stats', 'tags': {'device': 'my_router', 'kpi': 'open-nti-stats', 'product-model': 'mx104', 'stats': 'collection-successful', 'version': '16.1R3-S3.1'}}]
    [my_router]: timestamp_tracking - total collection 7