Juniper / open-nti

Open Network Telemetry Collector built with open source tools
Apache License 2.0

lw4o6 dashboard problem and replacement #162

Open kzorba opened 7 years ago

kzorba commented 7 years ago

The lw4o6 Grafana dashboard included in the distribution (lw4o6.json) does not work out of the box. It is actually a template that has to be imported so that its data source can be defined. In an upcoming pull request, I will submit a replacement dashboard that works out of the box. It is a bit different from the original but works well for us. It includes the relevant information for a lw4o6 deployment without extras and has parametrized panels. I discussed this with @mwiget and he seems to approve of it as well.
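To illustrate what "defining the data source" involves, here is a minimal sketch. It assumes the template follows Grafana's export format, where a top-level `__inputs` list declares placeholders such as `${DS_INFLUXDB}` that panels reference. I have not checked that lw4o6.json is laid out exactly like this, and the function name, the sample dashboard, and the data source name `influxdb` are all made up for the example.

```python
def _substitute(node, mapping):
    """Recursively replace placeholder strings in a JSON-like structure."""
    if isinstance(node, dict):
        return {key: _substitute(value, mapping) for key, value in node.items()}
    if isinstance(node, list):
        return [_substitute(value, mapping) for value in node]
    if isinstance(node, str):
        for placeholder, replacement in mapping.items():
            node = node.replace(placeholder, replacement)
    return node


def resolve_datasource_inputs(dashboard, datasource_names):
    """Return a copy of an exported dashboard with ${NAME} placeholders filled in.

    `datasource_names` maps an input name (e.g. "DS_INFLUXDB") to the name of
    a data source that already exists in the target Grafana instance.
    """
    mapping = {}
    for entry in dashboard.get("__inputs", []):
        name = entry["name"]
        if entry.get("type") == "datasource" and name in datasource_names:
            mapping["${" + name + "}"] = datasource_names[name]
    # Drop the import-only metadata; a resolved dashboard no longer needs it.
    body = {k: v for k, v in dashboard.items() if k not in ("__inputs", "__requires")}
    return _substitute(body, mapping)


# Hypothetical, heavily trimmed template; real exports carry many more fields.
exported = {
    "__inputs": [{"name": "DS_INFLUXDB", "type": "datasource", "pluginId": "influxdb"}],
    "title": "lw4o6",
    "panels": [{"title": "example panel", "datasource": "${DS_INFLUXDB}"}],
}

resolved = resolve_datasource_inputs(exported, {"DS_INFLUXDB": "influxdb"})
print(resolved["panels"][0]["datasource"])  # prints "influxdb"
```

A dashboard resolved this way could be loaded directly, without going through the import dialog, provided a data source with that name exists.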

3fr61n commented 7 years ago

I took a look at the previous and current dashboards, and I think we could work together to improve it even more.

For instance:

1.- What about keeping this dashboard for lw4o6 information only, so we could remove the non-lw4o6 charts (like RE processes, PFE stats, etc.)?

2.- Add templating to the lw4o6 charts (and avoid having fixed values for some variables, like interfaces).

3.- Checking the parser we already have, it seems there are some variables that do not have charts yet. I'm not an expert on lw4o6 environments, but perhaps it would be worthwhile to add them.

What do you think?

mwiget commented 7 years ago

@3fr61n re 1) There is value in having a dashboard showing combined data, so it can be left open and provide an overview. Splitting it up into dashboards per function will make this harder. Maybe there is a middle ground?

3fr61n commented 7 years ago

So do you think it is not OK to have 2 windows (with different dashboards for different content)?

I suggested a dedicated lw4o6-only dashboard mainly because it would simplify template generation, with variables specific to that environment, and make data analysis easier.

When you have too much different content in your dashboard, this gets seriously complex (we are facing this with the streaming dashboard).

Perhaps we could begin with (2) and (3), then come back to (1) later and decide.

Regards

kzorba commented 7 years ago

I agree with @mwiget on the dashboard with combined data. The dashboard I sent will be used in our production lw4o6 deployment, so I think it is good to include other things such as BGP monitoring, routing table metrics, and basic RE and/or PFE measurements. Another change compared to the original is the use of the InfluxDB derivative() function.
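For anyone not familiar with it, derivative() in InfluxQL returns the rate of change between consecutive field values, normalized to a time unit, which is what turns ever-growing counters into per-second rates. A rough Python sketch of that arithmetic follows; the helper name and the counter samples are made up for illustration and are not taken from the dashboard.

```python
def derivative(samples, unit_seconds=1.0):
    """Rate of change between consecutive samples, per `unit_seconds`.

    `samples` is a list of (timestamp_in_seconds, value) pairs sorted by time.
    Each result is stamped with the time of the later sample in the pair.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order timestamps
        rates.append((t1, (v1 - v0) / elapsed * unit_seconds))
    return rates


# Hypothetical packet counter polled every 10 seconds.
counter = [(0, 1000), (10, 1500), (20, 2500)]
rates = derivative(counter)
print(rates)  # [(10, 50.0), (20, 100.0)]
```

Note that a counter reset would show up as a negative rate here; InfluxQL also offers non_negative_derivative() for that case.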

I also agree on the templating; I included a graph called "lwAFTR bbs/pps parametrized". Of course my approach is far from perfect: many improvements could be made, and other things that I have not foreseen might be needed. For example, the interface list should be generated automatically somehow. For now I use 2 explicit values because our lwAFTRs have a couple of interfaces each.

In any case, a working dashboard for lw4o6 should be included in open-nti, since the template currently included causes some confusion. I will test any suggestions :)