roboconf / roboconf-platform

The core modules and the platform
Apache License 2.0

Specification for the monitoring support & template #232

Closed bourretp closed 9 years ago

bourretp commented 9 years ago

Issue #155 adds a new feature (Monitoring support). This feature relies on a template engine which exports an application configuration to external files. Those files can then feed monitoring tools (e.g. Nagios, Shinken, Cacti, ...).

The specification of a new template language is thus needed, before work can start on #155.

vincent-zurczak commented 9 years ago

So far, so good. How about the use of environment variables in such template files? Does it make sense or not? At the moment, I have no idea.

bourretp commented 9 years ago

It may be simple to include DM's env variables: ${$JAVA_HOME}. You may have a look at the latest wiki changes: https://github.com/roboconf/roboconf-platform/wiki/MonitoringTemplateLanguage/_compare/bf6ec62%5E...bf6ec62 .

Of course environments of agents need a specific monitor infrastructure that propagates their key:value pairs to the DM. But I suppose that's another story/issue...

vincent-zurczak commented 9 years ago

> Of course environments of agents need a specific monitor infrastructure that propagates their key:value pairs to the DM. But I suppose that's another story/issue...

I am not sure I understand this statement. You cannot propagate anything to an agent without the DM.

> For instance an application named MyApp will have its generated monitoring files located in the MyApp/monitoring-generated/

I am not fond of this option. For me, it should be placed under monitoring-generated/MyApp/. We would like to keep the folder structure simple and readable. With your solution, it would quickly become complex as the number of applications increases.

bourretp commented 9 years ago

point 1: what I mean is that we may also want to get the environment variables of the agents. For now, only the DM's env vars are accessible. Getting the agents' env vars needs (AFAIK) something more: an agent-side collector that sends its env vars along with the agent configuration. Those env vars could then be used in the template, e.g. ${/path/to/agent$JAVA_HOME}.

point 2: I thought the Roboconf application configuration directory was organized on a per-application basis. But after reading your comment, it seems not ;) OK for the directory structure you propose. I will update the spec ASAP.
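
To illustrate point 1, here is a minimal Python sketch of how a `${$NAME}` placeholder could be resolved against the DM's environment. The regular expression, the `expand_env` helper and the sample `JAVA_HOME` value are assumptions made for this example; they are not taken from the specification or from Roboconf's code. The agent-scoped form (`${/path/to/agent$JAVA_HOME}`) is not covered.

```python
import re

# Matches the ${$NAME} form discussed above (the exact grammar is an assumption).
ENV_PLACEHOLDER = re.compile(r"\$\{\$([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(template, env):
    """Replace each ${$NAME} with env[NAME]; unknown names are left untouched."""
    def replace(match):
        return env.get(match.group(1), match.group(0))
    return ENV_PLACEHOLDER.sub(replace, template)


# Stand-in for the DM's environment; a real run could pass dict(os.environ).
dm_env = {"JAVA_HOME": "/usr/lib/jvm/default"}
result = expand_env('"java": "${$JAVA_HOME}", "other": "${$UNKNOWN}"', dm_env)
```

Leaving unknown variables untouched is only one possible choice; failing fast or substituting an empty string would be equally defensible.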

bourretp commented 9 years ago

point 2 just fixed!

bourretp commented 9 years ago

Specification has just been updated. https://github.com/roboconf/roboconf-platform/wiki/MonitoringTemplateLanguage @roboconf/developers @roboconf/owners: Please feel free to comment/add content!

Maybe we could cast a vote after spec stabilization...

vincent-zurczak commented 9 years ago

Hi,

My comments...

> The operators/monitoring tools are then able to read those files, using their native input format, and to act accordingly.

Monitoring tools can read these files directly in their native input format, or they can transform what they read into their own configuration files. Such configuration files may need information that only the monitoring tool knows, not Roboconf. So, both cases are realistic.

> The application package can embed one or several monitoring templates, located in the /monitoring-template archive directory.

I am not sure it is worth adding complexity to our archives. IMO, monitoring templates are about production environments and not about what the development team should provide. The application's archive is the output of the development team. Templates are about how the infrastructure is monitored. So, we should only consider the case where they are under the DM's configuration directory. It will also reduce our development effort. And if somebody needs it some day, we will implement it then.

# We only want hosts
{
   "name":"VM1"
   "ip":"192.168.1.18"
}

{
   "name":"VM 4"
   "ip":"192.168.1.11"
}

And the second output we may want...

// We want hosts and web servers
// Hosts
{
   "name":"VM1"
   "ip":"192.168.1.18"
   "probe-name":"p.vm"
}

{
   "name":"VM 4"
   "ip":"192.168.1.11"
   "probe-name":"p.vm"
}

// Web servers
{
    "ip":"192.168.1.11"
    "port":"80"
    "probe-name":"p.ws"
    "my-param":"its-value"
}

Given these examples, how would I write the templates?

bourretp commented 9 years ago

Thanks for your feedback! I just updated the wiki page and I think I've addressed the issues you raised.

Just one thing about the last example: probes are not supported by the current specification. For sure we can add more instructions to support them, but I don't really know what data you are referring to. Could you add the probe instructions to the table?

vincent-zurczak commented 9 years ago

In this scenario, as a sys admin, probes are specific to my monitoring tool. They have nothing to do with Roboconf. I do not expect Roboconf to know about them.

I will comment on your changes later.

vincent-zurczak commented 9 years ago

In the examples, you said that to generate this...

"MySQL": {
  "ip": "192.168.1.63",
  "port": 3306
},

"Tomcat 1": {
  "ip": "192.168.1.18",
  "portAJP": 8009
},

"Tomcat 2": {
  "ip": "192.168.1.11",
  "portAJP": 8009
}

... the template had to look like...

"MySQL": {
  "ip": "${/Mysql VM/Mysql#ip}",
  "port":${/Mysql VM/Mysql#port}
},

"Tomcat 1": {
  "ip": "${/Tomcat VM 1/Tomcat#ip}",
  "portAJP": ${/Tomcat VM 1/Tomcat#portAJP}
},

"Tomcat 2": {
  "ip": "${/Tomcat VM 2/Tomcat#ip}",
  "portAJP": ${/Tomcat VM 2/Tomcat#portAJP}
}

I agree this MUST be possible. However, with this solution, I would have to update my monitoring template every time I add or remove an instance. It means that no matter which action is performed in the DM's web administration, I have to update my templates by hand. And this is exactly what we want to avoid. In fact, I am not supposed to know which instances I have. And I am not sure of their states. As an example, if a VM is stopped, then maybe it should not be added to monitoring data.
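
A minimal Python sketch of this static substitution makes the drawback concrete. The `render_static` helper, the placeholder regular expression and the in-memory `instances` dictionary are assumptions made for illustration, not Roboconf code.

```python
import re

# ${/instance/path#property}: the path runs up to '#', the property name follows.
PLACEHOLDER = re.compile(r"\$\{(/[^#}]+)#([^}]+)\}")


def render_static(template, instances):
    """Substitute every ${/path#property}; an unknown path raises KeyError."""
    def replace(match):
        path, prop = match.group(1), match.group(2)
        return str(instances[path][prop])
    return PLACEHOLDER.sub(replace, template)


# Hypothetical snapshot of the application, keyed by instance path.
instances = {
    "/Mysql VM/Mysql": {"ip": "192.168.1.63", "port": 3306},
}
template = '"MySQL": { "ip": "${/Mysql VM/Mysql#ip}", "port": ${/Mysql VM/Mysql#port} }'
rendered = render_static(template, instances)
```

As soon as an instance referenced by the template is removed from `instances`, rendering fails with a `KeyError`, which is the manual maintenance burden described above.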

Here is what I would expect as a template (I took a PHP / JSP-like syntax but it does not matter).

<%
foreach( VM as vm ) {    // VM is a component name
      if( vm.status == DEPLOYED_STARTED ) {
%>

"${vm.name}": {
    "ip": "${vm.ip}"
}
<%
      }
}
%>

Given this constraint, taking a look at Mustache may help. We already use it in Roboconf's plugin API.

And obviously, in the case where I know exactly what instances I want to monitor, I should also be able to use a stricter template like you submitted. :)

vincent-zurczak commented 9 years ago

I have just read your last update. I almost agree with your example. The only thing I would change is the #rootInstance thing. I am not sure what you meant. IMO, you could use any component name.

{
  "rootInstances": [
    {{#VM}}
    {
      "path":   "{{path}}",
      "status": "{{status}}",
      "ip":     "{{ip}}"
    },
    {{/VM}}
  ]
}

And that would generate the output for all the VM instances (the instances whose component is VM). We should not distinguish root instances from others. We should really rely on the component name. Unlike instances, components do not evolve at runtime. The graph is fixed. So, it is a robust basis for templating. And it allows us to support any kind of instance.

So, the real question is about how to populate the Mustache context. And how to deal with conditional statements, since Mustache does not support them. We could take a convention in Roboconf and context creation to only insert started and/or deployed instances.
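
Here is a rough Python sketch of that convention: the context is built by grouping instances under their component name and keeping only the started ones, so the template needs no conditional. The `render` function is a toy stand-in that handles only `{{#Name}}...{{/Name}}` sections and `{{var}}` tags; it is not a real Mustache implementation. The instance dictionaries and the `DEPLOYED_STOPPED` status value are assumptions.

```python
import re


def build_context(instances):
    """Group instances by component name, keeping only the started ones."""
    context = {}
    for inst in instances:
        if inst["status"] == "DEPLOYED_STARTED":
            context.setdefault(inst["component"], []).append(inst)
    return context


SECTION = re.compile(r"\{\{#(\w+)\}\}(.*?)\{\{/\1\}\}", re.DOTALL)
VARIABLE = re.compile(r"\{\{(\w+)\}\}")


def render(template, context):
    """Toy renderer: repeat each section body once per item in context[name]."""
    def render_section(match):
        name, body = match.group(1), match.group(2)
        return "".join(
            VARIABLE.sub(lambda m: str(item.get(m.group(1), "")), body)
            for item in context.get(name, [])
        )
    return SECTION.sub(render_section, template)


# Hypothetical instances; the stopped VM must not appear in the output.
instances = [
    {"component": "VM", "path": "/Tomcat VM 1", "status": "DEPLOYED_STARTED", "ip": "192.168.1.18"},
    {"component": "VM", "path": "/Tomcat VM 2", "status": "DEPLOYED_STOPPED", "ip": ""},
    {"component": "Tomcat", "path": "/Tomcat VM 1/Tomcat", "status": "DEPLOYED_STARTED", "ip": "192.168.1.18"},
]
template = '{{#VM}}{"path": "{{path}}", "ip": "{{ip}}"}\n{{/VM}}'
output = render(template, build_context(instances))
```

With this approach the filtering policy lives in the context creation, not in the template, which keeps templates simple for system administrators.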

I suggest you try an example with...

How does this constrain the Mustache template?

vincent-zurczak commented 9 years ago

My comments about the last version of the spec.

instanceOf

We used to have such a keyword in Roboconf. We eventually replaced it with instance of. I guess we cannot have a keyword split into two words in Mustache (or one of its extensions). I suggest we use all instead.

{#all VM}

{{parent.data.ip}}

I really think we should hide as much complexity as we can from users. Remember, monitoring templates are meant to be written by system administrators, not developers. I think the parent navigation is useless (but I may be wrong). It makes a strong assumption about the instance hierarchy. In the same way, I do not know whether accessing an instance's data is necessary.

In fact, navigating in the graph is not the most important thing. What really matters is searching by type names (component or facet names). We can consider that the ip property (and all the other network information) is inherited by all the instances from their root one.

Configuration of all instances of a specific component.

Good.

Nesting instance selections

Interesting.

To sum it up... To access the application's data, I do not see the interest of having rootComponents and rootInstances. Components are only used to find instances. Besides, if we wanted to iterate over instances of a given application, we could use what you described in Nesting instance selections. And if we wanted to be complete, we should also have a way to find facets from the application. So, the application context should only keep the following (simple) attributes: name, namespace and qualifier.

About instances, I would remove data, as this is really an internal mechanism. System administrators are not supposed to know about it. This is a developer trick, not a real Roboconf concept.

And ip is a property we will propagate to all the instances, so that it is easy to access it. No need to access the root instance (in the template!) to find it.
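
A small Python sketch of this propagation, under the assumption that instances are keyed by their path and that the first path segment identifies the root instance. The `root_path` and `propagate_ip` helpers are made up for this example.

```python
def root_path(path):
    """Return the root instance path, e.g. '/Tomcat VM 1/Tomcat' -> '/Tomcat VM 1'."""
    return "/" + path.strip("/").split("/")[0]


def propagate_ip(instances):
    """Return a copy where every instance carries the ip of its root instance."""
    result = {}
    for path, props in instances.items():
        root = instances[root_path(path)]
        result[path] = {**props, "ip": root["ip"]}
    return result


# Hypothetical instances: only the root one knows its ip.
instances = {
    "/Tomcat VM 1": {"ip": "192.168.1.18"},
    "/Tomcat VM 1/Tomcat": {"portAJP": 8009},
}
flattened = propagate_ip(instances)
```

A template can then read `ip` on the Tomcat instance directly, without any parent navigation.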

Finally, you did not mention facets in your document. Iteration should work with both components and facets. Thus...

{#all my-component}

... is as valid as...

{#all my-facet}
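
One way to picture the lookup behind all, sketched in Python: a single selection function that matches a type name against either the component name or the facets of each instance. The `select_all` helper, the dictionary layout and the facet names are assumptions made for illustration only.

```python
def select_all(type_name, instances):
    """Return the instances whose component name or facets match type_name."""
    return [
        inst for inst in instances
        if inst["component"] == type_name or type_name in inst.get("facets", ())
    ]


# Hypothetical instances; "host" and "web-server" are made-up facet names.
instances = [
    {"path": "/Tomcat VM 1", "component": "VM", "facets": ["host"]},
    {"path": "/Tomcat VM 1/Tomcat", "component": "Tomcat", "facets": ["web-server"]},
    {"path": "/Apache VM/Apache", "component": "Apache", "facets": ["web-server"]},
]
by_component = [inst["path"] for inst in select_all("Tomcat", instances)]
by_facet = [inst["path"] for inst in select_all("web-server", instances)]
```

Here `by_component` holds only the Tomcat instance, while `by_facet` holds both web servers, whatever their component.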

Apart from these minor remarks, I think you can start coding it. :+1:

bourretp commented 9 years ago

Monitoring support has been in place since 310c225. Because of technical issues, the aforementioned specification differs considerably from the monitoring implementation. Besides that, a few fixes/improvements need to be made (#303, #304, #305 & #306) before the final language is ready.

Thus, the writing of the final specification is postponed until those issues are fixed. No idea if it is feasible before the v0.4 feature freeze...

vincent-zurczak commented 9 years ago

I archived the wiki page about this. Up-to-date documentation will now be on the web site.