ManageIQ / manageiq-providers-hawkular

ManageIQ plugin for the Hawkular provider
8 stars 33 forks source link

Fix problem getting live metrics when trigger is created with Datasource and Mesagging types (bugzilla 1505343) #88

Closed aljesusg closed 6 years ago

aljesusg commented 6 years ago

When alerts type Datasource and Mesagging are create in hawkular conditions and dataID is nil, the problem is getting the livemetrics of these kinds of alerts.

Fix Alerts of https://bugzilla.redhat.com/show_bug.cgi?id=1505343

aljesusg commented 6 years ago

@miq-bot add_label wip, bug

gbaufake commented 6 years ago

@aljesusg Does https://github.com/ManageIQ/manageiq-providers-hawkular/pull/9 impact on this PR?

aljesusg commented 6 years ago

I don't think @gbaufake this changes are in alert manager

gbaufake commented 6 years ago

@aljesusg Just a sanity check because you've said the #9 might correct the bugzilla 1505343. Considering booth are on open state, I wanted to make sure one PR would not impact on the other.

Since there is no cross-reference between them, I am in favor of merging it after someone reviews it . After that, I can proceed on bugzilla verification

aljesusg commented 6 years ago

No @gbaufake it was a mistake I was wrong about that there are some changes in the provider like #9 but I didn't know what was the problem, I am new with hawkular and the investigation takes me so time. This PR don't impact with #9 and the error was in my side with live metrics.

gbaufake commented 6 years ago

@aljesusg Thanks for the clarifications!

abonas commented 6 years ago

@aljesusg please have a more informative commit message and description. BZ number is good and needed but some more explanations are needed as well. thanks When wanting to review the PR I am missing information what is the source of the problem in the code, how it is fixed, was this a regression or not, etc.

aljesusg commented 6 years ago

Did it @abonas !!! Thanks and sorry for the short definition.

abonas commented 6 years ago

@lucasponce only if you have time, I know you are well familiar with this area

aljesusg commented 6 years ago

@miq-bot remove_label wip

aljesusg commented 6 years ago

@lucasponce I remove the native metrics name with another tests, now I think is fine. @abonas I am still working in the alerts, I have a call today with Edgar to analyze the format of the json, anyway we'll need this PR because live metrics for DataSource and Mesagging are different, so I removed the wip label. Thanks

aljesusg commented 6 years ago

Can you review this or merge if it's ok?

lucasponce commented 6 years ago

It looks good. Just to confirm, now the end user will be able to select a MiddlewareServer or a MiddlewareDatasource when defining an alert from MiQ, isn't it ? Also, one thing to check is to validate that the evaluation process also is aligned, today logic is defined at entity, now if alerts are also defined to datasources as they are different entities I guess that logic will need to be extended as well.

aljesusg commented 6 years ago

It isn't Lucas, the alert profile is defined to assign to a MiddlewareServer no to a MiddlewareDataSource. Do MiddlewareDatasource and MiddlewareMesagging depends of MiddlewareServer?

lucasponce commented 6 years ago

In theory, the MiddlewareServer can be seen as the parent resource of MiddlewareDatasource or MiddlewareMessaging objects. It is ok, it can be defined on MiddlewareServer, but then the process when an event comes and MiQ evaluates that event triggers a MiQ Alert should be revisit, I don't remember if some logic needs to be adapted to support alerts defined in other entities (I may be wrong, and perhaps there is no need to add anything but good to review in the context of this task).

aljesusg commented 6 years ago

Ok, maybe we should analyze this because could be another requirement for UI. @abonas What do you think?

abonas commented 6 years ago

Ok, maybe we should analyze this because could be another requirement for UI. @abonas What do you think?

sure, but this is @jdoyleoss decision

lucasponce commented 6 years ago

I think there is some gap here. The purpose of this feature is to alert on datasources, so, once of the alert is defined, in the alert profile it should be defined the server and the datasource where the alert is applied (unless alerts are created on all available datasources, but I'm not sure if the original request is that). So, for that, some way to define how/where select the datasource is necessary.

abonas commented 6 years ago

@lucasponce I agree. we need clarifications here. just to recap, the definition of the enhancement was (this is only the first part of the definition, the rest is in JIRA) Enable the definition of Alerts in CloudForms based upon the metrics collected from EAP servers. Currently alerts can only be defined on the JVM metrics (heap, non-heap and garbage collection). Control->Explorer->Alerts->Configuration->Add New Alert Based on = Middleware Server What to evaluate -> on JVM stuff appears currently Metrics to add, grouped by priority (Datasource, Web Sessions, Transactions, Messaging) DataSource - Connections Available DataSource - Connections In Use DataSource - Connections Timed Out DataSource - Connection Get Time DataSource - Connection Creation Time DataSource - Connection Wait Time

@jdoyleoss

aljesusg commented 6 years ago

I think that we can merge this If looks good for you. Now Datasource alerts apply to all datasources that they are in the Middleware Server where you assigned the alert profile, maybe we can do a feature to select which Datasources you wanna have this alert.

abonas commented 6 years ago

if from @lucasponce perspective this is good to be merge, I'll merge it. I'm not familiar well with this area myself.

israel-hdez commented 6 years ago

I'm reading some differences in the Jira tasks. JMAN4-201 just lists the metrics. HAWKULAR-1255 clearly state the base entities.

Apart from that, @lucasponce concenrs are true. Remember that "alerts" are templates for Hawkular. So, even if something is correctly created in Hawkular, that's an unbounded trigger. The bounding is done in the profile and, if I'm not wrong, the code (as it is right now) will bound the datasource and messaging alerts to the wrong metric ids. So, those alerts won't work.

I think it's OK to merge, but there is still some work to have datasources and messagings alerts working correctly.

lucasponce commented 6 years ago

As @israel-hdez comments, an MiQ Alert creates a Hawkular Group Trigger, which works like a template. When an MiQ Alert Profile is created, the members of the Group Trigger should be created. So, if the Alert Profile is created against a Middleware Server, then some logic should iterate and fetch all datasources, and its metrics, and it should create a Member Trigger per datasource under the hood. Hence why I asked that perhaps is better to let select datasource to the user. If UI won't change, then from MW Server should fetch and get the metric per all datasources available in the system. Also, when the alert profile changes or is deleted, these members should change or be removed too. Also, in the MiQ side, the logic when this event is evaluated and shown in the timeline should be revisit, as those events will be shown at MW Server (not at MW Datasource). (I think current logic could be ok, but worth to take a look, as here there is a scenario with some differences and might bring some side effect).

miq-bot commented 6 years ago

Checked commit https://github.com/aljesusg/manageiq-providers-hawkular/commit/f58fb2c017b4915bbb9ec2c550cca1724572e206 with ruby 2.3.3, rubocop 0.47.1, and haml-lint 0.20.0 2 files checked, 0 offenses detected Everything looks fine. :cookie:

abonas commented 6 years ago

@miq-bot add_label gaprindashvili/yes

simaishi commented 6 years ago

Gaprindashvili backport details:

$ git log -1
commit 9f47e8a8fff9dfe51b6f799a1f630ab21edf63f7
Author: abonas <mikeyteva@gmail.com>
Date:   Mon Nov 6 12:45:26 2017 +0200

    Merge pull request #88 from aljesusg/issue_alerts

    Fix problem getting live metrics when trigger is created with Datasource and Mesagging types (bugzilla 1505343)
    (cherry picked from commit 9d45bf3aadb09cb324df63aba05aaaf15e75846e)