devonfw / devon4j

devonfw Java stack - create enterprise-grade business apps in Java safe and fast
Apache License 2.0

Consider improvements for monitoring #44

Open hohwille opened 5 years ago

hohwille commented 5 years ago

For monitoring a devonfw application, there should be more guidance and features.

hohwille commented 5 years ago

See also here: https://github.com/devonfw/devon4j/wiki/guide-apm (considering JavaMelody)

sjimenez77 commented 5 years ago

Some time ago we already created a demo and a cookbook entry in the devonfw guide for the integration of Spring Boot Admin: https://github.com/devonfw/devon/wiki/Spring-boot-admin-Integration-with-devon4j. The document is probably deprecated, but it could be a starting point.

My point here is that we should save the still-valid cookbook entries to the wikis of the different stacks before removing the devonfw guide as it is today.

nricheton commented 5 years ago

Hello,

Following a discussion w/ Jörg & Santos, here is my input on monitoring.

Overview:

Some examples of questions that should have an immediate answer from a monitoring solution. (Immediate = display a web page, now)

All these answers are priceless in production, but also in development/testing environments, where they are a clear indicator of the issues to expect in the next stage.

In several projects, we have made huge improvements in quality and efficiency by having and looking at these metrics every day. Even non technical people can point out the code that is causing issues and the impacted features.

Several tools exist to set up this kind of monitoring.

I really think that devon should provide tooling out of the box and ready-to-use accelerators that add analysis value for common problems.

One effort to provide this kind of monitoring has been appstatus: https://github.com/appstatus/appstatus http://appstatus.sourceforge.net

It is used by many projects in different IT companies.

Other alternatives: https://github.com/javamelody/javamelody https://www.appdynamics.fr/java/ https://www.zabbix.com/features

Again, low-level metrics have little value. We need interpreted metrics at the business level (operations, rules, data retrieval, user-perceived response time, ...), available from the developer environment to the production environment. (And this probably should NOT be an option when creating a new devon application :-) )
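As a rough illustration of what "interpreted" business-level metrics could look like, here is a minimal plain-Java sketch that counts calls, failures, and time per named business operation. The class `OperationMetrics` and operation names such as `order.create` are hypothetical and not part of devon4j or appstatus.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical helper: records calls, errors and duration per business operation. */
final class OperationMetrics {

  /** Counters for one named operation. */
  static final class Stats {
    final LongAdder calls = new LongAdder();
    final LongAdder errors = new LongAdder();
    final LongAdder totalNanos = new LongAdder();

    /** Average duration in milliseconds, or 0 if the operation was never called. */
    long averageMillis() {
      long n = calls.sum();
      return n == 0 ? 0 : (totalNanos.sum() / n) / 1_000_000;
    }
  }

  private final Map<String, Stats> statsByOperation = new ConcurrentHashMap<>();

  /** Runs the operation, times it, counts failures, and returns its result. */
  <T> T record(String operation, Supplier<T> body) {
    Stats stats = statsByOperation.computeIfAbsent(operation, k -> new Stats());
    long start = System.nanoTime();
    try {
      return body.get();
    } catch (RuntimeException e) {
      stats.errors.increment();
      throw e;
    } finally {
      // Executed for successful and failed calls alike.
      stats.calls.increment();
      stats.totalNanos.add(System.nanoTime() - start);
    }
  }

  /** Returns the counters for an operation (empty counters if it is unknown). */
  Stats statsFor(String operation) {
    return statsByOperation.getOrDefault(operation, new Stats());
  }
}
```

A use case would wrap its entry point, e.g. `metrics.record("order.create", () -> createOrder(request))`, so that a status page or exporter can report which business feature is slow or failing.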

Feedback is welcome!

Nicolas

hohwille commented 5 years ago

@nricheton thanks for your wonderful input. I added AppDynamics and Zabbix to our guide: https://github.com/devonfw-wiki/devon4j/wiki/guide-apm

Also we will have a look at appstatus.

However, we have to be careful with what we integrate by default. In one of my customer projects we used to integrate JavaMelody into all apps; then some CVE vulnerabilities were reported for it and we were forced to remove it. Maybe those issues have been resolved in the meantime. Still, we should investigate your requirements and decide what we want to integrate as first choice and bring out of the box, and what should be just an option for projects that need more.

Being able to report the release version is of course very simple and does not come with any risk. Health status (e.g. with Spring Boot Actuator) should also come OOTB. For monitoring OS-level stuff there are already tons of solutions out there, and IMHO they should not be built into the app itself (we do not need a Java solution to observe CPU, memory, or disk). There is also SNMP as an established protocol. In this sense we should IMHO also think of complex IT landscapes and microservices: an app does not really need to ship a UI for monitoring. Assume you have multiple redundant nodes of an app in a cluster with load balancing. What use is a UI in the browser showing the CPU usage of the app if I get assigned to some node randomly by a load balancer and have no direct access to the node itself? Instead, we need to provide services that offer the monitoring data, and look for state-of-the-art monitoring systems that integrate with all apps and all their nodes across the entire IT landscape, present a complete dashboard, and trigger alarms if something goes wrong.
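To make the "service instead of UI" idea concrete, here is a minimal sketch using only the JDK's built-in `com.sun.net.httpserver`. Each node reports its own identity, release version, and health as data that a central monitoring system could poll. The class `NodeStatusService`, the `/status` path, and the JSON field names are assumptions for illustration; in a Spring Boot app, Actuator would normally take this role.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical per-node status service: data only, no UI. */
final class NodeStatusService {

  /**
   * Renders the status of this node as a small JSON document.
   * Values are assumed to contain no characters that need JSON escaping.
   */
  static String render(String nodeId, String releaseVersion, boolean healthy) {
    return "{\"node\":\"" + nodeId + "\","
        + "\"version\":\"" + releaseVersion + "\","
        + "\"status\":\"" + (healthy ? "UP" : "DOWN") + "\"}";
  }

  /** Serves the status document on the given port under the assumed path /status. */
  static HttpServer start(int port, String nodeId, String releaseVersion) throws IOException {
    HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
    server.createContext("/status", exchange -> {
      // This sketch always reports healthy; a real service would run actual checks here.
      byte[] body = render(nodeId, releaseVersion, true).getBytes(StandardCharsets.UTF_8);
      exchange.getResponseHeaders().set("Content-Type", "application/json");
      exchange.sendResponseHeaders(200, body.length);
      try (OutputStream out = exchange.getResponseBody()) {
        out.write(body);
      }
    });
    server.start();
    return server;
  }
}
```

Because every node answers for itself, the monitoring system can address nodes directly instead of going through the load balancer.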

Another aspect is OWASP Sensitive Data Exposure. Detailed monitoring data should therefore not be available to the outside world (end users, internet) but stay secret within the admin plane. Along the same lines, we should also define strict standards, e.g. a URL path scheme for monitoring services, to simplify setups and avoid complex individual configurations.
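A fixed path scheme makes such a restriction easy to express in one place. The following sketch assumes a hypothetical prefix `/admin/monitoring/` (no such standard is defined in this thread) and allows monitoring paths only for loopback or private-network clients; the class name `MonitoringAccessRule` is likewise made up.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical access rule for a standardized monitoring path prefix. */
final class MonitoringAccessRule {

  /** Assumed convention: all monitoring services live below this prefix. */
  static final String MONITORING_PREFIX = "/admin/monitoring/";

  static boolean isMonitoringPath(String path) {
    return path.startsWith(MONITORING_PREFIX);
  }

  /**
   * Decides whether a request may proceed. The remote address must be an
   * IP literal (e.g. "10.0.0.5"); anything unparsable is rejected.
   */
  static boolean isAllowed(String path, String remoteIp) {
    if (!isMonitoringPath(path)) {
      return true; // not a monitoring path: this rule does not apply
    }
    try {
      InetAddress remote = InetAddress.getByName(remoteIp);
      // Only loopback and private (site-local) ranges count as the admin plane here.
      return remote.isLoopbackAddress() || remote.isSiteLocalAddress();
    } catch (UnknownHostException e) {
      return false;
    }
  }
}
```

In practice the same rule is often enforced at the reverse proxy or firewall instead, which also benefits from one shared prefix across all apps.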

nricheton commented 5 years ago

Hi @hohwille

Thanks for your feedback!

On CVE risk, I would say that all devon components (and all projects in general) have CVEs in their history. Apart from projects that leave important CVEs unfixed for a long time, we should not consider a CVE report a reason for not integrating valuable components.

On OS-level monitoring, I fully agree with you that dedicated, existing solutions should be used. However, a first level of checks can be integrated into applications; here are some reasons:

On data availability: I agree that the data should not be available to the public or to internet users. It should be reserved for the people responsible for operations, as with any monitoring tool.

Web pages inside a module are mostly for the early stages of feature development; after that, the data should be aggregated into a common monitoring interface (any solution).

I would be happy to show you next week how appstatus handles these ideas and how it allows exporting the data for proper aggregation. And to discuss real-world examples!

Nicolas

hohwille commented 5 years ago

I fully support making progress in this area. I also assume we will spend a slot in the next DA meeting discussing this. However, as we have broadened the scope of this issue and some aspects are not yet completely clear, I removed the milestone. Otherwise we would block the release planned for next month. If people come up with PRs to solve this issue, I am more than happy to replan it for 3.1.0, but at the moment I cannot see how I could solve it by then...

hohwille commented 5 years ago

@nricheton thanks for your feedback. I do agree that additional features like memory or disk-space checks are great to have if they come without big effort or complex dependencies. My only concern was that we should not waste our time scanning all mounted devices, observing their disk space, sending alerts, etc. inside Java if there are already tons of OS-level tools doing all this. To be more pragmatic, I would like to start with spring-boot-actuator and maybe also spring-boot-admin. Then we collect the list of features we get with them, see what gaps remain, choose additional tools, and move on until we have covered what we think is crucial.
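For reference, the low-effort end of memory and disk-space checks needs nothing beyond the JDK. The sketch below is only an assumption of what such a first-level check might look like (the class `BasicResourceCheck` and its thresholds are hypothetical); it deliberately does not scan mounted devices or send alerts.

```java
import java.io.File;

/** Hypothetical first-level resource checks using only JDK calls. */
final class BasicResourceCheck {

  /** Fraction (0.0 to 1.0) of the maximum heap that is currently in use. */
  static double heapUsage() {
    Runtime rt = Runtime.getRuntime();
    long used = rt.totalMemory() - rt.freeMemory();
    return (double) used / rt.maxMemory();
  }

  /** Bytes still usable by this JVM on the partition that holds the given directory. */
  static long usableDiskBytes(File dir) {
    return dir.getUsableSpace();
  }

  /** Turns a usage fraction into a coarse status using a caller-supplied threshold. */
  static String evaluate(double usedFraction, double warnAt) {
    return usedFraction >= warnAt ? "WARN" : "OK";
  }
}
```

An app could expose `evaluate(heapUsage(), 0.9)` next to its health status and leave everything else to OS-level tooling.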

hohwille commented 5 years ago

Do we have a key person who could drive the development of this issue? IMHO this is not just a 1-2 hour task but will need some attention and continuity. I do not have the time at the moment, but I would love to see some action so that we are not just talking. I am still happy to assist and support this with some code snippets or reviews...

hohwille commented 3 years ago

So JavaMelody even has a spring-boot-starter, so you may only need to add a dependency and you are done. glowroot can also be added in a similarly easy way. Then there are solutions like Spring Boot Actuator that provide app-specific sensors to be integrated with existing monitoring tools such as CheckMk/Icinga/Nagios/etc.

So is there anybody left who initially raised demands on this topic, maybe @nricheton? What is left to do, and what is the way to go?

As a lesson learned, we should move away from such generic issues: either the issue should be crystal clear about what is to be done, or we need a real driver who actively works on it.