Closed abarban closed 2 years ago
Sounds good, could you send link(s) to any good existing jboss/wildfly monitoring tool that you know of?
This is a solution in Python with Diamond: https://github.com/python-diamond/Diamond/blob/master/src/collectors/jbossapi/jbossapi.py
Official documentation: https://docs.jboss.org/author/display/AS71/The+HTTP+management+API
Hi,
I have written an input plugin for JBoss with just basic support: https://github.com/stefa975/telegraf/tree/master/plugins/inputs/jboss
Feel free to use whatever you like from it.
Hi @stefa975 and @danielnelson, do you plan to add this plugin soon? I'd be happy if we could use it. Thanks a lot.
Hi, I can make a pull request so the team can have a look at it.
Hi @stefa975, I'm testing it and I see 2 things to review:
Values are strings, not floats as expected. Tested on 6.4 EAP and 7.0.2 CE:
2017-11-06T10:12:40Z E! Error decoding Float value: errorCount = 0
2017-11-06T10:12:40Z E! Error decoding Float value: requestCount = 0
2017-11-06T10:12:40Z E! Error decoding Float value: maxTime = 0
2017-11-06T10:12:40Z E! Error decoding Float value: processingTime = 0
2017-11-06T10:12:40Z E! Error decoding Float value: bytesReceived = 0
2017-11-06T10:12:40Z E! Error decoding Float value: bytesSent = 0
This could easily be fixed with this bit of code in the getWebStatistics method:
    for key, value := range server.Result {
        switch key {
        case "bytesReceived", "bytesSent", "requestCount", "errorCount", "maxTime", "processingTime":
            // Bind the concrete value once; a nil value matches no case and is skipped.
            switch v := value.(type) {
            case int:
                // Convert instead of asserting: value.(float64) would panic on an int.
                fields[key] = float64(v)
            case float64:
                fields[key] = v
            case string:
                f, err := strconv.ParseFloat(v, 64)
                if err != nil {
                    log.Printf("E! Error decoding Float from string : %s = %s\n", key, v)
                } else {
                    fields[key] = f
                }
            }
        }
    }
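The same conversion can be pulled out into a small helper. This is only a sketch: the name `toFloat64` and the sample payload are made up for illustration and are not taken from the plugin. It relies on the fact that `encoding/json` decodes JSON numbers into `float64` when the target is `interface{}`, so the string case is the one that matters for servers that return counters as strings.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// toFloat64 converts a value decoded from JSON into a float64.
// The second return value reports whether the conversion succeeded.
// (Hypothetical helper, not part of the plugin.)
func toFloat64(value interface{}) (float64, bool) {
	switch v := value.(type) {
	case float64: // what encoding/json produces for JSON numbers
		return v, true
	case int:
		return float64(v), true
	case string: // counters returned as strings
		f, err := strconv.ParseFloat(v, 64)
		if err != nil {
			return 0, false
		}
		return f, true
	default: // nil, bool, maps, slices
		return 0, false
	}
}

func main() {
	// Made-up payload: one counter as a number, one as a string, one null.
	raw := []byte(`{"requestCount": 12, "bytesSent": "2048", "maxTime": null}`)
	var result map[string]interface{}
	if err := json.Unmarshal(raw, &result); err != nil {
		panic(err)
	}
	for _, key := range []string{"requestCount", "bytesSent", "maxTime"} {
		f, ok := toFloat64(result[key])
		fmt.Println(key, f, ok)
	}
}
```

Returning a boolean instead of logging inside the helper leaves it to the caller to decide whether a missing or malformed counter deserves an error line.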
I'm currently trying to adapt the code to also get data from a standalone JBoss server. I will report back when I have a fix for this issue.
Hi @stefa975, I've finally added a new config parameter to select the JBoss execution mode (exec_as_domain = true/false). I will create a fork with the code if you want to build the PR.
I have another question about the created measurements.
I can't see why we need the "jboss_domain" measurement, as it doesn't give us any important data. You can easily get the servers on any host by querying the tags (SHOW TAG VALUES FROM data_db."default".jboss_jvm WITH KEY = "server" WHERE host =~ /$myhost$/), can't you?
I will remove this bit of code from my version if you agree.
About naming conventions and types:
As you can see, the jboss_database measurement is currently defined as "string" but should be integer instead. Field names should follow the same style as the other fields (not camel case).
There is a "type" tag on each measurement, but it is redundant with the measurement name itself. IMHO this tag should be removed.
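To illustrate the naming point, here is a minimal sketch of a converter that maps the camel-case and hyphenated names returned by the management API to snake_case. The function name `snakeCase` is hypothetical, and the sketch does not try to handle acronyms or other runs of capitals.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// snakeCase converts camelCase or hyphenated names into snake_case.
// Runs of capitals (acronyms) are not treated specially.
// (Hypothetical helper, not part of the plugin.)
func snakeCase(name string) string {
	var b strings.Builder
	for i, r := range name {
		switch {
		case r == '-':
			b.WriteRune('_')
		case unicode.IsUpper(r):
			if i > 0 {
				b.WriteRune('_')
			}
			b.WriteRune(unicode.ToLower(r))
		default:
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	for _, name := range []string{"bytesReceived", "maxTime", "active-sessions"} {
		fmt.Println(name, "->", snakeCase(name))
	}
}
```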
Hi @toni-moreno, great review, I'll have a look at it.
Hi @stefa975 I've created a new jboss plugin version from your great work.
https://github.com/toni-moreno/telegraf/tree/new_input_jboss_plugin/plugins/inputs/jboss
I've fixed the things mentioned above, and I've split the jboss_web measurement into jboss_web_con for input connector statistics and jboss_web_app for deployed app statistics.
I've also added a new parameter, metrics, to select which kinds of metrics to collect.
## Metric selection
metrics =[
"jvm",
"web_con",
"deployment",
"database",
"jms",
]
I suggest you review the code and test it yourself. I will test it on some production servers over the next few days, and I may make some minor changes and also improve the measurement documentation.
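As a rough sketch of how such a metrics list could be consulted before each collection step (the function name `metricEnabled` and the rule that an empty list means "collect everything" are assumptions for illustration, not taken from the plugin):

```go
package main

import "fmt"

// metricEnabled reports whether name appears in the configured list.
// Assumption: an empty list means every metric is collected.
func metricEnabled(selected []string, name string) bool {
	if len(selected) == 0 {
		return true
	}
	for _, m := range selected {
		if m == name {
			return true
		}
	}
	return false
}

func main() {
	metrics := []string{"jvm", "web_con", "deployment", "database", "jms"}
	for _, name := range []string{"jvm", "ejb"} {
		fmt.Printf("%s enabled: %v\n", name, metricEnabled(metrics, name))
	}
}
```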
Hi @stefa975, I've detected a bug in the getJVMStatistics method. It always assumes the same garbage collection algorithm (the default), with the PS_Scavenge and PS_MarkSweep GC counters, but the available counters change if you select a different collector.
You can see a good overview of GC algorithms here:
https://plumbr.io/handbook/garbage-collection-algorithms-implementations
I run my JBoss with -XX:+UseParNewGC -XX:+UseConcMarkSweepGC,
so my map has "ParNew" and "ConcurrentMarkSweep" GC counters.
When the plugin runs against this setup, it crashes.
2017-11-07T16:27:00Z I! JBoss Plugin Processing Servers from host:[ standalone ] : Server [ standalone ]
panic: interface conversion: interface {} is nil, not map[string]interface {}
goroutine 44 [running]:
github.com/influxdata/telegraf/plugins/inputs/jboss.(*JBoss).getJVMStatistics(0xc4200d80c0, 0x1c9cda0, 0xc420385780, 0xc42012e6c1, 0x2d, 0x1494025, 0xa, 0x1494025, 0xa, 0xc4200180d0, ...)
/home/developer/src/gospace/src/github.com/influxdata/telegraf/plugins/inputs/jboss/jboss.go:823 +0xfc8
github.com/influxdata/telegraf/plugins/inputs/jboss.(*JBoss).getServersOnHost.func1(0xc420143c60, 0xc4200d80c0, 0xc4201e05a0, 0xc42012e6c1, 0x2d, 0x1c9cda0, 0xc420385780, 0x1494025, 0xa)
/home/developer/src/gospace/src/github.com/influxdata/telegraf/plugins/inputs/jboss/jboss.go:489 +0x5b0
created by github.com/influxdata/telegraf/plugins/inputs/jboss.(*JBoss).getServersOnHost
/home/developer/src/gospace/src/github.com/influxdata/telegraf/plugins/inputs/jboss/jboss.go:505 +0x156
I will try to release a fix ASAP...
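One way to avoid this class of panic is to iterate over whatever collector names the server reports and to use the comma-ok form of the type assertion. The sketch below assumes a map keyed by collector name whose values carry "collection-count" and "collection-time" counters; that shape, the sample payload, and the function name `gcFields` are assumptions for illustration and may not match the plugin's actual response handling.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// gcFields flattens per-collector counters into a field map.
// It walks every collector present instead of assuming that
// PS_Scavenge and PS_MarkSweep exist. (Hypothetical helper.)
func gcFields(collectors map[string]interface{}) map[string]interface{} {
	fields := make(map[string]interface{})
	for name, raw := range collectors {
		// Comma-ok assertion: a nil or unexpected value is skipped, not a panic.
		stats, ok := raw.(map[string]interface{})
		if !ok {
			continue
		}
		for _, counter := range []string{"collection-count", "collection-time"} {
			if v, ok := stats[counter].(float64); ok {
				fields[name+"_"+counter] = v
			}
		}
	}
	return fields
}

func main() {
	// Made-up payload for a JVM started with ParNew + CMS.
	raw := []byte(`{
		"ParNew": {"collection-count": 10, "collection-time": 120},
		"ConcurrentMarkSweep": {"collection-count": 2, "collection-time": 300},
		"Broken": null
	}`)
	var collectors map[string]interface{}
	if err := json.Unmarshal(raw, &collectors); err != nil {
		panic(err)
	}
	fields := gcFields(collectors)
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random; sort for stable output
	for _, k := range keys {
		fmt.Println(k, fields[k])
	}
}
```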
I've detected another bug, this time in the tag names.
We named the domain host tag "host", but Telegraf already uses "host" as the tag for os.Hostname(), so we are overwriting a basic Telegraf tag.
host should be jboss_host
server should be jboss_server
I've already pushed the two previous fixes! I hope you like how it looks now.
Hi @stefa975, I've been working with our new JBoss plugin. We may need to add more fields to these measurements in the future, but this could be a good starting point. Do you plan to open a PR, or would you prefer that I do it myself?
Hi, I think you have done a great refactoring and added features I didn't have time for. So you can open a PR and we can add more features later on. I have requests to add a transaction counter on my side, but I can add that later.
Hi @stefa975, I've found something strange when trying to parse deployment data (I can't see any data from my production servers), so I've been reviewing the code.
You have defined the deployment data structs as:
type DeploymentResponse struct {
Outcome string `json:"outcome"`
Result DeploymentMetrics `json:"result"`
}
type DeploymentMetrics struct {
Name string `json:"name"`
RuntimeName string `json:"runtime-name"`
Status string `json:"status"`
Subdeployment map[string]interface{} `json:"subdeployment"`
}
type WebMetrics struct {
ActiveSessions string `json:"active-sessions"`
ContextRoot string `json:"context-root"`
ExpiredSessions string `json:"expired-sessions"`
MaxActiveSessions string `json:"max-active-sessions"`
SessionsCreated string `json:"sessions-created"`
Servlet map[string]interface{} `json:"servlet"`
}
But when getting data from my JBoss 6.4 EAP, the API response for one deployment looks like this:
{
"outcome" : "success",
"result" : {
"content" : [{
"path" : "deployments/example.war",
"relative-to" : "jboss.server.base.dir",
"archive" : false
}],
"enabled" : true,
"name" : "example.war",
"owner" : [
{
"subsystem" : "deployment-scanner"
},
{
"scanner" : "default"
}
],
"persistent" : false,
"runtime-name" : "example.war",
"status" : "OK",
"subdeployment" : null,
"subsystem" : {"web" : {
"active-sessions" : 0,
"context-root" : "/example",
"duplicated-session-ids" : 0,
"expired-sessions" : 0,
"max-active-sessions" : 0,
"rejected-sessions" : 0,
"session-avg-alive-time" : 0,
"session-max-alive-time" : 0,
"sessions-created" : 0,
"virtual-host" : "default-host",
"servlet" : {
"migrationdata Servlet" : {
"load-time" : 0,
"maxTime" : 0,
"min-time" : 9223372036854775807,
"processingTime" : 0,
"requestCount" : 0,
"servlet-class" : "com.xxxx.kernel.servlet.PortletServlet",
"servlet-name" : "migrationdata Servlet"
},
"Dynamic Resource Servlet" : {
"load-time" : 0,
"maxTime" : 0,
"min-time" : 9223372036854775807,
"processingTime" : 0,
"requestCount" : 0,
"servlet-class" : "com.xxxx.servlet.PortalClassLoaderServlet",
"servlet-name" : "Dynamic Resource Servlet"
}
}
}}
}
}
As you can see, webapp data is under the subsystem section and not under subdeployment. I think you are looking for subsystem data under the subdeployment section, but it does not exist there; it exists only at the top level (on my JBoss version).
@stefa975, if you can confirm that you have this plugin running and getting deployment data, can you tell me on which JBoss version? Did you see this problem before?
Hi, yes, we have this plugin running, but we only have EAR deployments in our domain. I can confirm that this is a problem, though. I have seen it too when I tested the plugin with a WAR deployment.
Hi @stefa975, I've seen the same as well. I've deployed 2 apps (sample.war and HelloWorld.ear) with these results.
So I've coded a little fix to accept both deployment types. I will commit it to my branch in a few minutes.
{
"outcome" : "success",
"result" : {
"content" : [{"hash" : {
"BYTES_VALUE" : "L00a/u5Z7U2/rsIT5BsSijC8Usg="
}}],
"enabled" : true,
"name" : "HelloWorld.ear",
"persistent" : true,
"runtime-name" : "HelloWorld.ear",
"status" : "OK",
"subdeployment" : {
"web.war" : {"subsystem" : {"web" : {
"active-sessions" : 0,
"context-root" : "/HelloWorld",
"duplicated-session-ids" : 0,
"expired-sessions" : 0,
"max-active-sessions" : 0,
"rejected-sessions" : 0,
"session-avg-alive-time" : 0,
"session-max-alive-time" : 0,
"sessions-created" : 0,
"virtual-host" : "default-host",
"servlet" : {"HelloWorldServlet" : {
"load-time" : 0,
"maxTime" : 0,
"min-time" : 9223372036854775807,
"processingTime" : 0,
"requestCount" : 0,
"servlet-class" : "eu.glotzich.j2ee.common.HelloWorldServlet",
"servlet-name" : "HelloWorldServlet"
}}
}}},
"common.jar" : {"subsystem" : null},
"ejb.jar" : {"subsystem" : {"ejb3" : {
"entity-bean" : null,
"message-driven-bean" : null,
"singleton-bean" : null,
"stateful-session-bean" : null,
"stateless-session-bean" : {"MyEJB" : {
"component-class-name" : "MyEJB",
"declared-roles" : [],
"execution-time" : 0,
"invocations" : 0,
"methods" : {},
"peak-concurrent-invocations" : 0,
"pool-available-count" : 20,
"pool-create-count" : 0,
"pool-current-size" : 0,
"pool-max-size" : 20,
"pool-name" : "slsb-strict-max-pool",
"pool-remove-count" : 0,
"run-as-role" : null,
"security-domain" : "other",
"timers" : [],
"wait-time" : 0,
"service" : null
}}
}}}
},
"subsystem" : null
}
}
{
"outcome" : "success",
"result" : {
"content" : [{"hash" : {
"BYTES_VALUE" : "gPUFOxZsadgWl7ohETxnP4NyrKA="
}}],
"enabled" : true,
"name" : "sample.war",
"persistent" : true,
"runtime-name" : "sample.war",
"status" : "OK",
"subdeployment" : null,
"subsystem" : {"web" : {
"active-sessions" : 0,
"context-root" : "/sample",
"duplicated-session-ids" : 0,
"expired-sessions" : 0,
"max-active-sessions" : 0,
"rejected-sessions" : 0,
"session-avg-alive-time" : 0,
"session-max-alive-time" : 0,
"sessions-created" : 0,
"virtual-host" : "default-host",
"servlet" : {"HelloServlet" : {
"load-time" : 0,
"maxTime" : 0,
"min-time" : 9223372036854775807,
"processingTime" : 0,
"requestCount" : 0,
"servlet-class" : "mypackage.Hello",
"servlet-name" : "HelloServlet"
}}
}}
}
}
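The two response shapes above (web statistics directly under "subsystem" for a WAR, and under "subdeployment" -> module -> "subsystem" for an EAR) can be handled with one set of structs. The following is only a sketch of that idea, not the fix committed to the branch: the type and function names are hypothetical, only a few of the web fields are decoded, and it assumes the counters arrive as JSON numbers as in the responses shown.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// subsystems mirrors the "subsystem" object; only "web" is decoded here.
type subsystems struct {
	Web *webStats `json:"web"`
}

// webStats holds a subset of the web counters (assumed to be JSON numbers).
type webStats struct {
	ContextRoot     string  `json:"context-root"`
	ActiveSessions  float64 `json:"active-sessions"`
	SessionsCreated float64 `json:"sessions-created"`
}

type deployment struct {
	Name          string      `json:"name"`
	RuntimeName   string      `json:"runtime-name"`
	Subsystem     *subsystems `json:"subsystem"` // WAR: stats live here
	Subdeployment map[string]struct {
		Subsystem *subsystems `json:"subsystem"` // EAR: stats live per module
	} `json:"subdeployment"`
}

type deploymentResponse struct {
	Outcome string     `json:"outcome"`
	Result  deployment `json:"result"`
}

// parseDeployment decodes one management API deployment response.
func parseDeployment(raw []byte) (deployment, error) {
	var resp deploymentResponse
	err := json.Unmarshal(raw, &resp)
	return resp.Result, err
}

// webModules returns web statistics keyed by module name, looking at both
// the top-level subsystem and every subdeployment. JSON null decodes to a
// nil pointer or nil map, so both lookups are safe.
func webModules(d deployment) map[string]webStats {
	out := make(map[string]webStats)
	if d.Subsystem != nil && d.Subsystem.Web != nil {
		out[d.RuntimeName] = *d.Subsystem.Web
	}
	for name, sub := range d.Subdeployment {
		if sub.Subsystem != nil && sub.Subsystem.Web != nil {
			out[name] = *sub.Subsystem.Web
		}
	}
	return out
}

func main() {
	// Trimmed-down versions of the two responses shown above.
	war := []byte(`{"outcome":"success","result":{"name":"sample.war",
		"runtime-name":"sample.war","subdeployment":null,
		"subsystem":{"web":{"context-root":"/sample","active-sessions":0}}}}`)
	ear := []byte(`{"outcome":"success","result":{"name":"HelloWorld.ear",
		"runtime-name":"HelloWorld.ear","subsystem":null,
		"subdeployment":{"web.war":{"subsystem":{"web":{"context-root":"/HelloWorld"}}},
		"common.jar":{"subsystem":null}}}}`)
	for _, raw := range [][]byte{war, ear} {
		d, err := parseDeployment(raw)
		if err != nil {
			panic(err)
		}
		for module, stats := range webModules(d) {
			fmt.Println(d.Name, module, stats.ContextRoot)
		}
	}
}
```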
@stefa975, I've reviewed the deployment-related Influx data a little more.
I've renamed the tag "system" to "runtime_name", as it is named in the JSON itself ("system" could be anything).
I think it could also be interesting to add "session-avg-alive-time" and "session-max-alive-time" as fields in the webapp stats. Do you agree?
@toni-moreno , yes that's a better name for the tag. I agree, the webapp stats are interesting too.
@toni-moreno I'll test your fix for WAR and EAR as well.
@stefa975, I've just updated my fork! I will continue reviewing it. Thanks a lot for your great initial work and also for helping me.
@toni-moreno are you going to create a pull request for the main telegraf project?
Hi @toni-moreno, I cannot monitor JBoss EAP 7.1 with the jboss input plugin: https://github.com/toni-moreno/telegraf/tree/new_input_jboss_plugin/plugins/inputs/jboss
I have monitored JBoss EAP 6.4 and EAP 7.0 successfully, but it fails on EAP 7.1. I get this message when I run Telegraf in test mode (I built Telegraf from your git repository in December 2017):
2018-08-30T13:10:46Z I! JBoss Plugin Working as Domain: true
2018-08-30T13:10:46Z E! JBoss Error handling response 1: Response from url "" has status code 401 (Unauthorized), expected 200 (OK)
2018-08-30T13:10:46Z E! JBoss server:http://192.168.56.102:9990/management bodyContent map[operation:read-children-names child-type:host address:[] json.pretty:%!s(int=1)]
2018-08-30T13:10:46Z E! Response from url "" has status code 401 (Unauthorized), expected 200 (OK)
Please suggest how to fix it.
Thank you
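For anyone trying to reproduce the failing call outside Telegraf, the request body shown in the log above (operation read-children-names, child-type host, empty address) can be rebuilt as JSON. This is a sketch only: the struct and function names are hypothetical, and it covers just the payload, not the digest authentication that the 401 points to.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// managementRequest models the JSON body posted to the /management endpoint,
// using the keys visible in the log line above. (Hypothetical type.)
type managementRequest struct {
	Operation string              `json:"operation"`
	ChildType string              `json:"child-type,omitempty"`
	Address   []map[string]string `json:"address"`
	Pretty    int                 `json:"json.pretty"`
}

// hostListRequest builds the body that asks for the names of all hosts.
func hostListRequest() ([]byte, error) {
	return json.Marshal(managementRequest{
		Operation: "read-children-names",
		ChildType: "host",
		Address:   []map[string]string{}, // empty address = root resource
		Pretty:    1,
	})
}

func main() {
	body, err := hostListRequest()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Posting this body with a command-line HTTP client that supports digest authentication can help tell whether the 401 comes from the credentials or from the plugin's digest handling.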
Hi, yes, the digest login fails on EAP 7.1. I have a fix for that. I can send a PR or a diff on Monday.
Hi @stefa975, sounds good. I'm waiting for your fix. Can I follow your updates at https://github.com/stefa975/telegraf/tree/master/plugins/inputs/jboss or at another link?
Thank you
Hi @stefa975, I can add it to my PR (https://github.com/influxdata/telegraf/pull/3537) if you send me the diff.
I've also noticed that newer EAP and WildFly versions have big internal changes (mainly in the web and mq modules). I've begun adding support for these new versions, but it seems that WildFly has bugs in the HTTP management API.
I opened a related issue some weeks ago (https://github.com/wildfly/wildfly/issues/11309), but nobody has answered yet.
Hi @toni-moreno and @ohcaka , I pushed my changes to my fork. https://github.com/stefa975/telegraf/tree/master/plugins/inputs/jboss has my latest changes.
You can take the best parts from there.
Hi @stefa975 ,
Many thanks. I just tested https://github.com/stefa975/telegraf/tree/master/plugins/inputs/jboss and your fork fixes the "status code 401 (Unauthorized)" issue.
One more thing: I want to monitor DataSources statistics.
On EAP 6.4, I can monitor the following: "jboss_jvm", "jboss_database", "jboss_ejb", "jboss_jms", "jboss_web_app" and "jboss_web_con".
On EAP 7.0 and 7.1, I can monitor "jboss_jvm" only; DataSources and the others are not available.
Please suggest how to fix it.
I think the issue comes from the big changes between JBoss AS <7.X (JBoss EAP 6.X) and WildFly >= 8 (JBoss EAP 7.X).
I have a patch ready for some components, but it does not work for some others (Undertow, ActiveMQ), as I told you some days ago.
I will update my PR as soon as I get back home (I'm on holiday now).
Hi @toni-moreno,
"I have a patch ready for some components"
That is good news for me. I want to monitor DataSources statistics on EAP 7.0 and 7.1. Could you please add the DataSources monitoring component and the others to your PR? I will use it as a workaround in my monitoring.
Thank you
Hi, I found out that the http digest client I use breaks the login on eap 6. So I'll try to fix that.
Hi @ohcaka, @stefa975, I've updated the PR with WildFly support (where the HTTP API is working OK): https://github.com/influxdata/telegraf/pull/3537/commits/48af54039f405eee8996f6026c7fcd435a154683
Could you compile and test again?
@toni-moreno Wow I'll try your PR. Thank you.
Hi @toni-moreno, I just compiled and tested your PR, but it cannot authenticate. Please suggest how to fix it.
[root@slavehost2 telegraf]# vi etc/telegraf.conf
[root@slavehost2 telegraf]# ./telegraf --config etc/telegraf.conf --test
2018-09-14T05:04:35Z E! JBoss Error handling ExecMode Test: Response from url "" has status code 401 (Unauthorized), expected 200 (OK)
2018-09-14T05:04:35Z E! Error in plugin [inputs.jboss]: Response from url "" has status code 401 (Unauthorized), expected 200 (OK)
2018-09-14T05:04:35Z E! Error in plugin [inputs.jboss]: Error decoding JSON response (ExecTypeResponse) ,unexpected end of JSON input
2018-09-14T05:04:35Z I! Get Servers from host: standalone
2018-09-14T05:04:35Z I! JBoss Plugin Processing Servers from host:[ standalone ] : Server [ standalone ]
[root@slavehost2 telegraf]#
I noticed that my JBoss EAP 7.1 process is running in managed domain mode, but the test output shows standalone mode, as below:
2018-09-14T05:04:35Z I! JBoss Plugin Processing Servers from host:[ standalone ] : Server [ standalone ]
My telegraf configuration
# Read flattened metrics from one or more JBoss HTTP endpoints
[[inputs.jboss]]
  servers = [
    "http://172.10.2.12:9990/management",
  ]
  ## Execution Mode
  # exec_as_domain = true <<<< In this test. i don't use it. because it error when run test
  ## Username and password
  username = "jbossadm"
  password = "password"
  ## Metric selection
  metrics =[
    "jvm",
    "web_con",
    "deployment",
    "database",
    "jms",
  ]
My JBoss start command
#jboss-eap-7.1/bin/domain.sh --host-config=host.xml -b 172.10.2.12 -bmanagement 172.10.2.12
I cannot use the "exec_as_domain" parameter with your PR:
[root@slavehost2 telegraf]# ./telegraf --config etc/telegraf.conf --test
2018/09/14 14:38:56 E! Error parsing etc/telegraf.conf, line 3575: field corresponding to `exec_as_domain' is not defined in `*jboss.JBoss'
About the Unauthorized issue: I have tested @stefa975's version (https://github.com/stefa975/telegraf/tree/master/plugins/inputs/jboss) and it is solved there. The difference is the use of exec_as_domain: @stefa975's version uses exec_as_domain but your PR doesn't. I don't know how to use it. Please suggest how to fix it.
Thank you so much @toni-moreno .
Any update on this?
@toni-moreno I have the same errors
2019-01-18T16:03:00Z I! Get Servers from host: standalone
2019-01-18T16:03:00Z I! JBoss Plugin Processing Servers from host:[ standalone ] : Server [ standalone ]
2019-01-18T16:03:00Z E! JBoss Deployment Error handling response 3: Response from url "" has status code 500 (Internal Server Error), expected 200 (OK)
@stefa975 from your fork
panic: interface conversion: interface {} is nil, not string
goroutine 42 [running]:
github.com/influxdata/telegraf/plugins/inputs/jboss.(*JBoss).Gather(0xc0001a6690, 0x1b79e20, 0xc000166f60, 0x2540be400, 0xc000134360)
/root/go/src/github.com/influxdata/telegraf/plugins/inputs/jboss/jboss.go:287 +0x8be
github.com/influxdata/telegraf/agent.gatherWithTimeout.func1(0xc000048420, 0xc000142c00, 0x1b79e20, 0xc000166f60)
/root/go/src/github.com/influxdata/telegraf/agent/agent.go:153 +0x47
created by github.com/influxdata/telegraf/agent.gatherWithTimeout
/root/go/src/github.com/influxdata/telegraf/agent/agent.go:152 +0xd1
Hi, are the user and password correct?
@golak, what version (commit) are you running? Could you upload the log files with debug mode enabled?
Hi @toni-moreno ,
I have tried the jboss input plugin for Telegraf that you provided, but I am facing the error below:
[root@brqh telegraf]# tail -f telegraf.log
2020-07-10T12:16:03Z I! Starting Telegraf 1.14.3
2020-07-10T12:16:03Z I! Loaded inputs: cpu disk system jolokia2_agent linux_sysctl_fs conntrack jolokia2_proxy diskio kernel interrupts net netstat mem processes swap nstat
2020-07-10T12:16:03Z I! Loaded aggregators:
2020-07-10T12:16:03Z I! Loaded processors:
2020-07-10T12:16:03Z I! Loaded outputs: influxdb
2020-07-10T12:16:03Z I! Tags enabled: host=brqh.nat.myrio.net
2020-07-10T12:16:03Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"brqh.nat.myrio.net", Flush Interval:10s
2020-07-10T12:26:36Z I! [agent] Hang on, flushing any cached metrics before shutdown
2020-07-10T12:26:40Z I! Starting Telegraf 1.14.3
2020-07-10T12:26:40Z E! [telegraf] Error running agent: Error parsing /etc/telegraf/telegraf.conf, Undefined but requested input: jboss
Please help me out.
Could anyone please help me with this issue?
If anyone is interested in taking over @toni-moreno's JBoss input plugin PR #3537 please feel free to do so.
As this plugin is rather large so far, I'd suggest submitting it as an external plugin. Telegraf users would be able to use it sooner that way, since it wouldn't have to go through the typical review process.
Going to close this due to inactivity. While there was one attempt to previously deliver this plugin, there's no recent activity on it. If anyone is interested in picking that up or starting fresh, do continue the conversation, and we can re-open the issue.
I'm very interested in the TICK platform! Working with enterprise and open platforms, it would be very interesting to monitor application stack objects. One of these is JBoss/WildFly. I tried the httpjson plugin, but I think a dedicated input plugin like the one in Diamond would be more powerful.
I am also available for development and testing.