Instrumental / instrumentald

Instrumental System and Service Daemon
MIT License

CentOS 7 #79

Closed danner26 closed 6 years ago

danner26 commented 6 years ago

I am running this to track NGINX and MySQL.

I get the following errors:

instrumentald version 1.1.0 started at 2018-04-04 20:21:27 UTC
Collecting stats under the hostname: ip-172-31-80-89.ec2.internal
2018-04-04T20:21:27Z I! Starting Telegraf (version 1.2.1)
2018-04-04T20:21:27Z I! Loaded outputs: instrumental
2018-04-04T20:21:27Z I! Loaded inputs: inputs.disk inputs.mem inputs.swap inputs.system inputs.net inputs.nginx inputs.cpu
2018-04-04T20:21:27Z I! Tags enabled: host=ip-172-31-80-89.ec2.internal
2018-04-04T20:21:27Z I! Agent Config: Interval:30s, Quiet:false, Hostname:"ip-172-31-80-89.ec2.internal", Flush Interval:30s
telegraf execution failed, 1 total failures
telegraf execution failed, 1 total failures

and my config file is:

project_token = "REDACTED"

system = ["cpu", "disk", "load", "memory", "network", "swap"]
mysql = ["root@tcp(localhost:3306)/"]
nginx = ["URL REDACTED"]

Any ideas what is happening? Why does it need telegraf, and why doesn't the install guide say that I need to install telegraf?

I am assuming the install script installs it, but since I'm using an HVM instance which does not have telegraf in the repo, that is the issue. Is there a list of dependent software I should install for CentOS?

mediocretes commented 6 years ago

Instrumentald uses telegraf to connect to mysql and nginx. It includes its own very specific version of telegraf in the package.

I added CentOS 7.2 to our local tests and everything is working fine. Our test environment installs the package from packagecloud via chef - can you give us more information about how you are installing instrumentald?

Also, let's run through the likely suspects:

It doesn't seem like you're on a 32 bit OS, based on the errors, but instrumentald only supports 64 bit at this time. Are you sure you're on 64 bit CentOS?

Can you try running the packaged telegraf and verify that it starts (but fails due to the lack of a config file)? In our test setup, that looks like this:

[vagrant@default-centos-72 ~]$ /opt/instrumentald/lib/app/lib/telegraf/amd64/telegraf --test
2018/04/04 22:41:16 E! No config file specified, and could not find one in $TELEGRAF_CONFIG_PATH, /home/vagrant/.telegraf/telegraf.conf, or /etc/telegraf/telegraf.conf

Also, can you tell us more about your OS? In our test setup, that looks like this:

[vagrant@default-centos-72 ~]$ uname -a
Linux default-centos-72.vagrantup.com 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

If you'd like to move this to a support ticket within instrumental, just email support@instrumentalapp.com and include a link to this issue so we know it's you. Thanks!

danner26 commented 6 years ago

I think the confusion is that I am using HVM, so it is stripped down to run in the cloud. When I run uname I get:

[centos@ip-172-31-80-89 ~]$ uname -a
Linux ip-172-31-80-89.ec2.internal 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

The following might help you a bit more:

[root@ip-172-31-80-89 ~]# cat /etc/centos-release
CentOS Linux release 7.4.1708 (Core)

Also, when I run the packaged telegraf with --test, I get the same "No config file specified" error that you do.

danner26 commented 6 years ago

Reinstalling the software causes the same errors.

Now when I remove the MySQL part it errors out in the same spot, but when I have the mysql config in there I get an error that the password was not defined. How do I pass a password to instrumental? I set up a service account but I can't seem to get the password to work.

2018-04-05T12:42:55Z I! Starting Telegraf (version 1.2.1)
2018-04-05T12:42:55Z I! Loaded outputs: instrumental
2018-04-05T12:42:55Z I! Loaded inputs: inputs.system inputs.mysql inputs.net inputs.nginx inputs.cpu inputs.disk inputs.mem inputs.swap
2018-04-05T12:42:55Z I! Tags enabled: host=ip-172-31-80-89.ec2.internal
2018-04-05T12:42:55Z I! Agent Config: Interval:30s, Quiet:false, Hostname:"ip-172-31-80-89.ec2.internal", Flush Interval:30s
2018-04-05T12:43:00Z E! ERROR in input [inputs.mysql]: Errors encountered: [Error 1045: Access denied for user 'inst_serv@'@'localhost' (using password: NO)]
2018-04-05T12:43:00Z E! ERROR in input [inputs.mysql]: Errors encountered: [Error 1045: Access denied for user 'inst_serv@'@'localhost' (using password: NO)]
telegraf execution failed, 1 total failures
2018-04-05T12:43:30Z E! ERROR in input [inputs.mysql]: Errors encountered: [Error 1045: Access denied for user 'inst_serv@'@'localhost' (using password: NO)]
2018-04-05T12:44:00Z E! ERROR in input [inputs.mysql]: Errors encountered: [Error 1045: Access denied for user 'inst_serv@'@'localhost' (using password: NO)]
2018-04-05T12:44:30Z E! ERROR in input [inputs.mysql]: Errors encountered: [Error 1045: Access denied for user 'inst_serv@'@'localhost' (using password: NO)]
2018-04-05T12:45:00Z E! ERROR in input [inputs.mysql]: Errors encountered: [Error 1045: Access denied for user 'inst_serv@'@'localhost' (using password: NO)]
2018-04-05T12:45:30Z E! ERROR in input [inputs.mysql]: Errors encountered: [Error 1045: Access denied for user 'inst_serv@'@'localhost' (using password: NO)]
2018-04-05T12:46:00Z E! ERROR in input [inputs.mysql]: Errors encountered: [Error 1045: Access denied for user 'inst_serv@'@'localhost' (using password: NO)]
2018-04-05T12:46:30Z E! ERROR in input [inputs.mysql]: Errors encountered: [Error 1045: Access denied for user 'inst_serv@'@'localhost' (using password: NO)]

danner26 commented 6 years ago

Alright, I think I figured out what is going on. I noticed that even though I removed mysql from the config, I am still getting:

instrumentald version 1.1.0 started at 2018-04-05 12:50:45 UTC
Collecting stats under the hostname: ip-172-31-80-89.ec2.internal
2018-04-05T12:50:45Z I! Starting Telegraf (version 1.2.1)
2018-04-05T12:50:45Z I! Loaded outputs: instrumental
2018-04-05T12:50:45Z I! Loaded inputs: inputs.disk inputs.mem inputs.swap inputs.system inputs.net inputs.nginx inputs.cpu
2018-04-05T12:50:45Z I! Tags enabled: host=ip-172-31-80-89.ec2.internal
2018-04-05T12:50:45Z I! Agent Config: Interval:30s, Quiet:false, Hostname:"ip-172-31-80-89.ec2.internal", Flush Interval:30s
2018-04-05T12:51:00Z E! ERROR in input [inputs.mysql]: Errors encountered: [Error 1045: Access denied for user 'root@'@'localhost' (using password: NO)]
2018-04-05T12:51:30Z E! ERROR in input [inputs.mysql]: Errors encountered: [Error 1045: Access denied for user 'root@'@'localhost' (using password: NO)]

Which makes me believe that it is reading a config from somewhere else, so I checked /opt/instrumentald/lib/app/conf/instrumentald.toml and made that file a symlink to the /etc/instrumentald.toml file, but that still did not work. I have no idea why it is still running the mysql bit even though I didn't specify it. Here is the config:

project_token = "TOKEN REDACTED"
system = ["cpu", "disk", "load", "memory", "network", "swap"]
nginx = ["URL REDACTED"]

It seems pretty straightforward that I just want system and nginx stats. Not sure why it is still running mysql checks.

danner26 commented 6 years ago

I'm sorry for all of the comments; I just want to provide as much info as possible. When I run /opt/instrumentald/instrumentald --debug I get:

[root@ip-172-31-80-89 instrumentald]# ./instrumentald --debug
instrumentald version 1.1.0 started at 2018-04-05 12:58:09 UTC
Collecting stats under the hostname: ip-172-31-80-89.ec2.internal
starting metrics collector
telegraf binary: /opt/instrumentald/lib/app/lib/telegraf/amd64/telegraf
telegraf config: /tmp/instrumentald_telegraf.toml
2018-04-05T12:58:09Z D! Attempting connection to output: instrumental
2018-04-05T12:58:09Z D! Successfully connected to output: instrumental
2018-04-05T12:58:09Z D! Attempting connection to output: file
2018-04-05T12:58:09Z D! Successfully connected to output: file
2018-04-05T12:58:09Z I! Starting Telegraf (version 1.2.1)
2018-04-05T12:58:09Z I! Loaded outputs: instrumental file
2018-04-05T12:58:09Z I! Loaded inputs: inputs.disk inputs.mem inputs.swap inputs.system inputs.net inputs.nginx inputs.cpu
2018-04-05T12:58:09Z I! Tags enabled: host=ip-172-31-80-89.ec2.internal
2018-04-05T12:58:09Z I! Agent Config: Interval:30s, Quiet:false, Hostname:"ip-172-31-80-89.ec2.internal", Flush Interval:30s
Sent 0 metrics
2018-04-05T12:59:00Z D! Output [file] buffer fullness: 15 / 10000 metrics.
system,path=/,device=rootfs,system_measurement_tag=disk,host=ip-172-31-80-89.ec2.internal total=26831990784i,free=24089645056i,used=2742345728i,used_percent=10.220433325563265 1522933110000000000
system,system_measurement_tag=disk,host=ip-172-31-80-89.ec2.internal,path=/,device=xvda1 total=26831990784i,free=24089645056i,used=2742345728i,used_percent=10.220433325563265 1522933110000000000
system,system_measurement_tag=memory,host=ip-172-31-80-89.ec2.internal available=233660416i,buffered=0i,available_percent=22.49119805075758,active=582815744i,inactive=183549952i,used_percent=77.50880194924243,total=1038897152i,used=805236736i,free=95371264i,cached=301617152i 1522933110000000000
system,system_measurement_tag=swap,host=ip-172-31-80-89.ec2.internal used=0i,free=0i 1522933110000000000
system,host=ip-172-31-80-89.ec2.internal,system_measurement_tag=load load15=0.05,load1=0.01,load5=0.03 1522933110000000000
system,interface=eth0,system_measurement_tag=network,host=ip-172-31-80-89.ec2.internal packets_sent=203080i,packets_recv=583680i,err_in=0i,err_out=0i,drop_in=0i,drop_out=0i,bytes_sent=184881464i,bytes_recv=714067755i 1522933110000000000
nginx,server=REDACTED requests=3100i,reading=0i,writing=2i,waiting=2i,active=4i,accepts=1549i,handled=1549i 1522933110000000000
system,host=ip-172-31-80-89.ec2.internal,path=/,device=rootfs,system_measurement_tag=disk total=26831990784i,free=24089645056i,used=2742345728i,used_percent=10.220433325563265 1522933140000000000
system,path=/,device=xvda1,system_measurement_tag=disk,host=ip-172-31-80-89.ec2.internal used=2742345728i,used_percent=10.220433325563265,total=26831990784i,free=24089645056i 1522933140000000000
system,system_measurement_tag=cpu,host=ip-172-31-80-89.ec2.internal usage_nice=0,usage_softirq=0,usage_guest_nice=0,usage_iowait=0.033388981636096646,usage_irq=0,usage_steal=0,usage_guest=0,usage_user=0.5676126878128364,usage_system=0.1335559265442917,usage_idle=99.26544240400236 1522933140000000000
system,system_measurement_tag=load,host=ip-172-31-80-89.ec2.internal load1=0.07,load5=0.04,load15=0.05 1522933140000000000
system,system_measurement_tag=memory,host=ip-172-31-80-89.ec2.internal available_percent=22.207722059478705,total=1038897152i,available=230715392i,used=808181760i,free=92393472i,used_percent=77.7922779405213,cached=301649920i,buffered=0i,active=585564160i,inactive=183582720i 1522933140000000000
system,system_measurement_tag=swap,host=ip-172-31-80-89.ec2.internal used=0i,free=0i 1522933140000000000
system,interface=eth0,system_measurement_tag=network,host=ip-172-31-80-89.ec2.internal bytes_sent=184898843i,bytes_recv=714078965i,packets_sent=203130i,packets_recv=583727i,err_in=0i,err_out=0i,drop_in=0i,drop_out=0i 1522933140000000000
nginx,server=REDACTED active=4i,accepts=1550i,handled=1550i,requests=3103i,reading=0i,writing=2i,waiting=2i 1522933140000000000
2018-04-05T12:59:00Z D! Output [file] wrote batch of 15 metrics in 66.675µs
2018-04-05T12:59:00Z D! Output [instrumental] buffer fullness: 15 / 10000 metrics.
2018-04-05T12:59:00Z D! Output [instrumental] wrote batch of 15 metrics in 1.102034ms
Sent 0 metrics

So it looks like it is collecting the metrics but not sending them? Not sure why the log file keeps getting mysql errors even though this run doesn't appear to load anything for mysql.

danner26 commented 6 years ago

Alright, so I managed to get the nginx/server metrics to send. I'm still having the mysql issue: could you expand on how I should configure mysql/instrumentald for monitoring?

Refer to my third comment about the mysql connection erroring out because no password is specified.

mediocretes commented 6 years ago

Thanks for the detailed information, this is exactly what we need. We'll get this sorted out.

For mysql, the password is part of the connection string, like this:

username:password@tcp(your.database.hostname.here:3306)/

Try that and let me know if you're still having issues. If you want to share data you'd rather not be in public GitHub, you can open an issue by clicking the question mark icon at the top right of Instrumental.

Looking at that last log, I see this:

Loaded inputs: inputs.disk inputs.mem inputs.swap inputs.system inputs.net inputs.nginx inputs.cpu

Which indicates that it did NOT load the mysql instrumentation. Could instrumentald also be running as a service and muddying the log with additional output based on an older config? The 'Sent 0 metrics' line doesn't worry me much while running with --debug; that is probably from the plugin script executor, and all the built-in instrumentation looks like it's still reporting.
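
One way to tell whether two different configs are feeding the same log is to compare the "Loaded inputs:" lines, since Telegraf prints one each time it starts. The following is a minimal Python sketch, assuming the log text is already in a string; `loaded_inputs` and `sample_log` are hypothetical names for illustration and are not part of instrumentald or Telegraf.

```python
def loaded_inputs(log_text):
    """Return one set of input names per 'Loaded inputs:' line found in the log."""
    marker = "Loaded inputs:"
    runs = []
    for line in log_text.splitlines():
        if marker in line:
            # Everything after the marker is a space-separated list of input names.
            runs.append(set(line.split(marker, 1)[1].split()))
    return runs


# Two lines copied from the logs in this thread: one start with mysql, one without.
sample_log = """\
2018-04-05T12:42:55Z I! Loaded inputs: inputs.system inputs.mysql inputs.net inputs.nginx inputs.cpu inputs.disk inputs.mem inputs.swap
2018-04-05T12:50:45Z I! Loaded inputs: inputs.disk inputs.mem inputs.swap inputs.system inputs.net inputs.nginx inputs.cpu
"""

for number, inputs in enumerate(loaded_inputs(sample_log), start=1):
    print(number, "mysql loaded" if "inputs.mysql" in inputs else "mysql not loaded")
```

If mysql errors keep arriving after the most recent start that did not load inputs.mysql, a second process started from an older config is a plausible source.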

danner26 commented 6 years ago

I don't have an issue sharing the logs, and no one answered on the site, so I opened this. Either way, I got all of that to work except for the mysql part. The password for the account has special characters in it; how should I escape them? It says the password is incorrect when I use:

mysql = ["root:'sample_pass&-!'@@tcp(localhost:3306)/"]

I have also tried it without quotes. Is there a workaround for passwords with special characters?

And to clarify, that password does work.

danner26 commented 6 years ago

There was an extra @ in the string; everything is working as expected now. Thank you!

mediocretes commented 6 years ago

I see your ticket in support - I just saw this before I saw that and claimed it. Don't worry, that button does work.

Internally, it uses the Go driver's DSN parser for getting passwords. For example, I created a user with a password of test-!@ and this worked for them:

mysql = ["tester:test-!@@tcp(127.0.0.1:3306)/"]

No quotes, no escaping. I'd be willing to bet that backslash and # would cause problems that I wouldn't have a great answer for, but -, @, and ! should be fine.
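
To illustrate why an @ inside the password works while quotes or a stray @ do not, here is a rough Python sketch of the splitting rule. It assumes the parser takes everything before the last "/", splits credentials from the address on the last "@", and splits user from password on the first ":". This is an approximation for illustration only, not the driver's actual code, and `split_dsn_credentials` is a hypothetical helper.

```python
from typing import Tuple


def split_dsn_credentials(dsn: str) -> Tuple[str, str]:
    """Return (user, password) from a DSN shaped like user:pass@tcp(host:port)/db."""
    # Everything before the last '/' is the connection part; the rest is the database name.
    conn_part, slash, _dbname = dsn.rpartition("/")
    if not slash:
        raise ValueError("DSN has no '/' separator")
    # Split on the LAST '@', so '@' characters inside the password are kept.
    creds, at, _address = conn_part.rpartition("@")
    if not at:
        return "", ""
    # Split on the FIRST ':', so ':' characters inside the password are kept.
    user, _, password = creds.partition(":")
    return user, password


print(split_dsn_credentials("tester:test-!@@tcp(127.0.0.1:3306)/"))
print(split_dsn_credentials("root:'sample_pass&-!'@@tcp(localhost:3306)/"))
```

Under these assumptions the first DSN yields the password test-!@, as intended. The second yields 'sample_pass&-!'@ with the quotes and the extra @ included, which would explain an access-denied error for a password that is otherwise correct.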

mediocretes commented 6 years ago

Oh, just saw your latest comment, glad it's working, let us know here or in the app if you have other issues!